Article: Linux 2.6: A Breakthrough for Embedded Systems
Sep 8, 2003, by LinuxDevices Staff, from the LinuxDevices Archive
Linux 2.6 introduces many new features that make it an excellent operating system for embedded computing. Among these new features are enhanced real-time performance, easier porting to new computers, support for large memory models, support for microcontrollers, and an improved I/O system.
The embedded computing universe includes computers of all sizes, from tiny portable devices, like wristwatch cameras, to systems having thousands of nodes distributed worldwide, as is the case with telecommunications switches. Embedded systems can be simple enough to require only small microcontrollers, or they may use massive parallel processors with prodigious amounts of memory and computing power. Linux 2.6 has been enhanced to provide support across the spectrum of these needs.
On-Time Arrival: Performance and Determinism
Embedded systems often need to reliably meet timing constraints. Although Linux 2.6 is not yet a true real-time operating system, it does have improvements that make it a worthier platform when responsiveness is desirable. Three significant improvements are preemption points in the kernel, an efficient scheduler, and improved synchronization.
Until now, it has been necessary to acquire special patches to get improved responsiveness in Linux. Often this meant buying a special Linux implementation from a vendor whose patches improved interrupt performance and scheduling latency. Now 2.6 contains these improvements in the mainstream kernel, so it is not necessary to get a special configuration.
Linux 2.6 provides several features that help overall responsiveness. Two kernel-level changes stand out. First, the operating system now uses a preemptible kernel. Second, the algorithm used for scheduling has been made more efficient.
No Delays: A Preemptible Kernel
Like most general-purpose operating systems, Linux has always forbidden the process scheduler from running when a process is executing in a system call. This has meant that once a task is in a system call, that task will control the processor until the system call returns, no matter how long that might take. This design is simple to implement, but may cause more important tasks to be delayed while waiting for the system call to complete.
The kernel is now preemptible to some degree. Linux 2.6 is more responsive than 2.4 and gives implementers better control over the timing of events. It is not a true RTOS, but it feels less “jumpy” than previous kernels. In Linux 2.6, kernel code has been salted with “preemption points,” places where the scheduler is allowed to run and possibly suspend the current process in order to schedule a higher-priority process. Linux 2.6 avoids unreasonable delays in system calls by periodically checking, at a preemption point, whether a reschedule is needed. At these checks, the calling process may be suspended so that another process can run.
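The effect of preemption points can be illustrated with a toy model. This is only a sketch, not kernel code: the function name, the `arrivals` input, and the log strings are all hypothetical, and the "system call" is modeled as a loop over chunks of work with an optional check after each chunk.

```python
def long_system_call(n_chunks, arrivals, preemptible):
    """Model a system call that does its work in n_chunks pieces.

    arrivals maps a chunk index to the name of a higher-priority task that
    becomes ready while that chunk is being processed (hypothetical input).
    Returns the order in which things ran.
    """
    log = []
    ready = []
    for chunk in range(n_chunks):
        log.append(f"syscall chunk {chunk}")
        if chunk in arrivals:
            ready.append(arrivals[chunk])
        if preemptible:
            # Preemption point: let any waiting higher-priority task run now.
            while ready:
                log.append(f"run {ready.pop(0)}")
    log.append("syscall returns")
    # Without preemption points, waiting tasks run only after the call returns.
    while ready:
        log.append(f"run {ready.pop(0)}")
    return log


if __name__ == "__main__":
    print(long_system_call(3, {0: "audio"}, preemptible=True))
    print(long_system_call(3, {0: "audio"}, preemptible=False))
```

In the preemptible run the hypothetical "audio" task executes right after the first chunk; in the non-preemptible run it has to wait until the whole call has returned.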
Software that has to meet deadlines is incompatible with virtual memory demand paging, in which slow handling of page faults would ruin responsiveness. The 2.6 kernel can be built with no virtual memory system to eliminate this problem. Of course, it becomes the software designer's responsibility to ensure that there will always be enough real memory to get the job done.
The following two graphs provide a comparison between the average and worst-case interrupt and task response times for Linux kernel 2.4.1 (based on BlueCat 4.1) and Linux kernel 2.6 Beta (based on BlueCat 5.0 beta). The data represents 3.1 million samples, on a 1GHz Pentium III processor. The measurements were made using LynuxWorks' real-time tests with five or more interrupting devices. The system was under a heavy load consisting of continuous disk transfers (tar/sync), network traffic (ping flood), console input, graphics activity, and a timer card.
Getting the Job Done: An Efficient Scheduler
The process scheduler has been rewritten to eliminate the slow algorithms of previous versions. Formerly, in order to decide which task should run next, the scheduler had to look at each ready task and make a computation to determine that task's relative importance. After all computations were made, the task with the highest score would be chosen. Since the time required for this algorithm varied with the number of tasks, complex multitasking applications would suffer from slow scheduling.
The scheduler in Linux 2.6 no longer scans all tasks each time. Instead, when a task becomes ready to run it is sorted into position on a queue called the current queue. Then, when the scheduler runs, it only has to choose the task at the most favorable position on the queue. Thus, scheduling is done in a constant amount of time. When the task is running, it is given a time slice, or period of time it may use the processor before it has to give way to another thread. When its time slice has expired, the task is moved to another queue, called the expired queue. The task is sorted onto this expired queue according to its priority.
At some point, all of the tasks on the current queue will have executed and been moved to the expired queue. When this happens, the queues are switched: the expired queue becomes the current queue, and the empty current queue becomes the expired queue. Since the tasks on the new current queue have already been sorted, the scheduler can once again resume its simple algorithm of selecting the first task from the current queue. This new procedure is substantially faster than the old one, and works just as well whether there are many tasks or only a few.
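The two-queue scheme described above can be sketched in a few lines. This is a simplified model rather than the kernel's implementation: it assumes each queue is an array of per-priority lists plus a bitmap of non-empty levels (one way to make the "pick the best task" step independent of the number of tasks), and the names `PrioArray`, `Scheduler`, and `NUM_PRIOS` are illustrative.

```python
from collections import deque

NUM_PRIOS = 8  # illustrative number of priority levels; 0 is most favorable


class PrioArray:
    """One run queue: a FIFO per priority level plus a bitmap of non-empty levels."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set when queues[p] holds at least one task

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def dequeue_best(self):
        # The lowest set bit identifies the most favorable non-empty level.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[prio]
        task = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << prio)
        return task, prio


class Scheduler:
    def __init__(self):
        self.current = PrioArray()
        self.expired = PrioArray()

    def wake(self, task, prio):
        self.current.enqueue(task, prio)

    def run_one_slice(self):
        """Pick the next task, let it use its time slice, then expire it."""
        if self.current.bitmap == 0:
            # Every task has run: swap the roles of the two queues.
            self.current, self.expired = self.expired, self.current
        if self.current.bitmap == 0:
            return None  # nothing is runnable
        task, prio = self.current.dequeue_best()
        self.expired.enqueue(task, prio)
        return task


if __name__ == "__main__":
    s = Scheduler()
    s.wake("logger", 5)
    s.wake("sensor", 1)
    s.wake("ui", 3)
    print([s.run_one_slice() for _ in range(6)])
```

Choosing a task costs one bit operation and one queue pop regardless of how many tasks are ready, which is the property the article describes. Details such as dynamic priorities and time slice lengths are deliberately left out.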
Know When to Say When: New Synchronization Primitives
Multiprocessing applications sometimes need to share resources, such as shared memory or shared devices. In order to avoid race conditions, programmers use a locking primitive called a mutex to ensure that only one task is using the resource at a time. Until now, the Linux mutex implementation has always involved a system call to the kernel to decide whether to block the thread or allow it to continue executing. When the decision was to continue, that time-consuming system call was unnecessary. The new implementation in Linux 2.6 supports Fast User-Space Mutexes (futexes). These functions can check from user space whether blocking is necessary, and only perform the system call to block the thread when it is required. When blocking is not required, avoiding the unneeded system call saves time. These functions also use scheduling priority to decide which thread gets to execute when there is contention.
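The fast-path idea can be modeled as follows. This is a user-space simulation, not the real futex interface: a `threading.Condition` stands in both for the atomic instructions and for the kernel's wait queue, the `syscalls` counter merely records how often the simulated slow path is taken, and the class and method names are hypothetical. The three-state protocol (0 free, 1 locked, 2 locked with possible waiters) is one common way to build such a lock and is an assumption here.

```python
import threading


class FastMutex:
    """Toy futex-style lock. State: 0 free, 1 locked, 2 locked with possible waiters."""

    def __init__(self):
        self._state = 0
        self._cv = threading.Condition()  # stand-in for atomics and the kernel wait queue
        self.syscalls = 0                 # counts simulated trips into the kernel

    def _cmpxchg(self, expected, new):
        with self._cv:
            old = self._state
            if old == expected:
                self._state = new
            return old

    def _xchg(self, new):
        with self._cv:
            old, self._state = self._state, new
            return old

    def _futex_wait(self, expected):
        with self._cv:
            self.syscalls += 1
            if self._state == expected:  # sleep only if the value is unchanged
                self._cv.wait()

    def _futex_wake(self):
        with self._cv:
            self.syscalls += 1
            self._cv.notify(1)

    def lock(self):
        old = self._cmpxchg(0, 1)
        if old == 0:
            return  # fast path: lock was free, no system call needed
        if old != 2:
            old = self._xchg(2)  # announce that a waiter may exist
        while old != 0:
            self._futex_wait(2)
            old = self._xchg(2)

    def unlock(self):
        if self._xchg(0) == 2:
            self._futex_wake()  # slow path only when someone may be waiting


if __name__ == "__main__":
    m = FastMutex()
    m.lock()
    m.unlock()
    print("simulated system calls without contention:", m.syscalls)
```

With a single thread the lock and unlock both complete on the fast path, so the counter stays at zero; the simulated system calls only appear once threads actually contend for the lock.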
Can't We All Get Along? — Sharing Memory with Less Contention
Embedded systems are sometimes large systems of processors, as in telecom networks or mass storage systems. Multiple processors share memory, either as symmetric or loosely coupled multiprocessors. Symmetric multiprocessing designs make all of the memory equally accessible to all processors, so competition for memory becomes the limiting factor in processing efficiency. Linux 2.6 provides a different way of achieving multiprocessing, called Non-Uniform Memory Access, or NUMA. In NUMA, memory and processors are interconnected, but for each processor, some memory is “close” to that processor and other memory is “farther away.” A processor can reach its nearby memory more quickly than memory that is farther away. The 2.6 kernel provides a set of functions that are aware of the memory/processor topology. The scheduler is able to use this information to favor running tasks on processors that are close to the memory they are using. This reduces the memory contention bottleneck and enhances throughput.
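As a rough illustration of topology-aware placement, the sketch below picks the idle processor node that is closest to the memory a task is using. The distance table, its values, and the function name are hypothetical; the real kernel's policy involves far more than this single rule.

```python
# Hypothetical distance table: DISTANCE[cpu_node][memory_node]; smaller means closer.
DISTANCE = [
    [10, 20, 40],
    [20, 10, 20],
    [40, 20, 10],
]


def preferred_node(memory_node, idle_nodes):
    """Return the idle CPU node with the smallest distance to memory_node."""
    return min(idle_nodes, key=lambda node: DISTANCE[node][memory_node])


if __name__ == "__main__":
    # A task whose data lives on node 2, with nodes 0 and 1 idle.
    print(preferred_node(2, [0, 1]))
```

When the task's own node is idle it is chosen outright; otherwise the task lands on the nearest alternative instead of an arbitrary processor.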
A Stitch in Time: POSIX Threads, Signals, and Timers
The POSIX standard describes a set of functions for thread creation and management called POSIX threads, or pthreads. This well-defined system of functionality has been available in past versions of Linux, but its implementation has been much improved in 2.6. The Native POSIX Thread Library (NPTL) has been shown to be a significant improvement over the older LinuxThreads approach, and even beats other high-performance alternatives that have been available as patches.
Along with POSIX threads, 2.6 provides POSIX signals and POSIX high-resolution timers as part of the mainstream kernel. POSIX signals are an improvement over UNIX-style signals, which were the default in previous Linux releases. Unlike UNIX signals, POSIX signals cannot be lost, and can carry information as an argument. Also, POSIX signals can be sent from one POSIX thread to another, rather than only from process to process like UNIX signals.
Embedded systems often need to poll hardware or do other tasks on a fixed schedule. POSIX timers make it easy to arrange for any task to be scheduled periodically. The clock that the timer uses can be set to tick at a rate as fine as one kilohertz, so that software engineers can control the scheduling of tasks with precision.
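To see what tick rate means for a periodic task, consider a simplified model in which a timer can only fire on a clock tick, so a requested period is rounded up to a whole number of ticks. That rounding rule and the function name are assumptions made for illustration, and the 100 Hz figure is included only as a coarser clock for comparison.

```python
def timer_expiries_us(period_us, hz, count):
    """Expiry times (microseconds from start) of a periodic timer on a clock ticking at hz."""
    tick_us = 1_000_000 // hz
    ticks_per_period = -(-period_us // tick_us)  # ceiling division
    return [n * ticks_per_period * tick_us for n in range(1, count + 1)]


if __name__ == "__main__":
    # A task that asks to run every 2.5 ms.
    print(timer_expiries_us(2500, 1000, 3))  # 1 kHz clock
    print(timer_expiries_us(2500, 100, 3))   # 100 Hz clock
```

Under this model the 1 kHz clock delivers the task every 3 ms, close to the request, while the 100 Hz clock can do no better than every 10 ms.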
Think Different: Support for Custom Designs
Hardware designs in the embedded world are often customized for special applications. It is common for designers to need to solve a design issue in an original way. For example, a purpose-built board may use different IRQ management than a similar reference design uses. In order to run on the new board, Linux has to be ported, or altered to support the new hardware. This porting will be easier if the operating system is made of components that are well separated, so that it is only necessary to change the code that needs to change. The components of Linux 2.6 that are likely to be altered for a custom design have been refactored with a concept called Subarchitecture. Components are clearly separate and can be individually modified or replaced with minimal impact on other components of the board support package.
Having it All: Devices, Busses, and I/O
Linux is becoming the first choice among operating systems in the consumer market. Linux 2.6 includes the Advanced Linux Sound Architecture, or ALSA. This state-of-the-art facility supports USB and MIDI devices with fully thread-safe, multi-processor-safe software. With ALSA, a system can run multiple sound cards and do such things as play and record at the same time, or mix multiple audio streams.
Video4Linux, the system for supporting video, is all new in Linux 2.6. Although it is not backward compatible with the previous video paradigm, it is intended for the latest stable of radio and TV tuners, video cameras, and other multimedia devices.
USB 2.0, which is about forty times faster than conventional USB, makes its debut on Linux 2.6. We can expect that high-speed devices will proliferate in the near future, and Linux will be a leading platform for USB 2.0 products.
Just Say No: No Keyboard, No Monitor, No Wires
Deeply embedded systems have no user interface, and sometimes no operator interface. On previous versions of Linux, it was possible to make a headless system, but some of the support software was not removable, giving the kernel more bulk than necessary. Linux 2.6 can be configured to entirely omit support for displays, mice, or keyboards.
For portable products, the Bluetooth wireless protocol debuts in Linux 2.6, taking its place along with 802.11. Both the SCO datalink for audio and the L2CAP for connection-oriented data transfers are available, making Linux a natural choice wherever no-fuss connectivity is a key requirement.
Go Big: Linux on 64-bit Machines
On the other end of the spectrum are computers that provide exceptionally large resources, such as very large memory sizes or high-throughput multiprocessors. These heavyweights have numerous embedded applications, such as mass data storage systems and special compute engines.
Embedded Linux developers who need very large memory sizes have their choice of 64-bit microprocessors in Linux 2.6. The Intel Itanium (IA-64) architecture was supported in previous releases of Linux, and support continues in 2.6. Linux 2.6 continues to cover the AMD64 architecture with support of the AMD Opteron microprocessor. The PowerPC is not left out, with ppc64 support also available. The Linux community has the momentum to keep up with innovations in large-bus, large-memory computing.
Keeping it Simple: Linux on Microcontrollers
Microcontrollers are now supported on the mainstream Linux 2.6 kernel. In most cases, previous instances of Linux required a full-featured microprocessor that had a memory management unit. In the embedded marketplace, simpler microcontrollers are often the appropriate choice when low cost and simplicity are called for.
There have been ways to put Linux on MMU-less processors prior to version 2.6. The Linux for Microcontrollers project has been a successful branch of Linux for some important small systems. This branch, designated uClinux, has been the focus of small-processor developers. Version 2.6 integrates a significant portion of uClinux into the production kernel, bringing microcontroller support into the Linux mainstream.
The 2.6 version of Linux supports several current microcontrollers that don't have memory management units. Linux 2.6 supports Motorola m68k processors such as Dragonball and ColdFire, as well as Hitachi H8/300 and NEC v850. Also supported is the ETRAX family of networking microcontrollers by Axis Communications. Linux running on MMU-less processors will still be multitasking, but will not have the memory protection provided on the fully endowed processors. Because tasks on these small platforms are not true, isolated processes, there is also little in the way of security.
Having it Both Ways: Large Memory on 32-bit Processors
Sometimes, an embedded system may want to use a conventional Intel-architecture microprocessor, but may need to manage more RAM than can be addressed in the usual 32-bit address space. Intel has introduced a concept called Physical Address Extension (PAE) that makes it possible for 32-bit computers to access up to 64 gigabytes of memory through page frames. With PAE, Linux 2.6 systems running on newer x86 machines view this larger memory through a movable “window.” Linux 2.6 support of PAE would be especially useful on systems that perform store-and-forward service for images, video, and similar applications where large datasets must be handled quickly.
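The 64 GB figure follows from simple arithmetic, assuming PAE widens physical addresses from 32 to 36 bits and that memory is managed in 4 KiB page frames (both are assumptions of this sketch rather than statements from the article):

```python
GIB = 2**30
PAGE_SIZE = 4096  # assumed page frame size in bytes


def addressable_bytes(address_bits):
    """Bytes reachable with a physical address of the given width."""
    return 2**address_bits


if __name__ == "__main__":
    plain = addressable_bytes(32)
    pae = addressable_bytes(36)
    print(plain // GIB, "GiB without PAE")
    print(pae // GIB, "GiB with PAE")
    print(pae // PAGE_SIZE, "page frames with PAE")
```

Four extra address bits multiply the reachable physical memory by sixteen, from 4 GiB to 64 GiB, even though each process still sees a 32-bit virtual address space.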
Linux is easily the fastest growing operating system in the embedded world. Its low cost, abundant features, and openness make Linux a fertile ground for creativity. As its importance grows, we can expect Linux to become the platform where future progress happens first. Version 2.6 is a great stride in the right direction.
About the author: Brandon White is director of customer education at LynuxWorks, a leading producer of operating systems for the embedded marketplace for fifteen years. The company's product, BlueCat Linux, is an enhanced implementation of Linux that is used in a wide range of embedded systems.
This article was originally published on LinuxDevices.com and has been donated to the open source community by QuinStreet Inc. Please visit LinuxToday.com for up-to-date news and articles about Linux and open source.