Adeos nanokernel for Linux kernel
Jun 3, 2002, by LinuxDevices Staff, from the LinuxDevices Archive
[Note: This article appeared initially on the linux-kernel mailing list (LKML)]
We have released the initial implementation of the Adeos nanokernel. The following is a complete description of its background, its implementation, its API, and its potential uses. Please also see the press release and the project's workspace. The Adeos code is distributed under the GNU GPL.
The Adeos nanokernel is based on research and publications made in the early '90s on the subject of nanokernels. Our basic method was to reverse the approach described in most of the papers on the subject. Instead of first building the nanokernel and then building the client OSes, we started from a live and known-to-be-functional OS, Linux, and inserted a nanokernel beneath it. Starting from Adeos, other client OSes can now be put side-by-side with the Linux kernel.
To this end, Adeos enables multiple domains to exist simultaneously on the same hardware. None of these domains sees the others, but all of them see Adeos. A domain is most probably a complete OS, but Adeos makes no assumption about the sophistication of what a domain contains.
To share the hardware among the different OSes, Adeos implements an interrupt pipeline (ipipe). Every OS domain has an entry in the ipipe. Each interrupt that enters the ipipe is passed on to every domain in the ipipe. Instead of disabling/enabling interrupts, each domain in the pipeline only needs to stall/unstall its own pipeline stage. If an ipipe stage is stalled, then the interrupts do not progress in the ipipe until that stage has been unstalled. Each stage of the ipipe can, of course, decide to do a number of things with an interrupt. Among other things, it can decide that it's the last recipient of the interrupt. In that case, the ipipe does not propagate the interrupt to the rest of the domains in the ipipe.
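The stall/unstall and propagation rules described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: the `struct stage` and `struct pipeline` types, the `pipeline_*` and `stage_*` functions, and the use of a pending bitmask are all hypothetical and are not taken from the Adeos sources.

```c
#include <assert.h>

#define MAX_DOMAINS 4
#define MAX_IRQS    8

/* Hypothetical model of one pipeline stage; not the Adeos data structures. */
struct stage {
    const char *name;
    int stalled;               /* 1 while the stage refuses delivery */
    unsigned pending;          /* bitmask of IRQs waiting at this stage */
    int terminates[MAX_IRQS];  /* 1 if this stage is the last recipient of the IRQ */
    int delivered[MAX_IRQS];   /* delivery count per IRQ, for inspection */
};

struct pipeline {
    struct stage stages[MAX_DOMAINS];  /* index 0 is the head of the pipeline */
    int nstages;
};

/* Walk the pipeline starting at stage `from`, delivering pending IRQs
 * stage by stage until a stalled stage blocks further progress. */
static void pipeline_sync(struct pipeline *p, int from)
{
    for (int i = from; i < p->nstages; i++) {
        struct stage *s = &p->stages[i];
        if (s->stalled)
            return;  /* interrupts wait here until this stage is unstalled */
        for (unsigned irq = 0; irq < MAX_IRQS; irq++) {
            if (!(s->pending & (1u << irq)))
                continue;
            s->pending &= ~(1u << irq);
            s->delivered[irq]++;
            /* Pass the IRQ on unless this stage ends its propagation. */
            if (!s->terminates[irq] && i + 1 < p->nstages)
                p->stages[i + 1].pending |= 1u << irq;
        }
    }
}

/* A hardware interrupt enters at the head of the pipeline.
 * Simplification: a bitmask coalesces repeated IRQs that arrive
 * while a stage is stalled into a single pending occurrence. */
static void pipeline_raise(struct pipeline *p, unsigned irq)
{
    assert(irq < MAX_IRQS && p->nstages > 0);
    p->stages[0].pending |= 1u << irq;
    pipeline_sync(p, 0);
}

/* Model of "stall": the stage stops accepting interrupts. */
static void stage_stall(struct pipeline *p, int i)
{
    p->stages[i].stalled = 1;
}

/* Model of "unstall": accept interrupts again and flush what was held. */
static void stage_unstall(struct pipeline *p, int i)
{
    p->stages[i].stalled = 0;
    pipeline_sync(p, i);
}
```

In this model, raising an IRQ while the second stage is stalled delivers it to the first stage only; the second stage receives it when `stage_unstall()` is called. Setting `terminates[irq]` on a stage stops the IRQ from reaching any later stage.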
Regardless of the operations being done in the ipipe, the Adeos code does __not__ play with the interrupt masks. The only case where the hardware masks are altered is during the addition/removal of a domain from the ipipe. This also means that no OS is allowed to use the real hardware cli/sti. But this is OK, since the stall/unstall calls achieve the same functionality.
Our approach is based on the following papers (links to these papers are provided at the bottom of this message):
- [1] D. Probert, J. Bruno, and M. Karaorman. “Space: a new approach to operating system abstraction.” In: International Workshop on Object Orientation in Operating Systems, pages 133-137, October 1991.
- [2] D. Probert, J. Bruno. “Building fundamentally extensible application-specific operating systems in Space”, March 1995.
- [3] D. Cheriton, K. Duda. “A caching model of operating system kernel functionality”. In: Proc. Symp. on Operating Systems Design and Implementation, pages 179-194, Monterey CA (USA), 1994.
- [4] D. Engler, M. Kaashoek, and J. O'Toole Jr. “Exokernel: an operating system architecture for application-specific resource management”, December 1995.
The complete Adeos approach has been thoroughly documented in a whitepaper published more than a year ago entitled “Adaptive Domain Environment for Operating Systems” and available here. The current implementation is slightly different. Mainly, we do not implement the functionality to move Linux out of ring 0. Although of interest, this approach is not very portable.
Instead, our patch taps right into Linux's main source of control over the hardware, the interrupt dispatching code, and inserts an interrupt pipeline which can then serve all the nanokernel's clients, including Linux.
This is not a novelty in itself. Other OSes have been modified in such a way for a wide range of purposes. One of the most interesting examples is described by Stodolsky, Chen, and Bershad in a paper entitled “Fast Interrupt Priority Management in Operating System Kernels” published in 1993 as part of the Usenix Microkernels and Other Kernel Architectures Symposium. In that case, cli/sti were replaced by virtual cli/sti which did not modify the real interrupt mask in any way. Instead, interrupts were deferred and delivered to the OS upon a call to the virtualized sti.
Mainly, this resulted in increased performance for the OS. Although we haven't done any measurements on Linux's interrupt handling performance with Adeos, our nanokernel includes by definition the code implementing the technique described in the above-mentioned Stodolsky paper, which we use to redirect the hardware interrupt flow to the pipeline.
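The idea of a virtual cli/sti that defers interrupts instead of touching the real mask can be illustrated with a minimal user-space sketch. It is a loose model of the deferral idea only, not the code from the paper or from Adeos; the names `virt_cli`, `virt_sti`, `hw_interrupt`, and `run_handler` are hypothetical.

```c
#include <assert.h>

#define NR_IRQS 16

/* Hypothetical user-space model; none of these names are Adeos or Linux symbols. */
static int      virt_irqs_disabled;  /* software flag standing in for the hardware mask */
static unsigned deferred_irqs;       /* bitmask of IRQs logged while "disabled" */
static int      handled[NR_IRQS];    /* per-IRQ delivery count, for inspection */

/* Stand-in for the OS's real interrupt handler. */
static void run_handler(unsigned irq)
{
    handled[irq]++;
}

/* Virtual cli: only a flag is set; the real interrupt mask is untouched. */
static void virt_cli(void)
{
    virt_irqs_disabled = 1;
}

/* Virtual sti: clear the flag, then replay everything that was deferred. */
static void virt_sti(void)
{
    virt_irqs_disabled = 0;
    while (deferred_irqs) {
        unsigned irq = 0;
        while (!(deferred_irqs & (1u << irq)))
            irq++;                       /* find the lowest pending IRQ */
        deferred_irqs &= ~(1u << irq);
        run_handler(irq);
    }
}

/* Entry point for a (simulated) hardware interrupt. */
static void hw_interrupt(unsigned irq)
{
    assert(irq < NR_IRQS);
    if (virt_irqs_disabled)
        deferred_irqs |= 1u << irq;      /* log it; deliver on virt_sti() */
    else
        run_handler(irq);                /* deliver immediately */
}
```

The common path (no interrupt arrives during the critical section) costs only a flag write, which is where the performance gain reported in the paper would come from.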
In terms of implementation, the Adeos code is rather short since we focused on setting the foundations for sharing interrupts between domains. Here are the files we added:
- kernel/adeos.c: Architecture-independent domain code
- arch/i386/kernel/adeos.c: Architecture-dependent domain code
- arch/i386/kernel/ipipe.c: Interrupt pipeline code
- include/asm-i386/adeos.h: Arch-dependent Adeos header
- include/linux/adeos.h: Main (arch-independent) Adeos header
We also modified some files to tap into Linux interrupt dispatching (all the modifications are encapsulated in #ifdef CONFIG_ADEOS/#endif):
- kernel/ksyms.c
- arch/i386/kernel/apic.c
- arch/i386/kernel/entry.S
- arch/i386/kernel/i386_ksyms.c
- arch/i386/kernel/i8259.c
- arch/i386/kernel/irq.c
- arch/i386/kernel/smp.c
- arch/i386/kernel/time.c
- arch/i386/kernel/traps.c
- arch/i386/mm/fault.c
- include/asm-i386/keyboard.h
- include/asm-i386/system.h
- arch/i386/kernel/process.c
Of course, we also added the appropriate makefile modifications and config options so that you can choose to enable/disable Adeos as part of the kernel build configuration.
Here is Adeos' public API:
- int adeos_register_domain(adomain_t *adp, adattr_t *attr);
  Register a new domain using the properties defined in the attribute.
- void adeos_unregister_domain(adomain_t *adp);
  Remove “adp” domain.
- void adeos_renice_domain(adomain_t *adp, int newpri);
  Change “adp”'s priority in the ipipe.
- void adeos_suspend_domain(void);
  This domain is done dealing with the current interrupt. This signals the ipipe to provide the interrupt to the next ipipe stage.
- void adeos_virtualize_irq(unsigned irq, void (*handler)(void), int (*acknowledge)(unsigned));
  Provide a handler and an acknowledgment function for “irq”.
- void adeos_control_irq(unsigned irq, unsigned clrmask, unsigned setmask);
  Change the current domain's handling mask for irq “irq”. “clrmask” is applied first and then “setmask” is applied.
- void adeos_stall_ipipe(void);
  Stall the current domain's ipipe stage. Alternative to cli.
- void adeos_unstall_ipipe(void);
  Unstall the current domain's ipipe stage. Alternative to sti.
- void adeos_restore_ipipe(unsigned x);
  Restore the ipipe from its saved state. Alternative to __restore_flags() and local_irq_restore(). This is used with the following defines:
    #define adeos_test_ipipe() test_bit(IPIPE_STALL_FLAG,&adp_current->status)
    #define adeos_test_and_stall_ipipe() test_and_set_bit(IPIPE_STALL_FLAG,&adp_current->status)
  Which replace __save_flags and local_irq_save(), respectively.
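The test-and-stall / restore pair follows the familiar save-flags / restore-flags pattern, which lets critical sections nest safely. The sketch below models that pattern in user space under stated assumptions: `model_test_and_stall()`, `model_restore()`, and `MODEL_STALL_BIT` are hypothetical stand-ins, and a real restore would presumably also deliver any interrupts held back while the stage was stalled.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins; the real calls operate on the
 * current domain's status word inside the kernel. */
#define MODEL_STALL_BIT 0x1u

static unsigned stage_status;        /* bit 0 models the stall flag */
static int      shared_counter;      /* data protected by stalling the stage */
static int      stalled_after_inner; /* records the state seen by the outer section */

/* Set the stall bit and return whether it was already set. */
static unsigned model_test_and_stall(void)
{
    unsigned was_stalled = stage_status & MODEL_STALL_BIT;
    stage_status |= MODEL_STALL_BIT;
    return was_stalled;
}

/* Put the stall bit back to the saved value. */
static void model_restore(unsigned was_stalled)
{
    if (was_stalled)
        stage_status |= MODEL_STALL_BIT;
    else
        stage_status &= ~MODEL_STALL_BIT;
}

static int model_is_stalled(void)
{
    return (stage_status & MODEL_STALL_BIT) != 0;
}

/* Inner critical section: may be called with the stage already stalled. */
static void inner_update(void)
{
    unsigned flags = model_test_and_stall();
    shared_counter++;
    model_restore(flags);  /* restores "stalled" if the caller had stalled */
}

/* Outer critical section that calls into the inner one. */
static void outer_update(void)
{
    unsigned flags = model_test_and_stall();
    inner_update();
    stalled_after_inner = model_is_stalled();  /* still protected here */
    shared_counter++;
    model_restore(flags);
}
```

Because the inner section restores the saved state rather than unconditionally unstalling, the outer section remains protected after `inner_update()` returns.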
To add your domain to the ipipe, you need to:
- Register your domain with Adeos using adeos_register_domain()
- Call adeos_virtualize_irq() for all the IRQs you wish to be notified about in the ipipe
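The two steps above might look roughly like the following. The function signatures are the ones listed in the API above, but everything else is an assumption made so the sketch compiles outside the kernel: the fields of `adattr_t` and `adomain_t`, the stub bodies, the convention that `adeos_register_domain()` returns 0 on success, the meaning of the acknowledge function's return value, and the choice of IRQ 0 and IRQ 1 as examples.

```c
#include <assert.h>

/* --- Stand-ins so the sketch builds in user space. ---------------------
 * The type layouts and stub bodies are hypothetical; only the function
 * signatures come from the API list in this article. */
typedef struct { const char *name; int priority; } adattr_t;
typedef struct {
    const char *name;
    int priority;
    int registered;
    unsigned virtualized;   /* bitmask of IRQs this domain asked for */
} adomain_t;

static adomain_t *stub_current;  /* domain the stubs treat as "current" */

static int adeos_register_domain(adomain_t *adp, adattr_t *attr)
{
    adp->name = attr->name;
    adp->priority = attr->priority;
    adp->registered = 1;
    adp->virtualized = 0;
    stub_current = adp;
    return 0;  /* assumed convention: 0 means success */
}

static void adeos_virtualize_irq(unsigned irq, void (*handler)(void),
                                 int (*acknowledge)(unsigned))
{
    (void)handler;
    (void)acknowledge;
    stub_current->virtualized |= 1u << irq;
}

/* --- A client domain using the two steps from the list above. --------- */
static void my_irq_handler(void)
{
    /* domain-specific interrupt work would go here */
}

static int my_irq_ack(unsigned irq)
{
    (void)irq;
    return 1;  /* return value semantics are an assumption */
}

static adomain_t my_domain;

static int my_domain_init(void)
{
    adattr_t attr = { .name = "mydomain", .priority = 10 };  /* hypothetical fields */
    int err;

    /* Step 1: register the domain with Adeos. */
    err = adeos_register_domain(&my_domain, &attr);
    if (err)
        return err;

    /* Step 2: ask to be notified of each IRQ of interest. */
    adeos_virtualize_irq(0, my_irq_handler, my_irq_ack);
    adeos_virtualize_irq(1, my_irq_handler, my_irq_ack);
    return 0;
}
```

In a real kernel build the stand-in types and stubs would be dropped in favor of include/linux/adeos.h, and the attribute structure would be filled in according to that header.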
During runtime, you may change your position in the ipipe using adeos_renice_domain(). You may also stall/unstall the pipeline and change the ipipe's handling of the interrupts according to your needs.
We currently don't support SMP, but we do have APIC support on UP.
Here are some of the possible uses for Adeos (this list is far from complete):
- Much like User-Mode Linux, it should now be possible to have 2 Linux kernels living side-by-side on the same hardware. In contrast to UML, this would not be 2 kernels one on top of the other, but really side-by-side. Since Linux can be told at boot time to use only one portion of the available RAM, on a 128MB machine this would mean that the first could be made to use the 0-64MB space and the second would use the 64-128MB space. We realize that many modifications are required. Among other things, one of the 2 kernels will not need to conduct hardware initialization. Nevertheless, this possibility should be studied more closely.
- It follows from the first item that adding other kernels beside Linux should be feasible. BSD is a prime candidate, but it would also be nice to see what virtualizers such as VMware and Plex86 could do with Adeos. Proprietary operating systems could potentially also be accommodated.
- All the previous work that has been done on nanokernels should now be easily ported to Linux. Mainly, we would be very interested to hear about extensions to Adeos. Primarily, we currently have no mechanisms enabling multiple domains to share information. The papers mentioned earlier provide such mechanisms, but we'd like to see actual practical examples.
- By incorporating Adeos into the main kernel tree (I know my inbox is probably going to fill up because of this one), kernel debuggers' main problem (tapping into the kernel's interrupts) is solved and it should then be possible to provide patchless kernel debuggers. They would then become loadable kernel modules.
- Drivers that require absolute priority and dislike other kernel portions that use cli/sti can now create a domain of their own and place themselves before Linux in the ipipe. This provides a mechanism for the implementation of systems that can provide guaranteed realtime response.
Of course, we are interested in hearing any comments and suggestions you have about Adeos.
Links to papers:
- [1] http://citeseer.nj.nec.com/probert91space.html
  ftp://ftp.cs.ucsb.edu/pub/papers/space/iwooos91.ps.gz (not working)
  http://www4.informatik.uni-erlangen.de/~tsthiel/Papers/Space-iwooos91.ps.gz
- [2] http://www.cs.ucsb.edu/research/trcs/abstracts/1995-06.shtml
  http://www4.informatik.uni-erlangen.de/~tsthiel/Papers/Space-trcs95-06.ps.gz
- [3] http://citeseer.nj.nec.com/kenneth94caching.html
  http://guir.cs.berkeley.edu/projects/osprelims/papers/cachmodel-OSkernel.ps.gz
- [4] http://citeseer.nj.nec.com/engler95exokernel.html
  ftp://ftp.cag.lcs.mit.edu/multiscale/exokernel.ps.Z
This article was originally published on LinuxDevices.com and has been donated to the open source community by QuinStreet Inc.