A Tiny OS That Scales Up

This article originally appeared in BYTE Magazine, May 1998

QNX Software Systems Limited (QSSL) created the QNX operating system in the very early 1980s. It started out as a realtime implementation of the UNIX API, and rapidly found its way into applications like medical instrumentation, process control, and point-of-sale terminals, to name just a few. QNX has always used message passing as its main inter-process communication method, and after the first few versions, transparently distributed this concept over a local area network, allowing multiple machines to function as a single logical machine.

In the early 1990s, a new version, QNX 4, was introduced. It built upon the success of the earlier versions, supplied many POSIX features and commands along with a POSIX API, and was POSIX certified. Due to its more "open" architecture, it found even wider acceptance in the industry, eventually finding its way into TV set-top web browsers, deeply embedded control systems, and so on.

In 1997, QNX/Neutrino was introduced. It carried on the traditions established by the earlier versions while adding much more comprehensive POSIX support, POSIX threads, and better scalability (down to even smaller embedded systems and up to even larger high-end systems).

In this article, I'd like to give you a guided tour of QNX/Neutrino's functionality — not only from the perspective of what's "new and improved", but also what was kept in the design ("if it ain't broken, don't fix it").

Threads

One of the notable changes in QNX/Neutrino over previous versions is POSIX "system scope" thread support [the ability to run multiple flows of execution through a process; "system scope" refers to the fact that all threads compete directly against each other for CPU resources]. While QNX 4 had rudimentary thread support, QNX/Neutrino's thread support was deliberately designed into the kernel, with the thread as the minimal unit of schedulable execution. This greatly simplifies the kernel design, because the kernel only has to worry about scheduling threads.

Size & Scalability

QNX operating system products have always adopted a "small is good" philosophy. QNX/Neutrino is no exception. The QNX/Neutrino kernel is about 32k of code, and the process manager is also about 32k of code. This dispels the common myth that any useful POSIX-compliant system must be "big". The kernel supports message passing, signals, interrupt handlers, timers, clocks, and thread level scheduling; while the process manager's 32k of code adds memory allocation, process contexts (memory protection), resource manager namespaces, and other "process" extensions to the kernel.

This has allowed QNX/Neutrino to fit into even more severely memory-constrained applications where previous versions of QNX could not.

While fitting a full POSIX operating system into a small amount of space is certainly impressive, QNX/Neutrino has scored another first with its scalability on the high end. The same operating system will work on tiny embedded boxes, right up through to large, symmetric multiprocessor (SMP) boxes. One advantage is that development staff incur no additional learning time in order to scale a design up or down.

As an OS development group, it also means that QSSL does not need to develop, maintain, support, and QA a large number of OS versions. The effort applied to a single OS scales across a wide range of application domains.

Message Passing

All QNX operating systems have had message passing. QNX/Neutrino has kept message passing in basically the same form as previous versions of QNX, making just a few tweaks here and there.

Since message passing is fundamental to understanding the QNX operating system and its advantages, we should look at this topic in more depth.

Message passing refers to the mechanism used by one thread to communicate with another thread. The important (and perhaps subtle) point here is that the two threads may be in the same process, may be in different processes on the same machine, or may even be on different machines connected by a LAN — it makes no difference. With message passing, the two communicating threads assume a client/server relationship.

The client thread constructs a message, and sends it to the server thread. At this point, the client thread blocks (it generally enters the REPLY blocked state — meaning, it's waiting for a "reply" from the server). The server thread receives the message from the client, and performs some kind of processing (depending upon the message content). When the server is finished, it replies to the client with the results. At this point, the client unblocks, and runs.

Note that the kernel performs the message transfer from the client to the server, and back. On reflection, this makes sense, because in a memory-protected model, only the kernel has access to both the client's and the server's memory.

Now, why is message passing important?

Message passing allows you to design your target "system" as a set of cooperating processes. Each individual process has a fixed area of responsibility, and may provide services to, or request services from, other processes. What this really means is that if you carefully consider the impacts of message passing early on in the design phase, you can design a system in which programs are decoupled from each other, allowing much easier integration and unit testing, upgrading, and even multi-node distribution.

Message passing is the underlying implementation model of the system. Therefore, some of the POSIX and ANSI C library functions that you use in your programs (like fopen(), lseek(), write(), and others) will use message passing on your behalf, transparently to you.

The main advantage of message passing gets back to the point above about who can use it: message passing works consistently regardless of the location of the two communicating threads.

Let me illustrate this by way of an example.

In a conventional operating system, to open and write to a file on a local hard disk, the C library would make a kernel call on your behalf, the kernel would call into the filesystem driver, and the filesystem driver would then open the file and write the data to disk. However, to open and write to a file on a remote filesystem, the C library still makes a kernel call, but this time the kernel decides that the instructions about opening the file and writing the data need to be sent to a different node. So a message is formatted and sent out over the LAN to another kernel, which then handles the request. This results in a double standard: actions in the local case are performed one way, and actions in the remote case are handled in another, completely different, manner.

Contrast that to what happens under the QNX operating system. Under QNX, a program decides to open and write to a file. In the local case, the C library generates an open message and a write message, and sends them to the filesystem process. The kernel transfers the message from the client (the C program) to the server (the filesystem), and the filesystem performs the work. In the remote case, the exact same thing happens, except that the kernel detects that the target of the message (the server) is on a different node, and transmits the message there instead. The nice thing about this approach is that the exact same server interface is usable by clients on the same node or on different nodes — no further (or "extra") work needs to be done to handle one case or the other.

This can be a very powerful feature when a design has to be expanded to span multiple nodes. This might be required if the hardware that the nodes are controlling is in physically separate areas of a large building. Or, if the number of processes that need to be run on a particular node exceeds the CPU "horsepower", the processes can be distributed over multiple CPUs.

QNX/Neutrino's Openness

Another interesting aspect of QNX/Neutrino is the amount of work that has gone into making it "open". If you look at the history of QSSL's operating systems, each release became more and more open, meaning that the programmer could extend the system in a consistent and transparent manner.

While a significant part of the effort of making something "open" can be in the documentation, this isn't where the most R&D effort was spent for QNX/Neutrino.

Documenting a poor design results in an open system, but one which is not very usable. QNX/Neutrino's design was based on the concept of keeping things simple, and allowing true openness — the ability for third parties to customize the operating system by adding, removing, and tailoring pieces as they see fit.

Every function call used by the supplied drivers is documented and available, allowing developers to extend the system.

Since QNX/Neutrino is scalable down to the very small end of the spectrum, the message passing design allows components (such as serial port drivers, disk drivers and filesystems, etc.) to be dynamically installed at runtime. This allows a larger development and debugging system to be created, with certain parts "trimmed off" for field deployment.

And finally, while QNX/Neutrino is currently only available for the x86 series of processors, its design makes it portable to other processors (for example, the kernel is coded almost entirely in C). This was a deliberate business decision made by QSSL to minimize the amount of work required, should an opportunity present itself for porting to another processor platform.