Making the switch to RapidIO

By Paul N. Leroux

By shuttling data at up to 10 Gbps, the RapidIO interconnect promises to break the bandwidth bottleneck that legacy buses such as PCI impose. In fact, when compared to conventional shared buses, RapidIO offers a number of advantages, including lower latency, a lower pin count, a smaller silicon footprint, and a point-to-point architecture that ensures both greater concurrency and higher fault tolerance. This article points out the many advantages of a RapidIO interconnect system.

RapidIO is a within-the-box switched fabric optimized for short-distance chip-to-chip and board-to-board communications. It can, however, work in tandem with longer-range interconnects. For instance, a design could use InfiniBand to connect multiple storage systems together, while using RapidIO to manage the data flow within each system. RapidIO can also work with PCI to simplify migration from existing designs. For instance, designers can use it to create a PCI-to-PCI bridge that has fewer pins, yet offers far more bandwidth, than conventional bridges.

The software challenge
While RapidIO provides a fast, reliable hardware interconnect, system designers still must find or develop software that can fully realize the benefits of RapidIO’s advanced features, particularly its support for distributed, multiprocessor systems. The problem isn’t with RapidIO itself, which is generally software-transparent, but with software that can’t easily migrate to high-speed distributed designs. Sometimes the problem lies with applications that rely on hardware-specific features or protocols. In many cases, however, the problem isn’t with the application software, but with its underlying operating system (OS).

Consider what happens whenever an application needs to access a system service, such as a device driver, file system, or protocol stack. In most OSs, such services run in the kernel and, as a result, applications must use OS kernel calls to access them. Since kernel calls don’t cross processor boundaries, the services are available only to applications on the local CPU. It thus becomes difficult, if not impossible, to distribute an application and its related services across the multiple processors of a RapidIO system; they’re effectively locked together on the same processor.

Microkernel OSs offer a way out of this dilemma. In a microkernel OS, only the most fundamental OS primitives (e.g., threads, mutexes, timers) run in the kernel itself. All other services, including drivers and protocol stacks, run outside of the kernel as separate, memory-protected processes. Since these services don’t run in the kernel, applications don’t have to access them via kernel calls. Instead, applications can use message passing, a form of IPC that, when properly implemented, can flow across processor boundaries. Consequently, an application can access virtually any remote OS service, simply by sending it appropriate messages (see Figure 1). In fact, a well-designed, message-passing mechanism can make the location of that service fully transparent to applications. This transparency allows applications to discover the service dynamically, and allows developers to implement the service in a load-balanced, redundant fashion. (See Figure 2.)



Figure 1. In a microkernel OS, system services such as file systems, protocol stacks, and device drivers reside outside of the kernel. Applications can, as a result, use message passing to access virtually any system service, regardless of whether the service resides on the same processor or a different one.



Figure 2. As this scenario illustrates, the location transparency provided by a message-passing OS simplifies the design of redundant, fault-tolerant systems based on RapidIO. Issues such as which OS service handles a client application's request, where that service is located, and whether multiple services may process the request (such as when a service is duplicated for load-balancing) are all neatly abstracted from the application.

Still, most developers balk at the thought of having to develop such a messaging layer, or even of dealing with a proprietary messaging API. As it turns out, however, a microkernel OS can encapsulate this distributed message passing within standard, universally known POSIX calls, such as open(), read(), write(), and lseek(). There’s no need to master the intricacies of a complex messaging protocol.

Using the QNX Neutrino microkernel OS as an example, let’s say an application wants to send a message to a device driver, asking it to write some data. To do this, the application can simply issue a POSIX write() call on a symbolic name identifying that driver. The underlying C library, not the application, will then convert that call into a write message. At this point, one of two things will happen. If the driver is local, the OS microkernel will route the message directly. However, if the driver is on another processor, then a network manager – an OS process dedicated to forwarding remote messages – will send the message to that processor. Either way, the OS takes care of resolving where the message should go.

In effect, it doesn’t matter whether the driver is local or remote. The application can send the exact same message, using the exact same code, in either case. Additionally, since this same message-passing model can apply to all services in a microkernel OS, the application can transparently access virtually any resource, regardless of its location.
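To make this concrete, here is a minimal sketch in C of what such application code might look like. It uses only standard POSIX calls; the function name send_to_service() and any path names passed to it are illustrative assumptions, not taken from QNX documentation. The point is that the function contains nothing that depends on where the service lives.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send a buffer to whatever service owns `path`, using only POSIX calls.
 * send_to_service() is a hypothetical helper written for this sketch.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t send_to_service(const char *path, const void *buf, size_t len)
{
    /* The path is the only thing that identifies the service. */
    int fd = open(path, O_WRONLY);
    if (fd == -1)
        return -1;

    const char *p = buf;
    size_t done = 0;

    /* write() may accept fewer bytes than requested, so loop until done. */
    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    close(fd);
    return (ssize_t)done;
}
```

Under this model, a caller might pass a hypothetical local name such as "/dev/ser1" in one configuration and a hypothetical name that resolves to a driver on another processor in a different configuration. Only the string changes; the code that opens, writes, and closes stays the same.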

Managing redundant links
To enable fast, predictable response times, the RapidIO architecture offers both determinism (through multiple message-priority levels) and low latency. RapidIO can maintain this low latency even under heavy network loads, thanks to several features. For instance, if a link is busy, a packet doesn’t have to return to the original source device; rather, it simply waits for the link to become available. In addition, if the network load is particularly heavy, RapidIO allows the system designer to implement multiple links that, together, provide extremely high bandwidth. These links can also provide greater fault tolerance: if one link becomes unavailable, the system can re-route the data over one or more of the remaining links.

Nonetheless, conventional OS architectures don’t offer built-in support for multiple redundant links between processors, whether those links are based on RapidIO or some combination of RapidIO, fiber, Ethernet, serial, and so on. Consequently, it’s up to the software developer to implement support for each link, a task the developer must do by hand on an application-by-application basis. For instance, each application might have to specify the primary link it wishes to use, along with any alternate links and how to use those links. To complicate matters, the number of links, the kinds of links, and the policy for each link (e.g., use link x for failover, link y for load balancing) can change from device to device, or even from installation to installation. In fact, each pair of processes talking across the links might require a different policy.

Good design dictates a higher level of abstraction, where application software doesn’t have to be concerned with managing such low-level issues. In an OS where message passing forms the central method of IPC, much of this abstraction is already in place. As noted, a dedicated network manager, rather than the applications, can handle the job of passing messages to remote processors. It’s logical, therefore, to go one step further and have that same manager control the flow of messages over redundant links. That way, the application only has to request the appropriate policy that the network manager provides (e.g., “Use an alternate link if my preferred link becomes unavailable”), and the network manager will take care of monitoring links, redirecting traffic, and administering other low-level tasks. In effect, the network manager hides the dirty work from the application. This is the approach that the QNX Neutrino RTOS takes, in which a network manager provides several options to control traffic across redundant links, allowing the software developer to boost throughput or fault tolerance, or both (Figure 3).
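The kind of decision a network manager makes on the application's behalf can be sketched in a few lines of C. This is not the QNX Neutrino network manager's actual interface; the types, the policy names, and choose_link() are all hypothetical, invented here to illustrate how a failover policy and a load-balancing policy might differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies an application might request. */
enum link_policy { POLICY_FAILOVER, POLICY_LOAD_BALANCE };

struct link {
    const char *name;   /* illustrative label, e.g. "rapidio0" */
    bool up;            /* maintained by link monitoring */
};

struct link_set {
    struct link *links;
    size_t count;
    size_t preferred;   /* index of the preferred link (failover) */
    size_t next;        /* round-robin cursor (load balancing) */
};

/*
 * Pick the link for the next message.
 * Failover: use the preferred link, else the next link that is up.
 * Load balancing: rotate through the links that are up.
 * Returns the link index, or -1 if no link is available.
 */
int choose_link(struct link_set *set, enum link_policy policy)
{
    if (set->count == 0)
        return -1;

    size_t start = (policy == POLICY_FAILOVER) ? set->preferred : set->next;

    for (size_t i = 0; i < set->count; i++) {
        size_t idx = (start + i) % set->count;
        if (set->links[idx].up) {
            if (policy == POLICY_LOAD_BALANCE)
                set->next = (idx + 1) % set->count;
            return (int)idx;
        }
    }
    return -1;
}
```

Because the selection logic lives in one place, a change in the number or type of links means updating the link table rather than every application that sends messages.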



Figure 3. The abstraction provided by message passing allows applications to communicate over redundant network links in a RapidIO system. The job of deciding which message should flow over which link, and when, can all be handled by OS services.

A complementary advantage for systems design
It is evident that a message-passing microkernel RTOS can eliminate much of the complexity in building a distributed RapidIO system. There are, however, other ways in which a microkernel OS and RapidIO can work together to improve the design of networking equipment. For instance, because a microkernel architecture encapsulates every service and every application in a separate memory-protected process, a system can recover from errors in almost any software module. A microkernel architecture can even dynamically restart a faulty device driver, without a system reset or user intervention. This software fault-tolerance complements RapidIO’s ability to recover from hardware errors, again without intervention. As a result, systems using both a microkernel OS and the RapidIO interconnect can achieve greater uptime and reliability with little additional effort on the part of the systems designer.
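The restart idea rests on process isolation: a service that runs as its own process can crash without taking its supervisor down with it. The following C sketch illustrates that principle with plain POSIX fork() and waitpid(). It is an assumption-laden illustration, not how QNX Neutrino implements driver recovery; supervise() and flaky_driver() are hypothetical names.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A service entry point; `attempt` counts starts from 0. */
typedef int (*service_fn)(int attempt);

/*
 * Run `service` in a child process and restart it whenever it crashes
 * or exits with a nonzero status. Returns the number of restarts that
 * were needed, or -1 if fork/waitpid fails or max_restarts is exceeded.
 */
int supervise(service_fn service, int max_restarts)
{
    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;

        if (pid == 0)                       /* child: run the service */
            _exit(service(attempt));

        int status;
        if (waitpid(pid, &status, 0) == -1) /* parent: wait for the child */
            return -1;

        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt;                 /* clean exit, no restart needed */
        /* Otherwise the child crashed or failed: loop and start it again. */
    }
    return -1;
}

/* Hypothetical driver that crashes on its first two starts. */
int flaky_driver(int attempt)
{
    if (attempt < 2)
        abort();    /* simulated fault; kills only the child process */
    return 0;
}
```

Calling supervise(flaky_driver, 5) in this sketch restarts the simulated driver twice before it runs cleanly, and the supervising process itself is never affected by the faults.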

. . . . .

Paul Leroux is a technology analyst at QNX Software Systems, where he has served in various roles since 1990. His areas of focus include OS architecture, high availability systems, and integrated development environments.

For more information about QNX Neutrino, visit the QNX website at www.qnx.com.