COM-HPC Academy: Heterogeneous Computing on COM-HPC

December 16, 2020

Video

I’ve written a bunch of blogs lately touting the features of computer-on-modules (COMs), specifically regarding the new high-performance specification, COM-HPC. One topic that hasn’t been covered is how you control, monitor, and manage those heterogeneous computing modules. So that’s what I’ll do here, although lots more information can be gleaned from a video that features Jens Hagemeyer and Martin Kaiser, a pair of research associates at Bielefeld University in Germany.

Module management was developed in a COM-HPC subgroup. The group made some significant changes relative to the predecessor spec, COM Express. That earlier standard employs a module-centric management architecture; COM-HPC supports a multi-module architecture, even with multiple modules housed on the same carrier board. Extra care is needed when multiple processor modules are used, but the practice stays within the specification’s guidelines. While COM-HPC modules will likely be based on the latest microprocessors, they can also be designed with FPGAs or GPUs.

Module Management

The basic elements of module management are similar to those of COM Express. On top of them, you can add a module management controller (MMC), which provides that management capability. Hence, management can be handled from either the COM or the carrier board. Various configurations are available, including one that provides faster monitoring from the MMC. That control can also extend over the PCI Express or USB connection, and even I2C, if desired.

In this scenario, the Redfish API, which can serve as a replacement for the Intelligent Platform Management Interface (IPMI), can provide the system management functionality in a relatively simple and secure manner. In fact, PICMG, the standards body overseeing COM-HPC, is recommending Redfish as the external management interface. IPMI may be supported as well, but it’s considered a legacy interface, potentially used for internal communications between modules.
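To make the Redfish approach concrete: it is a RESTful API in which the management controller exposes system resources as JSON documents over HTTPS. The sketch below parses a trimmed-down thermal resource of the kind a Redfish service might return; the sample payload and the `summarize_thermal` helper are illustrative assumptions for this post, not part of the COM-HPC specification.

```python
import json

# Illustrative sample (not from the article): a trimmed Redfish Thermal
# resource, the kind of JSON a management controller might return from
# GET /redfish/v1/Chassis/<id>/Thermal over HTTPS.
SAMPLE_THERMAL = json.dumps({
    "@odata.type": "#Thermal.v1_6_0.Thermal",
    "Temperatures": [
        {"Name": "CPU Temp", "ReadingCelsius": 54,
         "Status": {"Health": "OK"}},
        {"Name": "FPGA Temp", "ReadingCelsius": 71,
         "Status": {"Health": "Warning"}},
    ],
})

def summarize_thermal(payload: str) -> dict:
    """Map each sensor name to its (reading in Celsius, health state)."""
    doc = json.loads(payload)
    return {
        t["Name"]: (t["ReadingCelsius"], t["Status"]["Health"])
        for t in doc.get("Temperatures", [])
    }

print(summarize_thermal(SAMPLE_THERMAL))
# {'CPU Temp': (54, 'OK'), 'FPGA Temp': (71, 'Warning')}
```

Because every resource is plain JSON behind a well-known URL tree, the same pattern works whether the sensors sit on a CPU module, an FPGA module, or the carrier board, which is part of why it suits a multi-module architecture.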

COM-HPC and Heterogeneous Computing

Bear in mind that the last revision of COM Express (R3) was more than three years ago, and it was aimed mostly at modules designed with x86-based microprocessors. Because many alternatives have since hit the market, like those based on Arm and RISC-V, as well as FPGAs and GPUs, there was clearly a need for an update. These massively parallel processing devices offer high acceleration and increased energy efficiency for emerging applications like AI, computer vision, and applications looking to take advantage of 5G.

“It was clear where the COM-HPC journey should go,” says Kaiser. “We want to support all of these architectures to cover the widest range of applications.”

The first use case (upper image) shows a CPU in the same environment with GPU- and FPGA-based accelerators. The second use case (lower image) adds PCIe as an accelerator.

Interacting with those emerging applications will be easier due to the use of open standards and interfaces. For added flexibility, the COM-HPC module can even be used as a PCI Express endpoint or target device. As a result, different heterogeneous modules can be coupled together to create more flexible architectures.

Finally, you may be wondering, like I was, why a university, with no physical products to sell, would want to expend so many resources helping to develop an industry standard. It turns out that Bielefeld University designs many of its own modules for in-house computing platforms. If those could interchange and interoperate with commercially available modules, that would simplify the process and potentially save the university both time and money. Hence, it wanted to be sure that the new standard would meet its needs for heterogeneous computing.