New benchmark suits multicore processors

August 11, 2016 OpenSystems Media

Even if you’re using just a single-core microcontroller in your system design, chances are good that you aren’t running just one function or kernel. This is even more true for a multicore processor. Hence, the performance benchmarks that you run to compare devices or analyze the capability of your system should also consist of more complex multi-function and parallel workloads.

What does this mean exactly? From a theoretical perspective, it means that the system should use at least a minimal scheduler to coordinate the execution of the workload’s components (the scheduler could also be part of an RTOS or a more advanced OS such as Linux).
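To make "minimal scheduler" concrete, here is a rough sketch of a cooperative round-robin scheduler. It is an illustration only, not how EEMBC's tools or any particular RTOS work; the names `component` and `run_round_robin` are hypothetical, and Python generators stand in for workload components that yield control voluntarily.

```python
from collections import deque


def component(name, steps, log):
    """Hypothetical workload component: does `steps` units of work,
    yielding control back to the scheduler after each one."""
    for step in range(steps):
        log.append((name, step))  # stand-in for real work
        yield


def run_round_robin(tasks):
    """Run each task for one step in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task until its next yield
        except StopIteration:
            continue            # task finished; do not requeue it
        queue.append(task)      # task still has work; back of the line


log = []
run_round_robin([component("A", 2, log), component("B", 3, log)])
```

After the run, `log` records the interleaving: the two components alternate until A finishes, then B completes its remaining step alone.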

From a practical perspective, take a look at the original EEMBC AutoBench benchmark, which consists of 16 single-function, serial-coded kernels (this version of AutoBench dates back to 1999, long before multicore processors were the norm). Does this benchmark run faster on a 1-GHz single-core processor or a 250-MHz quad-core processor? The simple answer is that the former would be 4X faster: each kernel is single-threaded, so the out-of-the-box AutoBench runs on only one core.
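The 4X figure follows from a simple idealized model, sketched below. The model assumes both processors do the same work per clock cycle and ignores memory, cache, and OS effects; the function name `relative_throughput` is hypothetical.

```python
def relative_throughput(clock_mhz, cores, threads):
    """Idealized throughput: clock rate times the number of cores
    the workload can actually occupy (limited by its thread count)."""
    return clock_mhz * min(cores, threads)


# A single-threaded kernel can occupy only one core on either processor.
single_core = relative_throughput(1000, cores=1, threads=1)
quad_core = relative_throughput(250, cores=4, threads=1)
advantage = single_core / quad_core

# With four threads and perfect scaling, the quad-core part would catch up.
quad_core_parallel = relative_throughput(250, cores=4, threads=4)
```

Under these assumptions `advantage` is 4.0, and the quad-core processor matches the single-core one only if the workload is split across all four cores with no overhead.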

While it’s possible to run the AutoBench kernels in multicore mode by instructing the operating system to launch multiple instances of each kernel, it’s more realistic to run more complex workloads that are subdivided as separate threads. For this reason, EEMBC recently launched AutoBench 2.0, the multicore version that integrates with the consortium’s MultiBench tool.
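The difference between the two approaches can be sketched structurally as follows. This is not the MultiBench API; `kernel`, `run_multiple_instances`, and `run_subdivided` are hypothetical names, and the kernel is a trivial stand-in. Note also that CPython threads do not execute Python bytecode in parallel, so the sketch shows only how the work is organized, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(data):
    """Hypothetical stand-in for a single-function benchmark kernel."""
    return sum(x * x for x in data)


def run_multiple_instances(data, contexts):
    """Launch `contexts` independent copies of the same kernel."""
    with ThreadPoolExecutor(max_workers=contexts) as pool:
        return list(pool.map(kernel, [data] * contexts))


def run_subdivided(data, contexts):
    """Split one workload into slices, one per thread, and combine
    the partial results into a single answer."""
    chunk = (len(data) + contexts - 1) // contexts
    slices = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=contexts) as pool:
        return sum(pool.map(kernel, slices))


data = list(range(1000))
instances = run_multiple_instances(data, 4)  # four identical results
combined = run_subdivided(data, 4)           # one result, computed in parts
```

In the first case the contexts never interact; in the second they cooperate on one result, which is where synchronization and data-sharing overhead enter.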

With the increasing adoption of multicore technology into automotive applications, AutoBench 2.0 provides an important performance metric for system designers testing the efficacy of multicore processors. To demonstrate the effectiveness of the new benchmark, we ran the workloads on a Linux-based Intel Xeon (yes, I know this isn’t an automotive processor, but it provides an easy-to-use test platform).

Results show that when running on 1, 2, 4, and 8 cores, the geometric mean of all workload scores is 275, 519, 785, and 1108, respectively. Relative to the single-core score, this represents a scaling of 1.9X, 2.9X, and 4.0X on 2, 4, and 8 cores, demonstrating that the more workload contexts are enabled, the more overhead is brought into play. From a benchmark perspective, this is a good thing. In other words, it wouldn’t be a very good multicore benchmark if the processor results scaled linearly with the number of cores.
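The scaling factors can be reproduced directly from the reported scores. The short sketch below also derives a per-core efficiency (scaling divided by core count), which is not a figure from the article but a common way to express the same overhead; the function names are hypothetical.

```python
# Geometric-mean scores reported above, keyed by core count.
scores = {1: 275, 2: 519, 4: 785, 8: 1108}


def scaling(scores):
    """Speedup of each configuration relative to the single-core score."""
    base = scores[1]
    return {cores: score / base for cores, score in scores.items()}


def efficiency(scores):
    """Speedup divided by core count; 1.0 would be perfectly linear scaling."""
    return {cores: s / cores for cores, s in scaling(scores).items()}


speedup = {cores: round(s, 1) for cores, s in scaling(scores).items()}
per_core = {cores: round(e, 2) for cores, e in efficiency(scores).items()}
```

Here `speedup` comes out as 1.0, 1.9, 2.9, and 4.0 for 1, 2, 4, and 8 cores, and `per_core` falls from 1.0 to roughly 0.94, 0.71, and 0.5, so each additional doubling of cores buys less.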

Markus Levy is president of EEMBC, which he founded in April 1997. As president, he manages the business, marketing, press relations, member logistics, and supervision of technical development. Mr. Levy is also president of the Multicore Association, which he co-founded in 2005.

Markus Levy, EEMBC