New benchmark suits multicore processors

August 11, 2016

Even if you’re just using a single-core microcontroller in your system design, chances are good that you aren’t running just one function or kernel. This is even more true for a multicore processor. Hence, the performance benchmarks that you run to compare devices or analyze the capability of your system should also consist of more complex multi-function and parallel workloads.

What does this mean exactly? From a theoretical perspective, it means that the system should use at least a minimal scheduler to coordinate the execution of the workload’s components (the scheduler could also be part of an RTOS or a more advanced OS such as Linux).

From a practical perspective, take a look at the original EEMBC AutoBench benchmark, which consists of 16 single-function, serial-coded kernels (this version of AutoBench dates back to 1999, long before multicore processors were the norm). Does this benchmark run faster on a 1-GHz single-core processor or a 250-MHz quad-core processor? The simple answer is that the former would be 4X faster: each kernel is single-threaded, so the out-of-the-box AutoBench runs on only one core, and the other three cores of the quad-core processor sit idle.
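The arithmetic behind that 4X answer can be sketched with a deliberately idealized model. The function name `relative_throughput` is hypothetical, and the model assumes both processors do the same amount of work per clock cycle and that threads scale with no overhead, which real hardware will not match exactly.

```python
def relative_throughput(clock_mhz, cores, threads):
    """Idealized throughput: clock rate times the number of cores kept busy.

    Assumes equal work per clock on every core and zero threading overhead.
    """
    busy_cores = min(cores, threads)  # extra cores idle if there are too few threads
    return clock_mhz * busy_cores

# A single-threaded kernel, as in the original AutoBench.
single_core = relative_throughput(clock_mhz=1000, cores=1, threads=1)
quad_core = relative_throughput(clock_mhz=250, cores=4, threads=1)
ratio = single_core / quad_core
print(f"single-threaded: 1-GHz single-core is {ratio:.0f}X faster")

# The same quad-core part with four threads, under ideal scaling.
quad_core_threaded = relative_throughput(clock_mhz=250, cores=4, threads=4)
print(f"four threads: quad-core reaches {quad_core_threaded} vs {single_core}")
```

Under these assumptions, the quad-core processor only catches up once the workload supplies at least four threads, which is the gap a multicore benchmark is meant to expose.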

While it’s possible to run the AutoBench kernels in multicore mode by instructing the operating system to launch multiple instances of each kernel, it’s more realistic to run more complex workloads that are subdivided as separate threads. For this reason, EEMBC recently launched AutoBench 2.0, the multicore version that integrates with the consortium’s MultiBench tool.
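The difference between the two approaches can be illustrated with a small sketch. This is not EEMBC's harness or the MultiBench API; `kernel`, `run_multiple_instances`, and `run_subdivided` are hypothetical names, and the sum-of-squares kernel is only a stand-in for a real benchmark kernel.

```python
from concurrent.futures import ThreadPoolExecutor

def kernel(data):
    """Stand-in for a single-threaded benchmark kernel: sum of squares."""
    return sum(x * x for x in data)

def run_multiple_instances(data, contexts):
    """Launch `contexts` independent copies of the same kernel on the full data set."""
    with ThreadPoolExecutor(max_workers=contexts) as pool:
        return list(pool.map(kernel, [data] * contexts))

def run_subdivided(data, workers):
    """Split one workload into up to `workers` slices and combine the partial results."""
    chunk = (len(data) + workers - 1) // workers  # ceiling division
    slices = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(kernel, slices))

data = list(range(1000))
instance_results = run_multiple_instances(data, contexts=4)  # four identical answers
combined_result = run_subdivided(data, workers=4)            # one answer, computed in parts
print(instance_results, combined_result)
```

In the first mode the cores never cooperate, so the run mostly measures how many copies the machine can host. In the second mode the threads share one job and must be split and recombined, which is closer to how a real application uses several cores. Note that standard CPython threads do not execute Python bytecode in parallel, so this sketch shows the structure of the two modes rather than an actual speedup.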

With the increasing adoption of multicore technology into automotive applications, AutoBench 2.0 provides an important performance metric for system designers testing the efficacy of multicore processors. To demonstrate the effectiveness of the new benchmark, we ran the workloads on a Linux-based Intel Xeon (yes, I know this isn’t an automotive processor, but it provides an easy-to-use test platform).

Results show that when running on 1, 2, 4, and 8 cores, the geometric mean of all workload scores is 275, 519, 785, and 1108, respectively. Relative to the single-core score, this represents a scaling of 1.9, 2.9, and 4.0 on 2, 4, and 8 cores, demonstrating that the more workload contexts are enabled, the more overhead is brought into play. From a benchmark perspective, this is a good thing. In other words, it wouldn’t be a very good multicore benchmark if the processor results scaled linearly with the number of cores.
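The scaling figures follow directly from the reported scores. As a quick check, the sketch below divides each score by the single-core score, and also computes a per-core efficiency (scaling divided by core count), which is a derived figure added here for illustration and not one reported in the results above.

```python
# Geometric-mean scores reported for the Xeon test platform, keyed by core count.
scores = {1: 275, 2: 519, 4: 785, 8: 1108}

baseline = scores[1]
scaling = {cores: score / baseline for cores, score in scores.items()}
efficiency = {cores: scaling[cores] / cores for cores in scores}

for cores in scores:
    print(f"{cores} core(s): scaling {scaling[cores]:.1f}X, "
          f"efficiency {efficiency[cores]:.0%}")
```

The efficiency column makes the growing overhead easier to see: each doubling of the core count yields a smaller share of the ideal linear gain.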

Markus Levy is president of EEMBC, which he founded in April 1997. As president, he manages the business, marketing, press relations, member logistics, and supervision of technical development. Mr. Levy is also president of the Multicore Association, which he co-founded in 2005.

Markus Levy, EEMBC