Renesas Processing-In-Memory Technology Delivers 8.8 TOPS/W AI Performance

June 13, 2019 Brandon Lewis

Renesas has developed an AI accelerator that achieved 8.8 TOPS/W when processing CNN algorithms on a test chip. The accelerator is based on a processing-in-memory (PIM) architecture that performs multiply-accumulate (MAC) operations in a memory circuit as data is being read from the circuit. It will be part of the company’s family of embedded AI (e-AI) edge compute solutions.

The PIM accelerator integrates:

  • A ternary (-1, 0, 1) SRAM structure – Combined with a digital calculation block that minimizes calculation errors, the ternary structure lets the accelerator switch its calculation bit width to match the required accuracy, for example between 1.5-bit (ternary) and 4-bit operation. This allows users to balance accuracy against power consumption.
  • A specialized SRAM circuit that reads memory at low power – A 1-bit comparator and a replica cell allow the SRAM current to be controlled, resulting in a precision memory readout circuit. The circuit also ceases operation altogether when it encounters a neural network node (neuron) that is not activated.
  • Technology that prevents calculation errors due to manufacturing process variations – To offset process variations that can cause errors in the values of SRAM bit line currents, multiple SRAM calculation circuit blocks are laid out inside the chip. Node calculations are then selectively allocated to the blocks with the least process variation. Renesas believes this reduces calculation errors to the point that they become essentially negligible.
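The block allocation in the third item can be pictured with a short Python sketch. It is only a software illustration, not Renesas's implementation: the per-block variation figures, the block names, the `allocate_nodes` function, and the round-robin assignment are all assumptions made for the example.

```python
def allocate_nodes(block_variation, num_nodes, blocks_to_use):
    """Assign node calculations to the blocks with the least variation.

    block_variation maps a (hypothetical) block name to a measured
    variation figure; lower means closer to the nominal bit line current.
    """
    # Rank block names from least to most variation.
    ranked = sorted(block_variation, key=block_variation.get)
    # Keep only the best blocks.
    chosen = ranked[:blocks_to_use]
    # Spread the node calculations round-robin across the chosen blocks.
    return {node: chosen[node % len(chosen)] for node in range(num_nodes)}


# Hypothetical variation figures for four calculation blocks.
variation = {"B0": 0.08, "B1": 0.02, "B2": 0.05, "B3": 0.11}
assignment = allocate_nodes(variation, num_nodes=4, blocks_to_use=2)
print(assignment)  # {0: 'B1', 1: 'B2', 2: 'B1', 3: 'B2'}
```

Blocks B0 and B3, which have the largest assumed variation, receive no work at all, which is the effect the article describes.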

Together, the three technologies minimize memory access time as well as the power consumption of MAC operations. The ternary, as opposed to binary, SRAM structure also enables high accuracy in large-scale CNN workloads.
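To see why ternary weights and inactive-neuron skipping lower the cost of a MAC operation, consider the following minimal Python sketch. It is a software analogy, not a model of the Renesas circuit: the `ternarize` threshold of 0.5 and both function names are assumptions chosen for illustration.

```python
def ternarize(weights, threshold=0.5):
    """Map real-valued weights to -1, 0, or +1.

    The threshold is an assumed value for this example only.
    """
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]


def ternary_mac(activations, ternary_weights):
    """Multiply-accumulate with ternary weights.

    Because each weight is -1, 0, or +1, every multiply reduces to an
    add, a subtract, or no work. Pairs with a zero activation (an
    inactive neuron) or a zero weight are skipped entirely.
    """
    acc = 0
    skipped = 0
    for a, w in zip(activations, ternary_weights):
        if a == 0 or w == 0:
            skipped += 1
            continue
        acc += a if w == 1 else -a
    return acc, skipped


weights = ternarize([0.9, -0.2, -0.7, 0.8])          # [1, 0, -1, 1]
result, skipped = ternary_mac([3, 5, 4, 0], weights)
print(result, skipped)  # -1 2
```

In this example two of the four weight-activation pairs require no arithmetic, and the remaining two need only one addition and one subtraction.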

According to Renesas, the resulting 8.8 TOPS/W is the industry’s highest power efficiency at an accuracy ratio of more than 99 percent. The company reportedly demonstrated this performance at the 2019 Symposia on VLSI Technology and Circuits, where the test chip was connected to a small battery and various input peripherals.
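For a sense of scale, the efficiency figure can be converted into energy per operation, since 1 TOPS/W equals 10^12 operations per joule. The Python sketch below does that arithmetic. The 10 mW power budget is a hypothetical value used only for illustration, and the article does not state how an "operation" is counted, so the results are indicative rather than exact.

```python
def energy_per_op_joules(tops_per_watt):
    """Convert TOPS/W to joules per operation (1 TOPS/W = 1e12 ops per joule)."""
    ops_per_joule = tops_per_watt * 1e12
    return 1.0 / ops_per_joule


def throughput_tops(tops_per_watt, power_watts):
    """Throughput in TOPS at a given power budget, assuming constant efficiency."""
    return tops_per_watt * power_watts


femtojoules = energy_per_op_joules(8.8) * 1e15
print(f"{femtojoules:.1f} fJ per operation")  # 113.6 fJ per operation

# Hypothetical 10 mW budget, not a figure from Renesas.
print(f"{throughput_tops(8.8, 0.010):.3f} TOPS at 10 mW")  # 0.088 TOPS at 10 mW
```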

Renesas sees the accelerator’s performance per watt as an enabler of incremental learning directly on endpoints. More information on the company’s e-AI technology is available from Renesas.


