Flex Logix Announces Working Silicon of AI Edge Inference Chip

By Tiera Oliver

Associate Editor

Embedded Computing Design

October 22, 2020


InferX X1 is aimed at neural network workloads such as object detection and recognition in robotics, industrial automation, medical imaging, and more.

Flex Logix announced availability of its InferX X1, an AI inference chip for edge systems. According to the company, the chip is suited to running neural network models, such as object detection and recognition, in robotics, industrial automation, medical imaging, and other applications.

Customers can use YOLOv3 in products for robotics, bank security, and retail analytics because it offers high accuracy as an object detection and recognition algorithm. Customers can also use InferX X1 for custom models they have developed for a range of applications that need more throughput at lower cost. Flex Logix says it has benchmarked models for these applications and demonstrated to these customers that InferX X1 provides the needed throughput at lower cost.

The InferX X1 silicon area is 54 mm², which is about 1/5th the size of a US penny. Flex Logix says InferX X1’s high-volume price enables high-quality, high-performance AI inference to be implemented in mass market products.
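
The penny comparison can be checked with a little arithmetic. The sketch below assumes a penny diameter of 19.05 mm (the US Mint figure for the one-cent coin, which is not stated in the article) and treats the coin face as a plain circle; the function names are illustrative only.

```python
import math

PENNY_DIAMETER_MM = 19.05  # assumption: US Mint diameter of the one-cent coin
DIE_AREA_MM2 = 54.0        # InferX X1 die area quoted in the article


def penny_area_mm2(diameter_mm: float = PENNY_DIAMETER_MM) -> float:
    """Area of a circular coin face in square millimetres."""
    radius = diameter_mm / 2.0
    return math.pi * radius ** 2


def die_to_penny_ratio(die_area_mm2: float = DIE_AREA_MM2) -> float:
    """Fraction of the penny's face that the die would cover."""
    return die_area_mm2 / penny_area_mm2()


if __name__ == "__main__":
    print(f"penny area:  {penny_area_mm2():.1f} mm^2")
    print(f"die / penny: {die_to_penny_ratio():.2f}")
```

Under these assumptions the ratio comes out a little under 0.2, which is consistent with the "1/5th" figure.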

According to Flex Logix, InferX X1’s software makes it easy to adopt. The InferX Compiler takes models in TensorFlow Lite or ONNX format and uses them to program the InferX X1.

Based on multiple Flex Logix proprietary technologies, the InferX X1 features a new architecture that, per the company, achieves more throughput from less silicon area. Flex Logix’s XFLX double-density programmable interconnect is already used in the eFPGA (embedded FPGA) that Flex Logix has supplied to multiple customers, including Dialog, Boeing, Sandia National Labs, and Datang Telecommunications. This is combined with a reconfigurable Tensor Processor consisting of 64 1-Dimensional Tensor Processors that can be reconfigured to implement a wide range of neural network operations. Because reconfiguration takes only microseconds, each layer of a neural network model can run on full-speed data paths optimized for that layer.

InferX X1 mass production chips and software will be available in Q2 2021. Customer samples and advance compiler and software tools will be available in Q1 2021. Customers with neural network models in TensorFlow Lite or ONNX and volume applications in 2021 may contact Flex Logix now for performance benchmarking, early sampling and tool access, and detailed specifications and pricing.

Technology Details and Specifications:

- High MAC utilization, up to 70% for large models/images, translates into less silicon area/cost
- 1-Dimensional Tensor Processors (1D TPUs) are a 1D systolic array
  - 64 byte input tensor
  - 64 INT8 MACs
  - 32 BF16 MACs
  - 64 byte x 256 byte weight matrix
  - One dimensional systolic array produces an output tensor every 64 cycles using 4096 MAC operations
- Reconfigurable Tensor Processor made up of 64 1D TPUs per X1
  - TPUs can be configured in series or in parallel to implement a wide range of tensor operations; this flexibility enables high performance implementation of new operations such as 3D convolution
  - Programmable interconnect provides a full speed, non-contention data path from SRAM through the TPUs to SRAM
- eFPGA programmable logic implements high speed state machines that control the TPUs and implement the control algorithms for the operators
- Each layer of a model is configured exactly as needed; reconfiguration for a new layer takes just microseconds
- DRAM traffic bringing in the weights and configuration for the next layer occurs in the background during compute of the current layer; this minimizes compute stalls
- Combining two layers in one configuration (layer fusion) minimizes DRAM traffic delays
- Minimal memory keeps cost down: LPDDR4x DRAM, 14MB total SRAM
- x4 PCIe Gen 3 or Gen 4 provides rapid communication with the host
- 54 mm² die size in 16nm process
- 21 x 21 mm flip-chip Ball Grid Array package
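
The MAC figures in the list above can be cross-checked with simple arithmetic. The following is a minimal sketch, not a model of the real chip: the article does not give a clock frequency, so `clock_hz` is a hypothetical parameter, and `toy_matvec` assumes a simple mapping (one input element per cycle, one accumulator per MAC, a 64 x 64 slice of weights) chosen only to show how 64 MACs over 64 cycles add up to 4096 MAC operations. It does not claim to reproduce the actual X1 dataflow.

```python
MACS_PER_TPU = 64        # INT8 MACs per 1D TPU (from the spec list)
TPUS_PER_X1 = 64         # 1D TPUs per X1 (from the spec list)
CYCLES_PER_OUTPUT = 64   # cycles per output tensor (from the spec list)
MAX_UTILIZATION = 0.70   # "up to 70%" MAC utilization (from the spec list)


def mac_ops_per_output_tensor() -> int:
    """MAC operations one 1D TPU performs per output tensor."""
    return MACS_PER_TPU * CYCLES_PER_OUTPUT


def total_int8_macs() -> int:
    """INT8 MAC units across all 1D TPUs on one X1."""
    return MACS_PER_TPU * TPUS_PER_X1


def effective_macs_per_second(clock_hz: float,
                              utilization: float = MAX_UTILIZATION) -> float:
    """Sustained MAC rate for a hypothetical clock frequency (not from the article)."""
    return total_int8_macs() * clock_hz * utilization


def toy_matvec(weights, x):
    """Toy matrix-vector product that counts MAC operations.

    Assumed mapping (illustrative only): each modelled cycle feeds one input
    element to every MAC unit, and each MAC unit accumulates one output.
    """
    rows = len(weights)
    acc = [0] * rows
    mac_count = 0
    for cycle, x_i in enumerate(x):      # one input element per modelled cycle
        for j in range(rows):            # the MAC units, simulated one by one
            acc[j] += weights[j][cycle] * x_i
            mac_count += 1
    return acc, mac_count


if __name__ == "__main__":
    w = [[(r + c) % 5 - 2 for c in range(64)] for r in range(64)]
    x = [c % 3 - 1 for c in range(64)]
    _, count = toy_matvec(w, x)
    print("MAC ops per output tensor:", mac_ops_per_output_tensor())
    print("MAC ops counted by toy model:", count)
    print("INT8 MACs per X1:", total_int8_macs())
```

Both the closed-form product and the toy loop give 4096 MAC operations per output tensor, and 64 TPUs with 64 INT8 MACs each give 4096 MAC units per X1. Any throughput number derived from `effective_macs_per_second` depends entirely on the clock value supplied by the reader.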

The InferX X1 will sample soon to selected customers, and production is expected in the second quarter of 2021. Pricing ranges from $34 to $199, depending on configuration and volume.

For more information, visit: https://flex-logix.com

Tiera Oliver, Associate Editor for Embedded Computing Design, is responsible for web content edits, product news, and constructing stories. She also assists with newsletter updates and contributes to and edits content for ECD podcasts and the ECD YouTube channel. Before working at ECD, Tiera graduated from Northern Arizona University, where she received her B.S. in journalism and political science and worked as a news reporter for the university’s student-led newspaper, The Lumberjack.
