Ceva introduces neural processing unit optimized for on-device AI

Ceva has expanded its NeuPro family of neural processing units (NPUs) with NeuPro Nano, a design tailored for TinyML (tiny machine learning) workloads such as object detection, face detection, and other complex on-device AI use cases. The NPUs are designed to balance power consumption and performance within a compact footprint.
Ceva cites an ABI Research report forecasting that more than 40 percent of TinyML shipments will be supported by dedicated hardware by 2030. In response to this trend, Ceva is developing specialized hardware that integrates edge AI capabilities.
The NeuPro Nano offers a performance range of 10 to 200 GOPS (Giga Operations Per Second) per core.
“Ceva-NeuPro-Nano opens exciting opportunities for companies to integrate TinyML applications into low-power IoT SoCs and MCUs and builds on our strategy to empower smart edge devices with advanced connectivity, sensing, and inference capabilities,” says Chad Lucien, vice president and general manager of the Sensors and Audio Business Unit at Ceva.
The Ceva embedded AI NPU architecture is programmable, allowing it to run a variety of workloads, including neural networks, feature extraction, control code, and DSP (digital signal processing) code. The architecture also supports many machine learning data types and operators, including native transformer computation, sparsity acceleration, and fast quantization.
Ceva has integrated its NetSqueeze AI compression, which processes compressed model weights directly, eliminating the need for an intermediate decompression stage. The company asserts that this reduces memory footprint by up to 80 percent, an important capability for TinyML applications.
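To put the 80 percent figure in context, the short Python sketch below computes the weight storage a model would need before and after such a reduction. The model size (250,000 one-byte weights) and the function name are illustrative assumptions, not Ceva figures, and 80 percent is the company's stated upper bound rather than a guaranteed result.

```python
def compressed_footprint_bytes(original_bytes: int, reduction_percent: int) -> int:
    """Return the storage left after shrinking by reduction_percent (0 to 100)."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    # Integer arithmetic keeps the result exact for whole-byte sizes.
    return original_bytes * (100 - reduction_percent) // 100


# Hypothetical model: 250,000 int8 weights, one byte each (an assumption).
original = 250_000
best_case = compressed_footprint_bytes(original, 80)
print(f"{original // 1000} kB -> {best_case // 1000} kB")
```

On a microcontroller with a few hundred kilobytes of memory, a saving of this order could decide whether a model fits on the device at all.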
Ceva ships the NeuPro Nano NPU in two configurations: NPN32 and NPN64. The NPN32 features 32 int8 multiply-accumulate units (MACs) designed for voice, audio, object detection, and anomaly detection workloads. In contrast, the NPN64 is equipped with 64 int8 MACs, making it suitable for more complex on-device applications such as speech recognition, health monitoring, and potentially biometrics.
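As a rough guide to how MAC count relates to throughput, peak operations per second are conventionally estimated as the number of MAC units, times two operations per MAC (one multiply and one add), times the clock frequency. The Python sketch below applies that convention to the two configurations. The 500 MHz clock is a hypothetical value chosen for illustration. The article gives no clock speeds, and Ceva may count its published GOPS figures differently.

```python
def peak_gops(mac_units: int, clock_mhz: float) -> float:
    """Estimate peak throughput in GOPS, counting each MAC as 2 ops per cycle."""
    ops_per_cycle = 2 * mac_units
    # ops/cycle * (1e6 cycles/s per MHz) / (1e9 ops per GOPS) = ops/cycle * MHz / 1000
    return ops_per_cycle * clock_mhz / 1000.0


ASSUMED_CLOCK_MHZ = 500.0  # hypothetical clock, not a Ceva specification

for name, macs in (("NPN32", 32), ("NPN64", 64)):
    gops = peak_gops(macs, ASSUMED_CLOCK_MHZ)
    print(f"{name}: about {gops:.0f} GOPS peak at {ASSUMED_CLOCK_MHZ:.0f} MHz")
```

Under this convention, doubling the MAC count doubles peak throughput at a fixed clock, which is consistent with the NPN64 being positioned for heavier workloads.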
NeuPro Studio, a comprehensive AI software development kit (SDK) that supports open AI frameworks like TensorFlow Lite for Microcontrollers (TFLM) and microTVM (µTVM), further optimizes NeuPro NPUs’ efficiency.
In 2021, Ceva launched upgraded digital signal processors that support face and voice biometrics, among a wide range of other uses.