EdgeCortix Launches SAKURA-II Platform to Support Generative AI at the Edge
June 04, 2024
Tokyo, Japan — EdgeCortix® Inc. unveiled its next-generation SAKURA-II Edge AI accelerator. Paired with the company’s second-generation Dynamic Neural Accelerator (DNA) architecture, SAKURA-II enables users to run Large Language Models (LLMs), Large Vision Models (LVMs), and multimodal transformer-based applications at the edge.
Well-suited for use cases across the manufacturing, Industry 4.0, security, robotics, aerospace, and telecommunications industries, SAKURA-II features EdgeCortix’s latest-generation runtime-reconfigurable neural processing engine, DNA-II. Leveraging this highly configurable intellectual property block, SAKURA-II delivers power efficiency and real-time processing while executing multiple deep neural network models with low latency.
According to the company, SAKURA-II can provide up to 60 trillion operations per second (TOPS) of effective 8-bit integer performance and 30 TFLOPS of 16-bit brain floating-point (BF16) performance, while also supporting built-in mixed precision to handle the demands of next-generation AI tasks.
The SAKURA-II platform is paired with the MERA software suite, which provides a heterogeneous compiler platform along with advanced quantization and model-calibration capabilities. The suite includes native support for leading development frameworks such as PyTorch, TensorFlow Lite, and ONNX. MERA’s flexible host-to-accelerator unified runtime scales across single-chip, multi-chip, and multi-card systems at the edge, streamlining AI inference and shortening deployment times. Furthermore, integration with the MERA Model Library, which interfaces with Hugging Face Optimum, gives users access to a range of the latest transformer models, ensuring a smooth transition from training to edge inference.
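To illustrate the kind of post-training quantization and calibration step a toolchain like MERA performs when lowering a float model to 8-bit integer execution, here is a minimal, purely illustrative sketch in NumPy. It is not EdgeCortix’s implementation; the function names and the simple max-abs calibration strategy are assumptions chosen for clarity.

```python
import numpy as np

# Illustrative post-training quantization (NOT the MERA implementation):
# calibrate a symmetric int8 scale from sample activations, then
# quantize/dequantize and check the reconstruction error.

def calibrate_scale(calibration_data: np.ndarray) -> float:
    """Pick a symmetric int8 scale from the max absolute value observed."""
    return float(np.max(np.abs(calibration_data))) / 127.0

def quantize_dequantize(x: np.ndarray, scale: float) -> np.ndarray:
    """Round to the int8 grid, clip to [-128, 127], and map back to float."""
    q = np.clip(np.round(x / scale), -128, 127)
    return q * scale

rng = np.random.default_rng(0)
activations = rng.normal(0.0, 1.0, size=10_000).astype(np.float32)

scale = calibrate_scale(activations)
recovered = quantize_dequantize(activations, scale)
max_error = np.max(np.abs(activations - recovered))

# With max-abs calibration, the error is bounded by half a quantization step.
assert max_error <= scale / 2 + 1e-6
print(f"scale={scale:.4f}, max abs error={max_error:.4f}")
```

Real toolchains typically use richer calibration (percentile clipping, per-channel scales, KL-divergence minimization) and may fall back to mixed precision for layers that are sensitive to quantization error, which is consistent with the mixed-precision support described above.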
Key Benefits of SAKURA-II include:
- Optimized for Generative AI: Tailored specifically for processing Generative AI workloads at the edge with minimal power consumption.
- Complex Model Handling: Capable of managing multi-billion-parameter models like Llama 2, Stable Diffusion, DETR, and ViT within a typical power envelope of 8W.
- Seamless Software Integration: Fully compatible with EdgeCortix’s MERA software suite, facilitating seamless transitions from model training to deployment.
- Enhanced Memory Bandwidth: Offers up to four times the DRAM bandwidth of competing AI accelerators, ensuring superior performance for LLM and LVM workloads.
- Real-Time Data Streaming: Optimized for low-latency operations under real-time data streaming conditions.
- Advanced Precision: Provides software-enabled mixed-precision support for near FP32 accuracy.
- Sparse Computation: Supports sparse computation to reduce memory footprint and optimize bandwidth.
- Versatile Functionality: Supports arbitrary activation functions with hardware approximation for enhanced adaptability.
- Efficient Data Handling: Includes a dedicated Reshaper engine to manage complex data permutations on-chip and minimize host CPU load.
- Power Management: Features on-chip power-gating and power management capabilities to facilitate ultra-high efficiency modes.
SAKURA-II will be offered as a stand-alone device, as two M.2 modules with differing DRAM capacities, and as single- and dual-device low-profile PCIe cards. Customers can reserve M.2 modules and PCIe cards today for delivery in the second half of 2024.
Reserve SAKURA-II accelerators today by registering here.
For more information about EdgeCortix and SAKURA-II, visit https://www.edgecortix.com/en/