Abstract
System-on-Chip (SoC) architectures for edge devices and low-end high-performance computing (HPC) integrate multiple heterogeneous cores to handle a wide range of control and data-processing tasks and to meet the growing computational demands of modern Deep Learning (DL) workloads efficiently. The primary design challenges include compute bottlenecks caused by the high complexity of deep neural networks (DNNs), memory-bandwidth limitations that dominate power consumption and reduce throughput, and scalability issues, since integrating additional compute cores or dedicated AI accelerators increases silicon area and energy costs. Digital In-Memory Computing (D-IMC) is a promising paradigm that mitigates these limitations by enabling computation directly within memory macros, significantly reducing data movement and improving energy efficiency. Among IMC technologies, SRAM-based D-IMC offers key advantages over non-volatile memory (NVM) implementations, including faster weight loading, reconfigurability, and compatibility with standard CMOS fabrication. However, integrating D-IMC into a scalable, modular, and workload-flexible SoC architecture for both edge AI and low-end HPC remains an open research challenge.