3D Integrated DNN Accelerators: Recent Trends and Future Prospects
Abrar Abdurrob, Aristotelis Tsekouras, Evangelos Tzouvaras, Vasilis F. Pavlidis, Emre Salman

The rapid growth of Deep Neural Networks (DNNs) has led to the development of application-specific DNN accelerators. Conventional 2D von Neumann architectures suffer from limited bandwidth between the memory and the processing core. 3D DNN accelerators have emerged as a promising solution by leveraging 3D integration to enable near-memory logic or in-memory computation. By shifting computation closer to memory, these accelerators significantly reduce data movement and therefore latency, resulting in more energy-efficient operation. Monolithic 3D (M3D) integration, in particular, enables high-bandwidth systems by utilizing high-density monolithic inter-tier vias (MIVs). This paper provides a critical review of recent advances in 3D DNN accelerators that combine near-memory and compute-in-memory approaches with various 3D technologies. It discusses the available technologies and architectures that have advanced the performance of DNN accelerators, along with their future prospects. Particular attention is devoted to accelerators for emerging transformer-based large language models (LLMs) because of their higher memory demands. Thermal-aware design techniques for 3D DNN accelerators are also discussed as a means to address the fundamental challenge of heat dissipation. Finally, package-level constraints are reviewed in detail, considering signal integrity, power delivery, and thermo-mechanical reliability.