SGAP: A Scatter-Gather Processing-in-Memory Architecture with Fixed-Point Reconfiguration for Energy-Efficient DNNs
Jihe Wang, Jiawang Cao, Feilong Qiaoyuan, Zhaoqing Wang, Jianfeng An, Danghui Wang

Leveraging the robustness of DNNs, fixed-point mixed-precision models have effectively reduced hardware costs in edge in-memory accelerators. However, existing multiplier designs are poorly suited to mixed-precision models, for two reasons. First, most of them use fixed bit-widths or offer only a few configurations, and so lack the fine-grained reconfigurability that mixed-precision models require. Second, layers that retain fewer bits can tolerate larger multiplication errors, because the inaccurate bits are discarded anyway, and this tolerance could be traded for lower hardware cost. Existing approximate multipliers cannot provide low-cost dynamic error control, so this optimization potential is wasted. To address these issues, this study proposes a fixed-point approximate multiplier with fine-grained reconfigurability. By breaking a full multiplication into small 2-bit/3-bit operations and selectively discarding the non-critical ones, the proposed multiplier maintains high hardware utilization across varying bit-widths while enabling low-cost, fine-grained error control. Building on this multiplier, a DDR5-oriented near-memory accelerator architecture is proposed, incorporating extended DRAM commands and semi-SIMD scheduling schemes. Compared to state-of-the-art PIM architectures, the proposed architecture reduces area by 40.8%, power consumption by 16.3%, and latency by 6.5%.
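The idea of splitting a multiplication into small sub-operations and dropping the least significant ones can be illustrated with a minimal Python sketch. This is not the hardware design described in the paper: it assumes unsigned operands, and the chunk widths, the `min_shift` discard rule, and the names `split_chunks` and `approx_multiply` are hypothetical choices made only for illustration.

```python
def split_chunks(value, widths):
    """Split an unsigned integer into (chunk, bit_offset) pairs, LSB first.

    `widths` lists the chunk sizes in bits, e.g. (3, 3, 2) for an 8-bit value.
    """
    chunks, offset = [], 0
    for w in widths:
        chunks.append(((value >> offset) & ((1 << w) - 1), offset))
        offset += w
    return chunks


def approx_multiply(a, b, widths_a, widths_b, min_shift=0):
    """Multiply a and b as a sum of small chunk-by-chunk partial products.

    Partial products whose combined bit offset is below `min_shift` are
    skipped (an assumed discard rule). With min_shift=0 the result is exact.
    Returns (product, number_of_partial_products_kept).
    """
    total, kept = 0, 0
    for ca, oa in split_chunks(a, widths_a):
        for cb, ob in split_chunks(b, widths_b):
            shift = oa + ob
            if shift < min_shift:
                continue  # low-significance term: discarded to save work
            total += (ca * cb) << shift
            kept += 1
    return total, kept


if __name__ == "__main__":
    a, b = 183, 109            # two 8-bit unsigned operands
    widths = (3, 3, 2)         # 3-bit and 2-bit chunks covering 8 bits
    exact, n_exact = approx_multiply(a, b, widths, widths)
    approx, n_approx = approx_multiply(a, b, widths, widths, min_shift=6)
    print("exact :", exact, "using", n_exact, "partial products")
    print("approx:", approx, "using", n_approx, "partial products")
    print("error :", exact - approx)
```

In this sketch, raising `min_shift` removes more partial products, so the error grows while the number of small multiplications shrinks. Because only non-negative terms are dropped, the approximate result never exceeds the exact product.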