This content will become publicly available on March 1, 2026

Title: PIMnet: A Domain-Specific Network for Efficient Collective Communication in Scalable PIM
Processing-in-memory (PIM), where compute is moved closer to memory or data, has been explored to accelerate emerging workloads. Different PIM-based systems have been announced, each offering a unique microarchitectural organization of its compute units, ranging from fixed functional units to programmable general-purpose compute cores near memory. However, one fundamental limitation of PIM is that each compute unit can only access its local memory; access to "remote" memory must occur through the host CPU, potentially limiting application performance scalability. In this work, we first characterize the scalability of real PIM architectures using the UPMEM PIM system. We analyze how the overhead of communicating through the host (instead of providing direct communication between the PIM compute units) can become a bottleneck for the collective communications that are common in many workloads. To overcome this inter-PIM-bank communication bottleneck, we propose PIMnet, a PIM interconnection network for PIM banks that provides direct connectivity between compute units and removes the overhead of communicating through the host. PIMnet exploits bandwidth parallelism, allowing communication across the different PIM banks/chips to occur in parallel to maximize communication performance. PIMnet also matches the DRAM packaging hierarchy with a multi-tier network architecture. Unlike traditional interconnection networks, PIMnet is a PIM-controlled network where communication is managed by the PIM logic, optimizing collective communications and minimizing the hardware overhead of PIMnet. Our evaluation of PIMnet shows that it provides up to 85× speedup on collective communications and achieves an 11.8× improvement on real applications compared to the baseline PIM.
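To make the bandwidth argument concrete, the sketch below contrasts a host-relayed AllReduce, where all traffic serializes on the shared host link, with a direct inter-bank ring AllReduce that uses every per-bank link in parallel. This is a back-of-envelope cost model with assumed bandwidth figures, not the paper's methodology; the function names and constants are illustrative.

```python
# Hypothetical cost model: host-relayed AllReduce vs. a direct inter-bank
# ring AllReduce. Bandwidth figures and function names are illustrative
# assumptions, not measurements from the paper.

def host_relayed_allreduce_time(n_banks, bytes_per_bank, host_bw):
    # All traffic funnels through the host CPU, so it serializes on the
    # single shared link: gather + broadcast each move n_banks * size bytes.
    return 2 * n_banks * bytes_per_bank / host_bw

def direct_ring_allreduce_time(n_banks, bytes_per_bank, link_bw):
    # With per-bank links, a ring AllReduce moves 2*(n-1)/n * size bytes
    # per bank, and all links transfer in parallel.
    return 2 * (n_banks - 1) / n_banks * bytes_per_bank / link_bw

n, size = 2048, 1 << 20      # 2048 banks, 1 MiB of partial results each
host_bw = 25e9               # assumed ~25 GB/s shared host link
link_bw = 1e9                # assumed ~1 GB/s per-bank link
t_host = host_relayed_allreduce_time(n, size, host_bw)
t_ring = direct_ring_allreduce_time(n, size, link_bw)
print(f"host-relayed: {t_host*1e3:.0f} ms  direct ring: {t_ring*1e3:.1f} ms  "
      f"ratio: {t_host/t_ring:.0f}x")
```

With these assumed numbers the direct network comes out roughly two orders of magnitude faster, which is consistent in spirit with the collective-communication speedups the abstract reports.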
Award ID(s):
2312275 2312276
PAR ID:
10612701
Publisher / Repository:
IEEE
Date Published:
Journal Name:
Proceedings
ISSN:
2378-203X
ISBN:
979-8-3315-0647-6
Page Range / eLocation ID:
1557 to 1572
Format(s):
Medium: X
Location:
Las Vegas, NV, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Processing-in-memory (PIM), where compute is moved closer to the memory or the data, has been widely explored to accelerate emerging workloads. Recently, different PIM-based systems have been announced by memory vendors to minimize data movement and improve performance as well as energy efficiency. One critical component of PIM is the large amount of compute parallelism provided across many "PIM nodes," the compute units near the memory. In this work, we provide an extensive evaluation and analysis of real PIM systems based on UPMEM PIM. We show that while PIM has benefits, there are also scalability challenges and limitations as the number of PIM nodes increases. In particular, we show how collective communications that are commonly found in many kernels/workloads can be problematic for PIM systems. To evaluate the impact of collective communication in PIM architectures, we provide an in-depth analysis of two workloads on the UPMEM PIM system that utilize representative collective communication patterns: AllReduce and All-to-All. Specifically, we evaluate 1) embedding tables, commonly used in recommendation systems, which require AllReduce, and 2) the Number Theoretic Transform (NTT) kernel, a critical component of Fully Homomorphic Encryption (FHE), which requires All-to-All communication. We analyze the performance benefits of these workloads and show how they can be efficiently mapped to the PIM architecture through alternative data partitioning. However, since each PIM compute unit can only access its local memory, any communication between PIM nodes (or access to remote data) must be done through the host CPU, severely hampering application performance. To increase the scalability (or applicability) of PIM for future workloads, we make the case that future PIM architectures need efficient communication or interconnection networks between the PIM nodes, requiring both hardware and software support.
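The round trip described above is easy to picture in code. The following toy simulation (plain NumPy, not the UPMEM SDK) shows why host-relayed AllReduce scales poorly: every partial result crosses the shared host bus twice, and the host does all of the reduction work while the PIM nodes sit idle.

```python
import numpy as np

# Toy simulation of AllReduce on a PIM system where compute units can only
# touch their local banks: the host gathers, reduces, and broadcasts.
# Buffer names are hypothetical; this is not the UPMEM SDK API.

def host_relayed_allreduce(local_buffers):
    gathered = np.stack(local_buffers)            # phase 1: host gathers all
    total = gathered.sum(axis=0)                  # phase 2: host reduces alone
    return [total.copy() for _ in local_buffers]  # phase 3: host broadcasts

nodes = [np.random.rand(4) for _ in range(8)]     # 8 PIM nodes, tiny vectors
reduced = host_relayed_allreduce(nodes)
assert all(np.allclose(r, np.sum(nodes, axis=0)) for r in reduced)
```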
  2. Today’s Deep Neural Network (DNN) inference systems contain hundreds of billions of parameters, resulting in significant latency and energy overheads during inference due to frequent data transfers between compute and memory units. Processing-in-Memory (PiM) has emerged as a viable solution to tackle this problem by avoiding the expensive data movement. PiM approaches based on electrical devices suffer from throughput and energy efficiency issues. In contrast, Optically-addressed Phase Change Memory (OPCM) operates with light and achieves much higher throughput and energy efficiency compared to its electrical counterparts. This paper introduces a system-level design that takes the OPCM programming overhead into consideration, and identifies that the programming cost dominates the DNN inference on OPCM-based PiM architectures. We explore the design space of this system and identify the most energy-efficient OPCM array size and batch size. We propose a novel thresholding and reordering technique on the weight blocks to further reduce the programming overhead. Combining these optimizations, our approach achieves up to 65.2x higher throughput than existing photonic accelerators for practical DNN workloads. 
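As a rough illustration of the thresholding-and-reordering idea, the sketch below greedily orders weight blocks so that consecutive blocks differ little, and skips reprogramming cells whose change falls below a threshold. The greedy heuristic, threshold rule, and synthetic blocks are our assumptions for illustration; the paper's actual technique operates on OPCM-specific block layouts.

```python
import random
import numpy as np

# Sketch of thresholding + reordering to cut OPCM programming cost:
# order blocks so neighbors are similar, rewrite only cells that change
# by more than `threshold`. Illustrative, not the paper's exact algorithm.

def reorder_blocks(blocks):
    # Greedy nearest-neighbor ordering by L1 distance between blocks.
    remaining = list(range(len(blocks)))
    order = [remaining.pop(0)]
    while remaining:
        cur = blocks[order[-1]]
        nxt = min(remaining, key=lambda i: np.abs(blocks[i] - cur).sum())
        remaining.remove(nxt)
        order.append(nxt)
    return order

def rewrites_needed(blocks, order, threshold=0.05):
    # Count cells that actually have to be reprogrammed.
    state, total = np.zeros_like(blocks[0]), 0
    for i in order:
        changed = np.abs(blocks[i] - state) > threshold
        total += int(changed.sum())
        state = np.where(changed, blocks[i], state)  # skipped cells keep old values
    return total

rng = np.random.default_rng(0)
bases = [rng.random((16, 16)) for _ in range(4)]
blocks = [b + 0.02 * rng.standard_normal((16, 16)) for b in bases for _ in range(8)]
random.seed(0)
random.shuffle(blocks)
print("naive :", rewrites_needed(blocks, list(range(len(blocks)))))
print("greedy:", rewrites_needed(blocks, reorder_blocks(blocks)))
```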
  3. The performance of today's in-memory indexes is bottlenecked by the memory latency/bandwidth wall. Processing-in-memory (PIM) is an emerging approach that potentially mitigates this bottleneck by enabling low-latency memory access whose aggregate memory bandwidth scales with the number of PIM nodes. There is an inherent tension, however, between minimizing inter-node communication and achieving load balance in PIM systems in the presence of workload skew. This paper presents PIM-tree, an ordered index for PIM systems that achieves both low communication and high load balance, regardless of the degree of skew in data and queries. Our skew-resistant index is based on a novel division of labor between the host CPU and PIM nodes, which leverages the strengths of each. We introduce push-pull search, which dynamically decides whether to push queries to a PIM-tree node or pull the node's keys back to the CPU based on workload skew. Combined with other PIM-friendly optimizations (shadow subtrees and chunked skip lists), our PIM-tree provides high throughput, (guaranteed) low communication, and (guaranteed) high load balance for batches of point queries, updates, and range scans. We implement PIM-tree, in addition to prior proposed PIM indexes, on the latest PIM system from UPMEM, with 32 CPU cores and 2048 PIM nodes. On workloads with 500 million keys and batches of 1 million queries, the throughput using PIM-tree is up to 69.7x and 59.1x higher than the two best prior PIM-based methods. As far as we know, these are the first implementations of an ordered index on a real PIM system.
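The push-pull decision itself can be captured in a few lines. The sketch below is a simplified version of the idea, assuming communication cost is proportional to the number of items shipped; the paper's actual policy, guarantees, and data structures are more involved.

```python
# Simplified push-pull routing: for each PIM-tree node in a batch, compare
# the cost of pushing the node's queries down against pulling its keys up.
# Costs proportional to item counts are an assumption of this sketch.

def route_batch(node_key_count: int, num_queries: int,
                pull_factor: float = 1.0) -> str:
    """Return 'pull' when shipping the node's keys to the CPU is cheaper
    than shipping a (possibly skewed) pile of queries to the node."""
    push_cost = num_queries                    # queries sent to the PIM node
    pull_cost = pull_factor * node_key_count   # keys shipped back to the CPU
    return "pull" if pull_cost < push_cost else "push"

# A hot node holding 64 keys but hit by 10,000 queries is cheaper to pull;
# a cold node seeing only 3 queries is cheaper to push to.
print(route_batch(64, 10_000))  # -> pull
print(route_batch(64, 3))       # -> push
```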
  4. Memory latency and bandwidth are significant bottlenecks in designing in-memory indexes. Processing-in-memory (PIM), an emerging hardware design approach, alleviates this problem by embedding processors in memory modules, enabling low-latency memory access whose aggregated bandwidth scales linearly with the number of PIM modules. Despite recent work on balanced comparison-based indexes for PIM systems, building efficient tries for PIMs remains an open challenge due to tries' inherently unbalanced shape. This paper presents PIM-trie, the first batch-parallel radix-based index for PIM systems that provides load balance and low communication under adversary-controlled workloads. We introduce trie matching, which matches a query trie built from a batch against the compressed data trie, as a key building block for PIM-friendly index operations. Our algorithm combines (i) hash-based comparisons for coarse-grained work distribution/elimination and (ii) bit-by-bit comparisons for fine-grained matching. Combined with other techniques (meta-block decomposition, selective recursive replication, differentiated verification), PIM-trie supports LongestCommonPrefix, Insert, and Delete in O(log P) communication rounds per batch and O(l/w) communication volume per string, where P is the number of PIM modules, l is the string length in bits, and w is the machine word size. Moreover, work and communication are load-balanced among modules with high probability, even under worst-case skew.
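A minimal sketch of the two-level matching idea, coarse chunk-level filtering followed by bit-by-bit comparison inside the first differing chunk, is shown below. Encoding bit strings as Python ints and the chunking scheme are assumptions made for the illustration, not PIM-trie's actual layout.

```python
# Two-level matching in the spirit of PIM-trie: a coarse hash comparison on
# word-sized chunks (to skip equal prefixes cheaply), then bit-by-bit work
# only inside the first differing chunk.

W = 64  # machine word size in bits (the paper's `w`)

def lcp_bits(a: int, b: int, length: int) -> int:
    """Longest common prefix, in bits, of two `length`-bit strings."""
    matched = 0
    for off in range(0, length, W):
        hi = min(off + W, length)
        mask = (1 << (hi - off)) - 1
        ca = (a >> (length - hi)) & mask   # chunk [off, hi) of a, MSB-first
        cb = (b >> (length - hi)) & mask
        if hash(ca) == hash(cb) and ca == cb:   # cheap filter, then confirm
            matched += hi - off                 # whole chunk agrees, move on
            continue
        diff = ca ^ cb                              # first differing chunk:
        matched += (hi - off) - diff.bit_length()   # count leading equal bits
        break
    return matched

print(lcp_bits(0b1011_0110_1011_0001, 0b1011_0110_1111_0001, 16))  # -> 9
```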
  5. With the rapid advancement of DNNs, numerous Processing-in-Memory (PIM) architectures based on various memory technologies (non-volatile (NVM) and volatile memory) have been developed to accelerate AI workloads. Magnetic Random Access Memory (MRAM) is highly promising among NVMs due to its zero standby leakage, fast write/read speeds, CMOS compatibility, and high memory density. However, existing MRAM technologies, such as spin-transfer torque MRAM (STT-MRAM) and spin-orbit torque MRAM (SOT-MRAM), have inherent limitations. STT-MRAM faces high write current requirements, while SOT-MRAM introduces significant area overhead due to additional access transistors. The new STT-assisted-SOT (SAS) MRAM provides an area-efficient alternative by sharing one write access transistor across multiple magnetic tunnel junctions (MTJs). This work presents the first fully digital processing-in-SAS-MRAM system to enable 8-bit floating-point (FP8) neural network inference, with an application in an on-device session-based recommender system. A SAS-MRAM device prototype is fabricated with 4 MTJs sharing the same SOT metal line. The proposed SAS-MRAM-based PIM macro is designed in TSMC 28nm technology. It achieves 15.31 TOPS/W energy efficiency and 269 GOPS performance for FP8 operations at 700 MHz. Compared to state-of-the-art recommender systems on the popular YooChoose dataset, it demonstrates 86×, 1.8×, and 1.12× higher energy efficiency than GPU, SRAM-PIM, and ReRAM-PIM implementations, respectively.
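A quick sanity check ties the two headline numbers together: at the reported throughput and energy efficiency, the implied power draw of the macro is on the order of tens of milliwatts. The derivation below uses only figures quoted in the abstract; the implied power is our arithmetic, not a number the paper states.

```python
# Figures quoted in the abstract; the implied power is our derivation.
throughput_ops_per_s = 269e9      # 269 GOPS of FP8 operations at 700 MHz
efficiency_ops_per_j = 15.31e12   # 15.31 TOPS/W, i.e., operations per joule

implied_power_w = throughput_ops_per_s / efficiency_ops_per_j
print(f"implied macro power: {implied_power_w * 1e3:.1f} mW")  # ~17.6 mW
```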