Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication

Ranawaka, Isuru; Hussain, Md Taufique; Block, Charles; Gerogiannis, Gerasimos; Torrellas, Josep; Azad, Ariful

doi:10.1109/SC41406.2024.00052

Citation Details


We consider a sparse matrix-matrix multiplication (SpGEMM) setting where one matrix is square and the other is tall and skinny. This special variant, TS-SpGEMM, has important applications in multi-source breadth-first search, influence maximization, sparse graph embedding, and algebraic multigrid solvers. Unfortunately, popular distributed algorithms like sparse SUMMA deliver suboptimal performance for TS-SpGEMM. To address this limitation, we develop a novel distributed-memory algorithm tailored for TS-SpGEMM. Our approach employs customized 1D partitioning for all matrices involved and leverages sparsity-aware tiling for efficient data transfers. In addition, it minimizes communication overhead by incorporating both local and remote computations. On average, our TS-SpGEMM algorithm attains 5x performance gains over 2D and 3D SUMMA. Furthermore, we use our algorithm to implement multi-source breadth-first search and sparse graph embedding algorithms and demonstrate their scalability up to 512 nodes (or 65,536 cores) on NERSC Perlmutter.
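To make the 1D-partitioned setting in the abstract concrete, the following is a minimal single-process sketch in Python. It is not the authors' algorithm or code: it only simulates a contiguous block-row partition of A, B, and C, and records which rows of B each simulated process would need from other processes, given the nonzero columns of its local block of A. The function names (`row_owner`, `ts_spgemm_1d`), the dict-of-dicts storage, and the contiguous block partition are all assumptions made for illustration; tiling and the choice between local and remote computation described in the abstract are not modeled.

```python
def row_owner(row, n_rows, n_procs):
    """Owner of `row` under an assumed contiguous 1D block-row partition."""
    block = -(-n_rows // n_procs)  # ceiling division
    return row // block


def ts_spgemm_1d(A, B, n_rows, n_procs):
    """Simulate C = A @ B with A, B, and C all partitioned by rows.

    A: {i: {k: value}} for a sparse n_rows x n_rows matrix.
    B: {k: {j: value}} for a sparse n_rows x d tall-and-skinny matrix.
    Returns (C, fetched), where fetched[p] is the set of B rows that
    simulated process p would have to obtain from other processes.
    """
    C = {}
    fetched = {}
    for p in range(n_procs):
        local_rows = [i for i in A if row_owner(i, n_rows, n_procs) == p]
        # Only B rows that match a nonzero column of the local A block,
        # and that are themselves nonempty, are ever needed.
        needed = {k for i in local_rows for k in A[i] if k in B}
        fetched[p] = {k for k in needed if row_owner(k, n_rows, n_procs) != p}
        # Row-by-row (Gustavson-style) accumulation of the local C rows.
        for i in local_rows:
            acc = {}
            for k, a_ik in A[i].items():
                for j, b_kj in B.get(k, {}).items():
                    acc[j] = acc.get(j, 0.0) + a_ik * b_kj
            if acc:
                C[i] = acc
    return C, fetched


if __name__ == "__main__":
    # Hypothetical 4 x 4 matrix A and 4 x 2 matrix B, split over 2 processes.
    A = {0: {0: 1.0, 3: 2.0}, 1: {1: 3.0}, 2: {0: 4.0}, 3: {3: 5.0}}
    B = {0: {0: 1.0}, 1: {1: 1.0}, 3: {0: 2.0, 1: 1.0}}
    C, fetched = ts_spgemm_1d(A, B, n_rows=4, n_procs=2)
    print(C)
    print(fetched)
```

In this toy input, each simulated process needs only one remote row of B rather than the whole remote block, which is the kind of saving a sparsity-aware transfer scheme aims for. A real distributed implementation would replace the loop over `p` with actual message passing.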

Award ID(s): 2534902

PAR ID: 10614649

Author(s) / Creator(s): Ranawaka, Isuru; Hussain, Md Taufique; Block, Charles; Gerogiannis, Gerasimos; Torrellas, Josep; Azad, Ariful

Publisher / Repository: IEEE

Date Published: 2024-11-17

ISBN: 979-8-3503-5291-7

Page Range / eLocation ID: 1 to 17

Format(s): Medium: X

Location: Atlanta, GA, USA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
https://doi.org/10.1109/SC41406.2024.00052
