Exploring Data Layout for Sparse Tensor Times Dense Matrix on GPUs

Ahmad, Khalid; Cecka, Cris; Garland, Michael; Hall, Mary

doi:10.1145/3633462

An important sparse tensor computation is sparse-tensor-dense-matrix multiplication (SpTM), which is used in tensor decomposition and its applications. SpTM is a multi-dimensional analog of sparse-matrix-dense-matrix multiplication (SpMM). In this article, we employ a hierarchical tensor data layout that can unfold a multidimensional tensor into a 2D matrix, making it possible to compute SpTM using SpMM kernel implementations for GPUs. We compare two SpMM-based implementations against the state-of-the-art PASTA sparse tensor contraction implementation: (1) SpMM with the hierarchical tensor data layout, and (2) unfolding followed by an invocation of cuSPARSE's SpMM. Results show that SpMM can outperform PASTA 70.9% of the time, but none of the three approaches is best overall. Therefore, we use a decision tree classifier to identify the best-performing sparse tensor contraction kernel based on precomputed properties of the sparse tensor.
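The unfolding idea described in the abstract can be illustrated with a minimal CPU sketch. It assumes a 3-D tensor stored as coordinate (COO) triples and a contraction along the last mode; the function names (`unfold_mode3`, `spmm`, `sptm_mode3`) are hypothetical and are not taken from the article, PASTA, or cuSPARSE. It does not reproduce the hierarchical layout or any GPU kernel; it only shows why an unfolded tensor lets SpTM reuse an SpMM routine.

```python
from collections import defaultdict


def unfold_mode3(coords, vals, shape):
    """Unfold a COO tensor X[i, j, k] into sparse matrix rows.

    Modes i and j are fused into one row index (i * J + j); mode k
    becomes the column index. Returns {row: [(col, value), ...]}.
    """
    _, J, _ = shape
    rows = defaultdict(list)
    for (i, j, k), v in zip(coords, vals):
        rows[i * J + j].append((k, v))
    return rows


def spmm(rows, dense, n_rows):
    """Multiply a row-wise sparse matrix by a dense matrix (list of lists)."""
    n_cols = len(dense[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r, entries in rows.items():
        for k, v in entries:
            for c in range(n_cols):
                out[r][c] += v * dense[k][c]
    return out


def sptm_mode3(coords, vals, shape, dense):
    """Compute Y[i, j, r] = sum_k X[i, j, k] * U[k, r] via unfold + SpMM.

    The result is returned in unfolded form: row i * J + j, column r.
    """
    I, J, _ = shape
    return spmm(unfold_mode3(coords, vals, shape), dense, I * J)


# Illustrative input: a 2 x 2 x 3 tensor with three nonzeros and a 3 x 2 matrix.
coords = [(0, 0, 0), (0, 0, 2), (1, 1, 1)]
vals = [1.0, 2.0, 3.0]
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = sptm_mode3(coords, vals, (2, 2, 3), U)
```

For simplicity the sketch materializes every output row, including rows for empty fibers; a practical implementation would typically store only the fibers that contain nonzeros.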

Award ID(s): 2107556

PAR ID: 10515726

Author(s) / Creator(s): Ahmad, Khalid; Cecka, Cris; Garland, Michael; Hall, Mary

Publisher / Repository: ACM Transactions on Architecture and Code Optimization

Date Published: 2024-03-31

Journal Name: ACM Transactions on Architecture and Code Optimization

Volume: 21

Issue: 1

ISSN: 1544-3566

Page Range / eLocation ID: 1 to 20

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1145/3633462
