Learning from Optimizing Matrix-Matrix Multiplication

Parikh, Devangi N.; Huang, Jianyu; Myers, Margaret E.; van de Geijn, Robert A.


We describe a learning process that uses one of the simplest examples, matrix-matrix multiplication, to illustrate issues that underlie parallel high-performance computing. It is accessible at multiple levels: simple enough to use early in a curriculum yet rich enough to benefit a more advanced software developer. A carefully designed and scaffolded set of exercises leads the learner from a naive implementation towards one that extracts parallelism at multiple levels, ranging from instruction-level parallelism to multithreaded parallelism via OpenMP to distributed-memory parallelism using MPI. The importance of effectively leveraging the memory hierarchy within and across nodes is exposed, as are the GotoBLAS and SUMMA algorithms. These materials will become part of a Massive Open Online Course (MOOC) to be offered in the future.
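The starting point and one early step of such a progression can be sketched as follows. This is a minimal illustration, not the authors' exercise code: the function names `gemm_naive` and `gemm_omp`, the row-major storage, and the choice of loop order are all assumptions made here for the example. The first routine is the naive triple loop; the second reorders the loops so the innermost loop walks contiguous memory and distributes rows of C across threads with an OpenMP `parallel for`.

```c
#include <assert.h>
#include <stddef.h>

/* Naive C += A * B.
 * Assumed layout (hypothetical, for illustration): row-major,
 * A is m x k, B is k x n, C is m x n. */
void gemm_naive(size_t m, size_t n, size_t k,
                const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t p = 0; p < k; p++)
                /* B is read with stride n here, which is unfriendly to caches. */
                C[i * n + j] += A[i * k + p] * B[p * n + j];
}

/* Same computation with the loops ordered i, p, j so that the innermost
 * loop reads a row of B and updates a row of C contiguously. Each thread
 * owns whole rows of C, so no two threads write the same element.
 * Without OpenMP enabled at compile time the pragma is ignored and the
 * routine runs sequentially. */
void gemm_omp(size_t m, size_t n, size_t k,
              const double *A, const double *B, double *C)
{
    assert(A && B && C);
    #pragma omp parallel for
    for (size_t i = 0; i < m; i++)
        for (size_t p = 0; p < k; p++) {
            const double a_ip = A[i * k + p];
            for (size_t j = 0; j < n; j++)
                C[i * n + j] += a_ip * B[p * n + j];
        }
}
```

Both routines accumulate into C, so the caller is expected to zero C first if a plain product is wanted. With GCC or Clang, the threaded version is typically enabled with `-fopenmp`. Blocking for the cache hierarchy in the style of GotoBLAS and distributing the work with MPI in the style of SUMMA are further steps that this sketch does not attempt.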

Award ID(s): 1714091

PAR ID: 10073894

Author(s) / Creator(s): Parikh, Devangi N.; Huang, Jianyu; Myers, Margaret E.; van de Geijn, Robert A.

Date Published: 2018-05-21

Journal Name: NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar-18)

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper:
The DOI is not currently available.
