Analyzing the Impact of Lossy Compressor Variability on Checkpointing Scientific Simulations

Triantafyllides, Pavlo; Reza, Tasmia; Calhoun, Jon C.

doi:10.1109/CLUSTER.2019.8891052

Lossy compression algorithms are effective tools to reduce the size of high-performance computing data sets. As established lossy compressors such as SZ and ZFP evolve, they seek to improve the compression/decompression bandwidth and the compression ratio. Algorithm improvements may alter the spatial distribution of errors in the compressed data even when using the same error bound and error bound type. If HPC applications are to compute on lossy compressed data, application users require an understanding of how the performance and the spatial distribution of error change. We explore how spatial distributions of error, compression/decompression bandwidth, and compression ratio change for HPC data sets from the applications PlasComCM and Nek5000 between various versions of SZ and ZFP. In addition, we explore how the spatial distribution of error impacts application correctness when restarting from lossy compressed checkpoints. We verify that known approaches to selecting error tolerances for lossy compressed checkpointing are robust to compressor selection and to changes in the distribution of error.

Award ID(s): 1910197

PAR ID: 10193340

Author(s) / Creator(s): Triantafyllides, Pavlo; Reza, Tasmia; Calhoun, Jon C.

Date Published: 2019-09-26

Journal Name: 2019 IEEE International Conference on Cluster Computing (CLUSTER)

Page Range / eLocation ID: 1 to 5

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1109/CLUSTER.2019.8891052
