High-performance Effective Scientific Error-bounded Lossy Compression with Auto-tuned Multi-component Interpolation

Liu, Jinyang; Di, Sheng; Zhao, Kai; Liang, Xin; Jin, Sian; Jian, Zizhe; Huang, Jiajun; Wu, Shixun; Chen, Zizhong; Cappello, Franck

doi:10.1145/3639259

Error-bounded lossy compression has been identified as a promising way to significantly reduce scientific data volumes while respecting users' requirements on data distortion. Among existing scientific error-bounded lossy compressors, some (such as SPERR and FAZ) reach fairly high compression ratios and others (such as SZx, SZ, and ZFP) offer high compression speeds, but they rarely deliver both a high ratio and a high speed at the same time. In this paper, we propose HPEZ, which uses newly designed interpolations and quality-metric-driven auto-tuning to achieve significantly better compression quality than existing high-performance compressors while being much faster than high-ratio compressors. The key contributions are as follows: (1) We develop a series of advanced techniques, such as interpolation re-ordering, multi-dimensional interpolation, and natural cubic splines, to significantly improve compression quality with interpolation-based data prediction. (2) The auto-tuning module in HPEZ is carefully designed with novel strategies, including but not limited to block-wise interpolation tuning, dynamic dimension freezing, and Lorenzo tuning. (3) We thoroughly evaluate HPEZ against many other compressors on six real-world scientific datasets. Experiments show that HPEZ outperforms other high-performance error-bounded lossy compressors in compression ratio by up to 140% under the same error bound and by up to 360% under the same PSNR. In parallel data transfer experiments on a distributed database, HPEZ achieves a significant performance gain, with up to 40% lower time cost than the second-best compressor.
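To make the idea of interpolation-based prediction under an error bound concrete, here is a minimal sketch in Python. It is not HPEZ's implementation. It assumes a 1D array, plain linear interpolation in place of the multi-dimensional and spline interpolations named in the abstract, and uniform quantization of the prediction error with bin width 2·eb. The function names `compress_1d`, `decompress_1d`, and `psnr` are hypothetical, and the PSNR formula is the common value-range-based definition, assumed here rather than taken from the paper.

```python
import numpy as np


def _top_stride(n):
    # Largest power of two strictly below n (1 when n <= 2).
    stride = 1
    while stride * 2 < n:
        stride *= 2
    return stride


def compress_1d(data, eb):
    """Illustrative sketch: return (codes, recon) with |data - recon| <= eb."""
    n = len(data)
    codes = np.zeros(n, dtype=np.int64)
    recon = np.zeros(n, dtype=np.float64)
    # The first point has no neighbors, so it is predicted as 0.
    codes[0] = int(np.round(data[0] / (2 * eb)))
    recon[0] = codes[0] * 2 * eb
    stride = _top_stride(n)
    while stride >= 1:
        # Points at odd multiples of `stride` are predicted from already
        # reconstructed neighbors at distance `stride` on each side.
        for i in range(stride, n, 2 * stride):
            left = recon[i - stride]
            pred = 0.5 * (left + recon[i + stride]) if i + stride < n else left
            # Quantize the prediction error into bins of width 2 * eb.
            codes[i] = int(np.round((data[i] - pred) / (2 * eb)))
            recon[i] = pred + codes[i] * 2 * eb
        stride //= 2
    return codes, recon


def decompress_1d(codes, eb):
    """Replay the same predictions using only the integer codes."""
    n = len(codes)
    recon = np.zeros(n, dtype=np.float64)
    recon[0] = codes[0] * 2 * eb
    stride = _top_stride(n)
    while stride >= 1:
        for i in range(stride, n, 2 * stride):
            left = recon[i - stride]
            pred = 0.5 * (left + recon[i + stride]) if i + stride < n else left
            recon[i] = pred + codes[i] * 2 * eb
        stride //= 2
    return recon


def psnr(original, recon):
    """Value-range-based PSNR in dB (assumed definition)."""
    mse = np.mean((original - recon) ** 2)
    if mse == 0:
        return float("inf")
    value_range = np.max(original) - np.min(original)
    return 20 * np.log10(value_range / np.sqrt(mse))


x = np.sin(np.linspace(0.0, 6.0, 257))
codes, recon = compress_1d(x, 1e-3)
print(np.max(np.abs(x - recon)), psnr(x, recon))
```

In a real compressor the integer codes would then go through an entropy coder. On smooth data most codes are zero or near zero, which is where the compression ratio comes from, and a better predictor concentrates the codes further.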

Award ID(s): 2104023

PAR ID: 10524972

Author(s) / Creator(s): Liu, Jinyang; Di, Sheng; Zhao, Kai; Liang, Xin; Jin, Sian; Jian, Zizhe; Huang, Jiajun; Wu, Shixun; Chen, Zizhong; Cappello, Franck

Publisher / Repository: ACM

Date Published: 2024-03-12

Journal Name: Proceedings of the ACM on Management of Data

Volume: 2

Issue: 1

ISSN: 2836-6573

Page Range / eLocation ID: 1 to 27

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1145/3639259
