Title: FRaZ: A Generic High-Fidelity Fixed-Ratio Lossy Compression Framework for Scientific Floating-point Data
With ever-increasing volumes of scientific floating-point data being produced by high-performance computing applications, significantly reducing scientific floating-point data size is critical, and error-controlled lossy compressors have been developed for years. None of the existing scientific floating-point lossy data compressors, however, support effective fixed-ratio lossy compression. Yet fixed-ratio lossy compression for scientific floating-point data not only compresses to the requested ratio but also respects a user-specified error bound with higher fidelity. In this paper, we present FRaZ: a generic fixed-ratio lossy compression framework respecting user-specified error constraints. The contribution is twofold. (1) We develop an efficient iterative approach to accurately determine the appropriate error settings for different lossy compressors based on target compression ratios. (2) We perform a thorough performance and accuracy evaluation for our proposed fixed-ratio compression framework with multiple state-of-the-art error-controlled lossy compressors, using several real-world scientific floating-point datasets from different domains. Experiments show that FRaZ effectively identifies the optimum error setting in the entire error setting space of any given lossy compressor. While fixed-ratio lossy compression is slower than fixed-error compression, it provides an important new lossy compression technique for users of very large scientific floating-point datasets.
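To make the idea of fixed-ratio compression concrete, the following is a minimal sketch of searching for an error bound that yields a requested compression ratio. It is not FRaZ's algorithm, which this abstract does not detail: it assumes a toy error-bounded compressor (uniform quantization followed by zlib) and a plain bisection in log space, and it assumes the ratio grows roughly monotonically with the error bound. The names `toy_compress`, `toy_decompress`, `compression_ratio`, and `find_error_bound` are hypothetical and exist only for this illustration.

```python
import math
import struct
import zlib


def toy_compress(values, error_bound):
    """Toy error-bounded compressor (an assumption, not a real library):
    quantize to multiples of 2*error_bound, then deflate the integer codes."""
    step = 2.0 * error_bound
    codes = [round(v / step) for v in values]
    return zlib.compress(struct.pack(f"<{len(codes)}q", *codes))


def toy_decompress(blob, error_bound):
    """Inverse of toy_compress; each value is off by at most error_bound,
    up to floating-point rounding."""
    raw = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(raw) // 8}q", raw)
    step = 2.0 * error_bound
    return [c * step for c in codes]


def compression_ratio(values, error_bound):
    """Original size (64-bit floats) divided by compressed size."""
    return 8 * len(values) / len(toy_compress(values, error_bound))


def find_error_bound(values, target_ratio, lo=1e-9, hi=1.0, tol=0.05, max_iter=40):
    """Bisect in log space for an error bound whose ratio is near target_ratio.
    Returns the (error_bound, ratio) pair closest to the target among all
    candidates tried, including both endpoints."""
    best = None

    def consider(eb):
        nonlocal best
        ratio = compression_ratio(values, eb)
        if best is None or abs(ratio - target_ratio) < abs(best[1] - target_ratio):
            best = (eb, ratio)
        return ratio

    consider(lo)
    consider(hi)
    for _ in range(max_iter):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        ratio = consider(mid)
        if abs(ratio - target_ratio) <= tol * target_ratio:
            break
        if ratio < target_ratio:
            lo = mid  # too little compression: loosen the error bound
        else:
            hi = mid  # too much compression: tighten the error bound
    return best


values = [math.sin(0.01 * i) for i in range(10000)]
eb, ratio = find_error_bound(values, target_ratio=10.0)
print(f"error bound {eb:.3e} gives ratio {ratio:.2f}")
```

Every candidate error bound costs one full compression pass, which is consistent with the abstract's remark that fixed-ratio compression is slower than fixed-error compression. A real compressor's ratio is not guaranteed to be monotonic in the error bound, so a bisection like this one can miss the best setting.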
Award ID(s):
1910197 1633608
NSF-PAR ID:
10193356
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Page Range / eLocation ID:
567 to 577
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With ever-increasing volumes of scientific floating-point data being produced by high-performance computing applications, significantly reducing scientific floating-point data size is critical, and error-controlled lossy compressors have been developed for years. None of the existing scientific floating-point lossy data compressors, however, support effective fixed-ratio lossy compression. Yet fixed-ratio lossy compression for scientific floating-point data not only compresses to the requested ratio but also respects a user-specified error bound with higher fidelity. In this paper, we present FRaZ: a generic fixed-ratio lossy compression framework respecting user-specified error constraints. The contribution is twofold. (1) We develop an efficient iterative approach to accurately determine the appropriate error settings for different lossy compressors based on target compression ratios. (2) We perform a thorough performance …
  2. With the ever-increasing execution scale of high-performance computing (HPC) applications, vast amounts of data are being produced by scientific research every day. Error-bounded lossy compression has been considered a very promising solution to the big-data issue for scientific applications, because it can significantly reduce the data volume at low time cost while allowing users to control the compression errors with a specified error bound. The existing error-bounded lossy compressors, however, are all based on inflexible designs or compression pipelines, which cannot adapt to the diverse compression quality requirements and metrics favored by different application users. In this paper, we propose a novel dynamic quality-metric-oriented error-bounded lossy compression framework, namely QoZ. The contribution is threefold. (1) We design a novel highly parameterized multi-level interpolation-based data predictor, which can significantly improve the overall compression quality at the same compressed size. (2) We design the error-bounded lossy compression framework QoZ based on the adaptive predictor, which can auto-tune the critical parameters and optimize the compression result according to user-specified quality metrics during online compression. (3) We evaluate QoZ carefully by comparing its compression quality with multiple state-of-the-art compressors on various real-world scientific application datasets. Experiments show that, compared with the second-best lossy compressor, QoZ can achieve up to 70% compression ratio improvement under the same error bound, up to 150% compression ratio improvement under the same PSNR, or up to 270% compression ratio improvement under the same SSIM.
  3. High-performance computing (HPC) systems that run scientific simulations of significance produce a large amount of data during runtime. Transferring or storing such big datasets causes a severe I/O bottleneck and a considerable storage burden. Applying compression techniques, particularly lossy compressors, can reduce the size of the data and mitigate such overheads. Unlike lossless compression algorithms, error-controlled lossy compressors can significantly reduce the data size while respecting the user-defined error bound. DCTZ is a transform-based lossy compressor with highly efficient encoding and a purpose-built error control mechanism that achieves high compression ratios with high data fidelity. However, since DCTZ quantizes the DCT coefficients in the frequency domain, it may only partially control the relative error bound defined by the user. In this paper, we aim to improve the compression quality of DCTZ. Specifically, we propose a preconditioning method based on level offsetting and scaling to control the magnitude of the input to the DCTZ framework, thereby enforcing stricter error bounds. We evaluate the performance of our method in terms of compression ratio and rate distortion with real-world HPC datasets. Our experimental results show that our method can achieve a higher compression ratio than other state-of-the-art lossy compressors with a tighter error bound while precisely guaranteeing the user-defined error bound.
  4. Today’s scientific high-performance computing applications and advanced instruments are producing vast volumes of data across a wide range of domains, which impose a serious burden on data transfer and storage. Error-bounded lossy compression has been developed and widely used in the scientific community because it not only can significantly reduce the data volumes but also can strictly control the data distortion based on the user-specified error bound. Existing lossy compressors, however, cannot offer ultrafast compression speed, which is in high demand for numerous applications and use cases (such as in-memory compression and online instrument data compression). In this paper, we propose a novel ultrafast error-bounded lossy compressor that achieves fairly high compression performance on both CPUs and GPUs with reasonably high compression ratios. The key contributions are threefold. (1) We propose a generic error-bounded lossy compression framework, called SZx, that achieves ultrafast performance through its novel design comprising only lightweight operations such as bitwise and addition/subtraction operations, while still keeping a high compression ratio. (2) We implement SZx on both CPUs and GPUs and optimize the performance according to their architectures. (3) We perform a comprehensive evaluation with six real-world production-level scientific datasets on both CPUs and GPUs. Experiments show that SZx is 2∼16× faster than the second-fastest existing error-bounded lossy compressor (either SZ or ZFP) on CPUs and GPUs, with respect to both compression and decompression.
  5. Scientific simulations run by high-performance computing (HPC) systems produce a large amount of data, which causes an extreme I/O bottleneck and a huge storage burden. Applying compression techniques can mitigate such overheads by reducing the data size. Unlike traditional lossless compressors, error-controlled lossy compressors such as SZ, ZFP, and DCTZ, designed for scientists who demand not only high compression ratios but also a guarantee of a certain degree of precision, are coming into prominence. While the rate-distortion efficiency of recent lossy compressors, especially the DCT-based one, is promising due to its high-compression encoding, the overall coding architecture is still conservative, necessitating quantization that strikes a balance between different encoding possibilities and varying rate-distortions. In this paper, we aim to improve the performance of the DCT-based compressor, namely DCTZ, by optimizing the quantization model and encoding mechanism. Specifically, we propose a bit-efficient quantizer based on the DCTZ framework, develop a unique ordering mechanism based on the quantization table, and extend the encoding index. We evaluate the performance of our optimized DCTZ in terms of rate-distortion using real-world HPC datasets. Our experimental evaluations demonstrate that, on average, our proposed approach can improve the compression ratio of the original DCTZ by 1.38x. Moreover, combined with the extended encoding mechanism, the optimized DCTZ shows competitive performance with the state-of-the-art lossy compressors SZ and ZFP.