NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs

Zhang, Boyuan; Tian, Jiannan; Di, Sheng; Yu, Xiaodong; Swany, Martin; Tao, Dingwen; Cappello, Franck (June 2023, The 37th ACM International Conference on Supercomputing (ICS 2023))

Full Text Available
FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs

https://doi.org/10.1145/3588195.3592994

Zhang, Boyuan; Tian, Jiannan; Di, Sheng; Yu, Xiaodong; Feng, Yunhe; Liang, Xin; Tao, Dingwen; Cappello, Franck (June 2023, The 32nd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2023))

Today’s large-scale scientific applications running on high-performance computing (HPC) systems generate vast data volumes. Thus, data compression is becoming a critical technique to mitigate the storage burden and data-movement cost. However, existing lossy compressors for scientific data cannot achieve a high compression ratio and throughput simultaneously, hindering their adoption in many applications requiring fast compression, such as in-memory compression. To this end, in this work, we develop a fast and high-ratio error-bounded lossy compressor on GPUs for scientific data (called FZ-GPU). Specifically, we first design a new compression pipeline that consists of fully parallelized quantization, bitshuffle, and our newly designed fast encoding. Then, we propose a series of deep architectural optimizations for each kernel in the pipeline to take full advantage of CUDA architectures. We propose a warp-level optimization to avoid data conflicts for bit-wise operations in bitshuffle, maximize shared memory utilization, and eliminate unnecessary data movements by fusing different compression kernels. Finally, we evaluate FZ-GPU on two NVIDIA GPUs (i.e., A100 and RTX A4000) using six representative scientific datasets from SDRBench. Results on the A100 GPU show that FZ-GPU achieves an average speedup of 4.2× over cuSZ and an average speedup of 37.0× over a multi-threaded CPU implementation of our algorithm under the same error bound. FZ-GPU also achieves an average speedup of 2.3× and an average compression ratio improvement of 2.0× over cuZFP under the same data distortion.
more » « less
Full Text Available
SZ3: A Modular Framework for Composing Prediction-Based Error-Bounded Lossy Compressors

https://doi.org/10.1109/TBDATA.2022.3201176

Liang, Xin; Zhao, Kai; Di, Sheng; Li, Sihuan; Underwood, Robert; Gok, Ali M.; Tian, Jiannan; Deng, Junjing; Calhoun, Jon C.; Tao, Dingwen; et al (April 2023, IEEE Transactions on Big Data)

Full Text Available
Optimizing Error-Bounded Lossy Compression for Scientific Data With Diverse Constraints

https://doi.org/10.1109/TPDS.2022.3194695

Liu, Yuanjian; Di, Sheng; Zhao, Kai; Jin, Sian; Wang, Cheng; Chard, Kyle; Tao, Dingwen; Foster, Ian; Cappello, Franck (July 2022, IEEE Transactions on Parallel and Distributed Systems)

Vast volumes of data are produced by today’s scientific simulations and advanced instruments. These data cannot be stored and transferred efficiently because of limited I/O bandwidth, network speed, and storage capacity. Error-bounded lossy compression can be an effective method for addressing these issues: not only can it significantly reduce data size, but it can also control the data distortion based on user-defined error bounds. In practice, many scientific applications have specific requirements or constraints for lossy compression, in order to guarantee that the reconstructed data are valid for post hoc analysis. For example, some datasets contain irrelevant data that should be isolated in particular and users often have intuition regarding value ranges, geospatial regions, and other data subsets that are crucial for subsequent analysis. Existing state-of-the-art error-bounded lossy compressors, however, do not consider these constraints during compression, resulting in inferior compression ratios with respect to user’s post hoc analysis, due to the fact that the data itself provides little or no value for post hoc analysis. In this work we address this issue by proposing an optimized framework that can preserve diverse constraints during the error-bounded lossy compression, e.g., cleaning the irrelevant data, efficiently preserving different precision for multiple value intervals, and allowing users to set diverse precision over both regular and irregular regions. We perform our evaluation on a supercomputer with up to 2,100 cores. Experiments with six real-world applications show that our proposed diverse constraints based error-bounded lossy compressor can obtain a higher visual quality or data fidelity on reconstructed data with the same or even higher compression ratios compared with the traditional state-of-the-art compressor SZ. Our experiments also demonstrate very good scalability in compression performance compared with the I/O throughput of the parallel file system.
more » « less
Full Text Available
TAC: Optimizing Error-Bounded Lossy Compression for Three-Dimensional Adaptive Mesh Refinement Simulations

https://doi.org/10.1145/3502181.3531458

Wang, Daoce; Pulido, Jesus; Grosset, Pascal; Jin, Sian; Tian, Jiannan; Ahrens, James; Tao, Dingwen (June 2022, The 31st ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2022))

Today's scientific simulations require a significant reduction of data volume because of extremely large amounts of data they produce and the limited I/O bandwidth and storage space. Error-bounded lossy compression has been considered one of the most effective solutions to the above problem. However, little work has been done to improve error-bounded lossy compression for Adaptive Mesh Refinement (AMR) simulation data. Unlike the previous work that only leverages 1D compression, in this work, we propose to leverage high-dimensional (e.g., 3D) compression for each refinement level of AMR data. To remove the data redundancy across different levels, we propose three pre-process strategies and adaptively use them based on the data characteristics. Experiments on seven AMR datasets from a real-world large-scale AMR simulation demonstrate that our proposed approach can improve the compression ratio by up to 3.3X under the same data distortion, compared to the state-of-the-art method. In addition, we leverage the flexibility of our approach to tune the error bound for each level, which achieves much lower data distortion on two application-specific metrics.
more » « less
Full Text Available
Ultrafast Error-Bounded Lossy Compression for Scientific Datasets

https://doi.org/10.1145/3502181.3531473

Yu, Xiaodong; Di, Sheng; Zhao, Kai; Tian, Jiannan; Tao, Dingwen; Liang, Xin; Cappello, Franck (June 2022, The 31st ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2022))

Today’s scientific high-performance computing applications and advanced instruments are producing vast volumes of data across a wide range of domains, which impose a serious burden on data transfer and storage. Error-bounded lossy compression has been developed and widely used in the scientific community because it not only can significantly reduce the data volumes but also can strictly control the data distortion based on the user-specified error bound. Existing lossy compressors, however, cannot offer ultrafast compression speed, which is highly demanded by numerous applications or use cases (such as in-memory compression and online instrument data compression). In this paper we propose a novel ultrafast error-bounded lossy compressor that can obtain fairly high compression performance on both CPUs and GPUs and with reasonably high compression ratios. The key contributions are threefold. (1) We propose a generic error-bounded lossy compression framework—called SZx—that achieves ultrafast performance through its novel design comprising only lightweight operations such as bitwise and addition/subtraction operations, while still keeping a high compression ratio. (2) We implement SZx on both CPUs and GPUs and optimize the performance according to their architectures. (3) We perform a comprehensive evaluation with six real-world production-level scientific datasets on both CPUs and GPUs. Experiments show that SZx is 2∼16× faster than the second-fastest existing error-bounded lossy compressor (either SZ or ZFP) on CPUs and GPUs, with respect to both compression and decompression.
more » « less
Full Text Available
cuZ-Checker: A GPU-Based Ultra-Fast Assessment System for Lossy Compressions

https://doi.org/10.1109/Cluster48925.2021.00065

Yu, Xiaodong; Di, Sheng; Gok, Ali Murat; Tao, Dingwen; Cappello, Franck (September 2021, 2021 IEEE International Conference on Cluster Computing (CLUSTER 2021))
null (Ed.)
Full Text Available
Optimizing Error-Bounded Lossy Compression for Scientific Data on GPUs

https://doi.org/10.1109/Cluster48925.2021.00047

Tian, Jiannan; Di, Sheng; Yu, Xiaodong; Rivera, Cody; Zhao, Kai; Jin, Sian; Feng, Yunhe; Liang, Xin; Tao, Dingwen; Cappello, Franck (September 2021, 2021 IEEE International Conference on Cluster Computing (CLUSTER 2021))

Full Text Available

Search for: All records