NSF PAR Search | NSF Public Access Repository

STZ: A High Quality and High Speed Streaming Lossy Compression Framework for Scientific Data

https://doi.org/10.1145/3712285.3759795

Wang, Daoce; Grosset, Pascal; Pulido, Jesus; Tian, Jiannan; Athawale, Tushar; Jia, Jinda; Sun, Baixi; Zhang, Boyuan; Jin, Sian; Zhao, Kai; et al. (November 2025, ACM/IEEE)

Full Text Available
BMQSim: Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework

Zhang, Boyuan; Fang, Bo; Ye, Fanjiang; Guo, Luanzheng; Song, Fengguang; Tallent, Nathan; Tao, Dingwen (June 2025, ACM)

Full Text Available
Pushing the Limits of GPU Lossy Compression: A Hierarchical Delta Approach

Zhang, Boyuan; Huang, Yafan; Di, Sheng; Song, Fengguang; Li, Guanpeng; Cappello, Franck (June 2025, ICS'25, ACM)

Full Text Available
COMPSO: Optimizing Gradient Compression for Distributed Training with Second-Order Optimizers

https://doi.org/10.1145/3710848.3710852

Sun, Baixi; Liu, Weijin; Pauloski, J Gregory; Tian, Jiannan; Jia, Jinda; Wang, Daoce; Zhang, Boyuan; Zheng, Mingkai; Di, Sheng; Jin, Sian; et al. (February 2025, ACM)

Full Text Available
A High-Quality Workflow for Multi-Resolution Scientific Data Reduction and Visualization

https://doi.org/10.1109/SC41406.2024.00091

Wang, Daoce; Grosset, Pascal; Pulido, Jesus; Athawale, Tushar M; Tian, Jiannan; Zhao, Kai; Lukić, Zarija; Huebl, Axel; Wang, Zhe; Ahrens, James; et al. (November 2024, IEEE)

Full Text Available
Concealing Compression-accelerated I/O for HPC Applications through In Situ Task Scheduling

https://doi.org/10.1145/3627703.3629573

Jin, Sian; Di, Sheng; Vivien, Frédéric; Wang, Daoce; Robert, Yves; Tao, Dingwen; Cappello, Franck (April 2024, ACM)

Full Text Available
AMRIC: A Novel In Situ Lossy Compression Framework for Efficient I/O in Adaptive Mesh Refinement Applications

https://doi.org/10.1145/3581784.3613212

Wang, Daoce; Pulido, Jesus; Grosset, Pascal; Tian, Jiannan; Jin, Sian; Tang, Houjun; Sexton, Jean; Di, Sheng; Zhao, Kai; Fang, Bo; et al. (November 2023, ACM)
GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs

Zhang, Boyuan; Tian, Jiannan; Di, Sheng; Yu, Xiaodong; Swany, Martin; Tao, Dingwen; Cappello, Franck (June 2023, The 37th ACM International Conference on Supercomputing (ICS 2023))

Full Text Available
FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs

https://doi.org/10.1145/3588195.3592994

Zhang, Boyuan; Tian, Jiannan; Di, Sheng; Yu, Xiaodong; Feng, Yunhe; Liang, Xin; Tao, Dingwen; Cappello, Franck (June 2023, The 32nd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2023))

Today’s large-scale scientific applications running on high-performance computing (HPC) systems generate vast data volumes, making data compression a critical technique for mitigating storage burden and data-movement cost. However, existing lossy compressors for scientific data cannot achieve a high compression ratio and high throughput simultaneously, which hinders their adoption in applications that require fast compression, such as in-memory compression. To this end, we develop FZ-GPU, a fast and high-ratio error-bounded lossy compressor on GPUs for scientific data. Specifically, we first design a new compression pipeline that consists of fully parallelized quantization, bitshuffle, and our newly designed fast encoding. Then, we propose a series of deep architectural optimizations for each kernel in the pipeline to take full advantage of CUDA architectures: a warp-level optimization that avoids data conflicts for bit-wise operations in bitshuffle, maximized shared memory utilization, and kernel fusion that eliminates unnecessary data movements. Finally, we evaluate FZ-GPU on two NVIDIA GPUs (A100 and RTX A4000) using six representative scientific datasets from SDRBench. Results on the A100 GPU show that FZ-GPU achieves an average speedup of 4.2× over cuSZ and an average speedup of 37.0× over a multi-threaded CPU implementation of our algorithm under the same error bound. FZ-GPU also achieves an average speedup of 2.3× and an average compression ratio improvement of 2.0× over cuZFP under the same data distortion.
Full Text Available
SZ3: A Modular Framework for Composing Prediction-Based Error-Bounded Lossy Compressors

https://doi.org/10.1109/TBDATA.2022.3201176

Liang, Xin; Zhao, Kai; Di, Sheng; Li, Sihuan; Underwood, Robert; Gok, Ali M.; Tian, Jiannan; Deng, Junjing; Calhoun, Jon C.; Tao, Dingwen; et al. (April 2023, IEEE Transactions on Big Data)

Full Text Available
