Search for: All records

Creators/Authors contains: "Fulp, Dakota"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Due to improvements in high-performance computing (HPC) capabilities, many of today's applications produce petabytes of data, causing bottlenecks within the system. Importance-based sampling methods, including our spatio-temporal hybrid data sampling method, can resolve these bottlenecks. While our hybrid method has been shown to outperform existing methods, its effectiveness depends heavily on user parameters such as the number of histogram bins, the error threshold, and the number of regions. Moreover, its throughput must be increased so that the sampling itself does not become a bottleneck. In this article, we resolve both of these issues. First, we assess the effects of several user input parameters and describe techniques to help determine optimal settings. Next, we detail and implement accelerated versions of our method using OpenMP and CUDA; analyzing these implementations, we find throughput improvements of 9.8× to 31.5×. We then demonstrate how our method can accept different base sampling algorithms and examine the effects these algorithms have. Finally, we compare our sampling methods to the lossy compressor cuSZ in terms of data preservation and data movement.
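    The abstract above describes importance-based sampling driven by a histogram over the data values, with the bin count and sampling budget among the user parameters it studies. As a rough illustration only, the Python sketch below keeps points that fall in sparse histogram bins with higher probability; the function name, the bin count, and the sample fraction are assumptions made for illustration and are not taken from the paper.

        import numpy as np

        def histogram_importance_sample(values, n_bins=64, sample_fraction=0.05, seed=0):
            # Hypothetical sketch: bias sampling toward values in sparse histogram
            # bins so rare (feature-rich) points are more likely to be preserved.
            # n_bins and sample_fraction stand in for the kind of user parameters
            # the abstract mentions; the paper's actual scheme may differ.
            rng = np.random.default_rng(seed)
            counts, edges = np.histogram(values, bins=n_bins)
            bin_idx = np.digitize(values, edges[1:-1])        # bin index of each point
            importance = 1.0 / counts[bin_idx]                # inverse bin frequency
            probs = importance / importance.sum()
            n_keep = max(1, int(sample_fraction * len(values)))
            keep = rng.choice(len(values), size=n_keep, replace=False, p=probs)
            return np.sort(keep)

        # Example: retain 5% of a synthetic field, biased toward its rare extremes.
        data = np.random.default_rng(1).normal(size=100_000)
        kept = histogram_importance_sample(data, n_bins=64, sample_fraction=0.05)
        print(f"retained {len(kept)} of {len(data)} points")

    In this sketch the bin count trades off how finely "rare" values are distinguished against histogram cost, which mirrors why the paper treats such parameters as tuning knobs.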
  2. Progress in high-performance computing (HPC) systems has led to complex applications that stress the I/O subsystem by creating vast amounts of data. Lossy compression reduces data size considerably, but a single error can render lossy compressed data unusable. This sensitivity stems from the high information content per bit of compressed data and is a critical issue as soft errors that cause bit flips have become increasingly common in HPC systems. While many works have improved lossy compressor performance, few have sought to address this critical weakness. This paper presents ARC: Automated Resiliency for Compression. Given user-defined constraints on storage, throughput, and resiliency, ARC automatically determines the optimal error-correcting code (ECC) configuration before encoding data. We conduct an extensive fault injection study to understand the effects of soft errors on lossy compressed data and how best to protect it. We evaluate ARC's scalability, performance, resiliency, and ease of use. On a 40-core node, encoding and decoding achieve throughputs of up to 3730 MB/s and 3602 MB/s, respectively. ARC also detects and corrects multi-bit errors with tunable storage and throughput overhead. Finally, we demonstrate the ease of using ARC and show how to take a system's failure rate into account when choosing the constraints.
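    The second abstract describes selecting an error-correcting code configuration from user constraints on storage, throughput, and resiliency. The sketch below illustrates that kind of constraint-driven selection; the candidate schemes, their overhead and throughput numbers, and the selection rule are placeholders for illustration and do not come from ARC itself.

        from dataclasses import dataclass

        @dataclass
        class ECCScheme:
            name: str
            storage_overhead: float   # extra bytes added by parity, as a fraction
            throughput_mb_s: float    # illustrative encode throughput
            correctable_bits: int     # bit errors correctable per code word

        # Hypothetical candidates; ARC's real ECC space and measured rates differ.
        CANDIDATES = [
            ECCScheme("parity-only",        0.01,  6000, 0),
            ECCScheme("hamming-72-64",      0.125, 4500, 1),
            ECCScheme("reed-solomon-light", 0.25,  2500, 4),
            ECCScheme("reed-solomon-heavy", 0.50,  1200, 16),
        ]

        def pick_scheme(max_overhead, min_throughput_mb_s, min_correctable_bits):
            # Keep only schemes that satisfy every user constraint, then prefer the
            # most resilient one, breaking ties toward lower storage overhead.
            feasible = [s for s in CANDIDATES
                        if s.storage_overhead <= max_overhead
                        and s.throughput_mb_s >= min_throughput_mb_s
                        and s.correctable_bits >= min_correctable_bits]
            if not feasible:
                raise ValueError("no ECC configuration satisfies the constraints")
            return max(feasible, key=lambda s: (s.correctable_bits, -s.storage_overhead))

        # Example: allow 30% extra storage, require 2 GB/s encode, correct >= 2 bit flips.
        print(pick_scheme(max_overhead=0.30, min_throughput_mb_s=2000, min_correctable_bits=2))

    The example constraints echo the abstract's framing: a system's observed soft-error rate would push the minimum correctable bits up, while storage and throughput budgets bound how much protection is affordable.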