
Title: Impact of Lossy Compression Errors on Passive Seismic Data Analyses
Abstract

New technologies such as low-cost nodes and distributed acoustic sensing (DAS) are making it easier to continuously collect broadband, high-density seismic monitoring data. To reduce the time needed to move data from the field to computing centers, reduce archival requirements, and speed up interactive data analysis and visualization, we are motivated to investigate the use of lossy compression on passive seismic array data. In particular, there is a need not only to quantify the errors in the raw data but also to characterize the spectra of these errors and the extent to which they propagate into results such as the detectability and arrival-time picks of microseismic events. We compare three types of lossy compression: sparse thresholded wavelet compression, zfp compression, and low-rank singular value decomposition compression. We apply these techniques to two publicly available datasets: an urban dark fiber DAS experiment and a surface DAS array above a geothermal field. We find that, depending on the level of compression needed and the importance of preserving large versus small seismic events, different compression schemes are preferable.
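The three compression families named above can be sketched in a few lines of Python. The snippet below is a minimal illustration, not the authors' implementation: the synthetic array, wavelet choice, retained-coefficient fraction, rank, and the zfpy call (shown commented out) are assumptions made here for demonstration only.

```python
import numpy as np
import pywt  # PyWavelets: used for the sparse thresholded wavelet sketch

rng = np.random.default_rng(0)
# Stand-in for a DAS strain-rate block: channels x time samples (synthetic).
data = rng.standard_normal((256, 4096)).astype(np.float32)

# 1) Sparse thresholded wavelet compression: transform each channel,
#    keep only the largest-magnitude coefficients, reconstruct.
coeffs = pywt.wavedec(data, "db4", axis=1, level=4)
all_abs = np.concatenate([np.abs(c).ravel() for c in coeffs])
thresh = np.quantile(all_abs, 0.95)                        # keep ~5% of coefficients
coeffs_t = [pywt.threshold(c, thresh, mode="hard") for c in coeffs]
recon_wavelet = pywt.waverec(coeffs_t, "db4", axis=1)

# 2) Low-rank SVD compression: keep the leading k singular components.
k = 20
U, s, Vt = np.linalg.svd(data, full_matrices=False)
recon_svd = (U[:, :k] * s[:k]) @ Vt[:k, :]                 # rank-k approximation

# 3) zfp compression (error-bounded), via the optional zfpy bindings:
# import zfpy
# buf = zfpy.compress_numpy(data, tolerance=1e-2)
# recon_zfp = zfpy.decompress_numpy(buf)

def rms_error(x, y):
    return float(np.sqrt(np.mean((x - y) ** 2)))

print("wavelet RMS error:", rms_error(data, recon_wavelet))
print("SVD RMS error:    ", rms_error(data, recon_svd))
```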

 
Award ID(s):
2227018
PAR ID:
10520109
Author(s) / Creator(s):
Publisher / Repository:
Seismological Society of America
Date Published:
Journal Name:
Seismological Research Letters
Volume:
95
Issue:
3
ISSN:
0895-0695
Page Range / eLocation ID:
1675 to 1686
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Abstract

     Geolocalization of distributed acoustic sensing (DAS) array channels represents a crucial step whenever the technology is deployed in the field. Commonly, the geolocalization is performed using point-wise active-source experiments, known as tap tests, conducted in the vicinity of the recording fiber. However, these controlled-source experiments are time consuming and greatly diminish the ability to promptly deploy such systems, especially for large-scale DAS experiments. We present a geolocalization methodology for DAS instrumentation that relies on seismic signals generated by a geotracked vehicle. We demonstrate the efficacy of our workflow by geolocating the channels of two DAS systems recording data on dark fibers stretching approximately 100 km within the Long Valley caldera area in eastern California. Our procedure permits the prompt calibration of DAS channel locations for seismic-related applications such as seismic hazard assessment, urban-noise monitoring, wavespeed inversion, and earthquake engineering. We share the developed set of codes along with a tutorial guiding users through the entire mapping process.
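The abstract summarizes the workflow without implementation detail; the sketch below shows one plausible way such a channel-to-position mapping could be done, assuming each channel is assigned the vehicle's GPS position at the time its vehicle-induced energy peaks. The function name, smoothing window, and interpolation scheme are hypothetical and are not taken from the authors' released codes.

```python
import numpy as np

def geolocate_channels(das_data, das_times, gps_times, gps_lat, gps_lon):
    """Hypothetical DAS-channel geolocation from a geotracked vehicle.

    das_data : (n_channels, n_samples) strain-rate array
    das_times: (n_samples,) sample times (e.g., epoch seconds)
    gps_*    : vehicle positions at gps_times (gps_times must be increasing)
    """
    # Smooth energy envelope per channel (moving average of squared amplitude).
    energy = das_data.astype(float) ** 2
    win = 201
    kernel = np.ones(win) / win
    smoothed = np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode="same"), 1, energy
    )

    # Time at which vehicle-induced energy peaks on each channel.
    peak_times = das_times[np.argmax(smoothed, axis=1)]

    # Interpolate the vehicle track to those times -> one (lat, lon) per channel.
    lat = np.interp(peak_times, gps_times, gps_lat)
    lon = np.interp(peak_times, gps_times, gps_lon)
    return lat, lon
```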
  2. SUMMARY

    In this study, I demonstrate that raw distributed acoustic sensing (DAS) strain rate data can be used directly to estimate spectral source parameters through an Empirical Green's Function (EGF) deconvolution analysis. Previously, DAS had been widely used in passive seismology to image the subsurface and analyze ground motion variations by converting strain or strain rate to particle velocity or acceleration prior to analysis. In this study, spectral analysis is applied to the PoroTomo joint DAS and seismic Nodal array in the Brady Hot Springs geothermal field to obtain source parameters for two M4 earthquakes via EGF analysis, where nearly collocated smaller events are used as EGFs to remove path and site effects. The EGF workflow is applied to raw DAS strain rate data, without conversion to particle velocities, and to raw Nodal seismic data. The DAS and Nodal results are very consistent, with similar spectral ratios, corner frequencies, and moment ratios for the same event pairs. The uncertainty of the stacked spectral measurements is much lower on the DAS array, suggesting better stability of the spectral shape measurement, possibly due to the much denser spatial sampling. The uncertainty due to model fitting is similar between the DAS and Nodal arrays, with slightly lower uncertainty on the DAS array. These observations demonstrate the potential for directly using strain rate measurements from DAS arrays for earthquake source characterization.
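As a rough illustration of the spectral-ratio step described above (not the study's code), the sketch below divides a target event's amplitude spectrum by that of a nearly collocated smaller (EGF) event and fits a Brune-type omega-square ratio model to recover corner frequencies and the moment ratio; the windowing, model form, and starting values are generic assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def amplitude_spectrum(trace, dt):
    """One-sided amplitude spectrum of a tapered waveform window."""
    freqs = np.fft.rfftfreq(len(trace), d=dt)
    return freqs, np.abs(np.fft.rfft(trace * np.hanning(len(trace))))

def brune_ratio(f, moment_ratio, fc_target, fc_egf):
    """Spectral ratio of two omega-square (Brune) source spectra."""
    return moment_ratio * (1.0 + (f / fc_egf) ** 2) / (1.0 + (f / fc_target) ** 2)

def fit_corner_frequencies(target, egf, dt):
    """Fit corner frequencies and moment ratio from a target/EGF event pair."""
    f, spec_t = amplitude_spectrum(target, dt)
    _, spec_e = amplitude_spectrum(egf, dt)
    ratio = spec_t[1:] / (spec_e[1:] + 1e-20)      # skip the zero-frequency bin
    popt, _ = curve_fit(
        brune_ratio, f[1:], ratio,
        p0=[ratio[:5].mean(), 1.0, 10.0],          # rough low-frequency level, fc guesses
        bounds=([0.0, 0.01, 0.01], [np.inf, 100.0, 100.0]),
    )
    moment_ratio, fc_target, fc_egf = popt
    return moment_ratio, fc_target, fc_egf
```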

     
  3. Lossy compression algorithms are effective tools to reduce the size of high-performance computing (HPC) data sets. As established lossy compressors such as SZ and ZFP evolve, they seek to improve the compression/decompression bandwidth and the compression ratio. Algorithm improvements may alter the spatial distribution of errors in the compressed data even when the same error bound and error bound type are used. If HPC applications are to compute on lossy-compressed data, application users require an understanding of how the performance and the spatial distribution of error change. We explore how the spatial distribution of error, the compression/decompression bandwidth, and the compression ratio change for HPC data sets from the applications PlasComCM and Nek5000 across various versions of SZ and ZFP. In addition, we explore how the spatial distribution of error impacts application correctness when restarting from lossy-compressed checkpoints. We verify that known approaches to selecting error tolerances for lossy-compressed checkpointing are robust to compressor selection and to changes in the distribution of error.
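A minimal sketch of inspecting the pointwise error field left by an error-bounded compressor is shown below. It assumes the zfpy Python bindings for ZFP and an absolute error tolerance; it is illustrative rather than the paper's evaluation harness, and any other error-bounded compressor could be substituted.

```python
import numpy as np
import zfpy  # Python bindings for ZFP (absolute-error-bounded mode used here)

def error_field_summary(field, tolerance):
    """Compress/decompress a field and summarize the resulting error distribution."""
    buf = zfpy.compress_numpy(field, tolerance=tolerance)
    recon = zfpy.decompress_numpy(buf)
    err = recon - field                               # pointwise (spatial) error field

    summary = {
        "compression_ratio": field.nbytes / len(buf),
        "max_abs_error": float(np.max(np.abs(err))),  # should respect the bound
        "rms_error": float(np.sqrt(np.mean(err ** 2))),
        # Coarse picture of how the error values are distributed:
        "error_histogram": np.histogram(err, bins=50),
    }
    return err, summary
```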
  4. With the ever-increasing execution scale of high performance computing (HPC) applications, vast amounts of data are produced by scientific research every day. Error-bounded lossy compression has been considered a very promising solution to the big-data issue for scientific applications, because it can significantly reduce the data volume at low time cost while allowing users to control the compression errors with a specified error bound. The existing error-bounded lossy compressors, however, are all developed based on inflexible designs or compression pipelines, which cannot adapt to the diverse compression quality requirements/metrics favored by different application users. In this paper, we propose a novel dynamic, quality-metric-oriented, error-bounded lossy compression framework, namely QoZ. The detailed contribution is threefold. (1) We design a novel highly parameterized multi-level interpolation-based data predictor, which can significantly improve the overall compression quality at the same compressed size. (2) We design the error-bounded lossy compression framework QoZ based on the adaptive predictor, which can auto-tune the critical parameters and optimize the compression result according to user-specified quality metrics during online compression. (3) We evaluate QoZ carefully by comparing its compression quality with multiple state-of-the-art compressors on various real-world scientific application datasets. Experiments show that, compared with the second-best lossy compressor, QoZ can achieve up to 70% compression ratio improvement under the same error bound, up to 150% compression ratio improvement under the same PSNR, or up to 270% compression ratio improvement under the same SSIM.
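As a small illustration of the kind of quality metric such a framework can target (not QoZ's internals), the sketch below computes PSNR from the data's value range and mean squared error, alongside the compression-ratio bookkeeping used when comparing compressors at equal quality.

```python
import numpy as np

def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB, using the data's value range as the peak."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    value_range = float(original.max() - original.min())
    return 20.0 * np.log10(value_range) - 10.0 * np.log10(mse)

def compression_ratio(original_bytes, compressed_bytes):
    """Ratio used when comparing compressors at the same error bound, PSNR, or SSIM."""
    return original_bytes / compressed_bytes
```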
  5. Recent years have witnessed an upsurge of interest in lossy compression due to its potential to significantly reduce data volume by exploiting the spatiotemporal properties of IoT datasets. However, striking a balance between compression ratio and data fidelity is challenging, particularly when the loss of fidelity noticeably impacts downstream data analytics. In this paper, we propose a lossy prediction model for binary classification analytics tasks that minimizes the impact of the error introduced by lossy compression. We specifically focus on five classification algorithms for frost prediction in agricultural fields, where predictive advisories provide helpful information for timely preparation and services. While our experimental evaluations reaffirm the nature of lossy compression, in which allowing higher errors yields higher compression ratios, we also observe that the classification performance in terms of accuracy and F1 score differs among the algorithms we evaluated. Specifically, random forest is the best lossy prediction model for classifying frost. Lastly, we show the robustness of the lossy prediction model's performance with respect to data fidelity.
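A hypothetical end-to-end check of this idea is sketched below: synthetic weather-like features are degraded by a simple uniform quantizer standing in for an error-bounded lossy compressor, and a random forest classifier's accuracy and F1 score are compared across error bounds. The data, features, and quantizer are assumptions for illustration, not the paper's dataset or compressor.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

def quantize(x, error_bound):
    """Uniform quantization: a crude stand-in for an error-bounded lossy compressor."""
    return np.round(x / (2 * error_bound)) * (2 * error_bound)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))                              # stand-in weather features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) < -0.5).astype(int)  # "frost"

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for bound in [0.0, 0.1, 0.5, 1.0]:
    # bound == 0.0 means no compression (original fidelity).
    X_tr = X_train if bound == 0 else quantize(X_train, bound)
    X_te = X_test if bound == 0 else quantize(X_test, bound)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_train)
    pred = clf.predict(X_te)
    print(f"error bound {bound}: accuracy={accuracy_score(y_test, pred):.3f}, "
          f"F1={f1_score(y_test, pred):.3f}")
```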