A review of cloud computing and storage in seismology

Ni, Yiyu (ORCID:0000000151819700); Denolle, Marine A. (ORCID:0000000216102250); Münchmeyer, Jannes (ORCID:0000000240069673); Wang, Yinzhi (ORCID:0000000185050223); Feng, Kuan-Fu; Garcia Jurado Suarez, Carlos; Thomas, Amanda M.; Trabant, Chad; Hamilton, Alex; Mencin, David

doi:10.1093/gji/ggaf322

SUMMARY Seismology has entered the petabyte era, driven by decades of continuous recordings from broad-band networks, the increase in nodal seismic experiments and the recent emergence of distributed acoustic sensing (DAS). This review explains how cloud platforms, by providing object storage, elastic compute and managed databases, enable researchers to ‘bring the code to the data,’ offering a scalable way to overcome the bandwidth and capacity limitations of traditional HPC solutions. After reviewing the literature on cloud concepts and their research applications in seismology, we illustrate the capabilities of cloud-native workflows using two canonical end-to-end demonstrations: (1) ambient noise seismology that calculates cross-correlation functions at scale, and (2) earthquake detection and phase picking. Both workflows use Amazon Web Services, a commercial cloud platform, for streaming I/O and provenance, demonstrating that cloud throughput can rival on-premises HPC at comparable costs, scanning 100 TB to 1.3 PB of seismic data in a few hours or days of processing. The review also discusses research and education initiatives, the reproducibility benefits of containers and the cost pitfalls (e.g. egress and I/O fees) of energy-intensive seismological research computing. While designing cloud pipelines remains non-trivial, partnerships with research software engineers enable converting domain code into scalable, automated and environmentally conscious solutions for next-generation seismology. We also outline where cloud resources fall short of specialized HPC, most notably for tightly coupled petascale simulations and long-term, PB-scale archives, so that practitioners can make informed, cost-effective choices.
