Title: PubDAS: A PUBlic Distributed Acoustic Sensing Datasets Repository for Geosciences
Abstract: During the past few years, distributed acoustic sensing (DAS) has become an invaluable tool for recording high-fidelity seismic wavefields with great spatiotemporal resolutions. However, the considerable amount of data generated during DAS experiments limits their distribution with the broader scientific community. Such a bottleneck inherently slows down the pursuit of new scientific discoveries in geosciences. Here, we introduce PubDAS, the first large-scale open-source repository where several DAS datasets from multiple experiments are publicly shared. PubDAS currently hosts eight datasets covering a variety of geological settings (e.g., urban centers, underground mines, and seafloor), spanning from several days to several years, offering both continuous and triggered active source recordings, and totaling up to ∼90 TB of data. This article describes these datasets, their metadata, and how to access and download them. Some of these datasets have only been shallowly explored, leaving the door open for new discoveries in Earth sciences and beyond.
Award ID(s):
2022716
PAR ID:
10437086
Journal Name:
Seismological Research Letters
Volume:
94
Issue:
2A
ISSN:
0895-0695
Page Range / eLocation ID:
983 to 998
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In the past decade, distributed acoustic sensing (DAS) has enabled many new monitoring applications in diverse fields including hydrocarbon exploration and extraction; induced, local, regional, and global seismology; infrastructure and urban monitoring; and several others. However, to date, the open-source software ecosystem for handling DAS data is relatively immature. Here we introduce DASCore, a Python library for analyzing, visualizing, and managing DAS data. DASCore implements an object-oriented interface for performing common data processing and transformations, reading and writing various DAS file types, creating simple visualizations, and managing file system-based DAS archives. DASCore also integrates with other Python-based tools which enable the processing of massive data sets in cloud environments. DASCore is the foundational package for the broader DAS data analysis ecosystem (DASDAE), and as such its main goal is to facilitate the development of other DAS libraries and applications. 
  2. Material characterization techniques are widely used to characterize the physical and chemical properties of materials at the nanoscale and, thus, play central roles in materials science discoveries. However, the large and complex datasets generated by these techniques often require significant human effort to interpret and extract meaningful physicochemical insights. Artificial intelligence (AI) techniques such as machine learning (ML) have the potential to improve the efficiency and accuracy of surface analysis by automating data analysis and interpretation. In this perspective paper, we review the current role of AI in surface analysis and discuss its future potential to accelerate discoveries in surface science, materials science, and interface science. We highlight several applications where AI has already been used to analyze surface analysis data, including the identification of crystal structures from XRD data, analysis of XPS spectra for surface composition, and the interpretation of TEM and SEM images for particle morphology and size. We also discuss the challenges and opportunities associated with the integration of AI into surface analysis workflows. These include the need for large and diverse datasets for training ML models, the importance of feature selection and representation, and the potential for ML to enable new insights and discoveries by identifying patterns and relationships in complex datasets. Most importantly, AI-based analysis must not just find the best mathematical description of the data; it must also yield the most physically and chemically meaningful results. In addition, the need for reproducibility in scientific research has become increasingly important in recent years. The advancement of AI, including both conventional methods and increasingly popular deep learning, is showing promise in addressing those challenges by enabling the execution and verification of scientific progress. By training models on large experimental datasets and providing automated analysis and data interpretation, AI can help to ensure that scientific results are reproducible and reliable. Although integration of knowledge and AI models must be considered for the transparency and interpretability of models, the incorporation of AI into the data collection and processing workflow will significantly enhance the efficiency and accuracy of various surface analysis techniques and deepen our understanding at an accelerated pace.
  3. The study of seabirds can provide a fascinating subject for the integration of datasets and data practices with scientific phenomena. Workshop participants will examine trends and correlations in several decades of National Audubon Society data about puffins, using an accessible open-source education data tool (CODAP). They will examine relationships among variables including sea surface temperature, fish in the puffin diet, fledgling weight, and survival to breeding age. They will use present-day data from puffin webcams and sound recordings to supplement their work with historical datasets. They will train an artificial intelligence (AI) system to differentiate puffin vocalizations from those of other birds and puffin images from other bird images. 
  4. Geologic collections play a fundamental role in advancing scientific discoveries, offering rich global archives reaching deep into Earth's past. Realizing the scientific potential of these remarkable resources will require the generation and curation of usable data to facilitate open science and data synthesis efforts. Although core scanners offer an efficient, nondestructive way to acquire cm‐resolution data sets on archived cores, material properties may have been altered over time, making comparisons difficult. To assess the promise of core‐scanner measurements to support studies using decades‐old cores, we scanned the 1961 Project Mohole cores, which were the first cores obtained by deep‐water scientific ocean drilling. We examined cores from the Experimental Mohole Guadalupe site with new X‐ray fluorescence, magnetic susceptibility, and line scan camera measurements, and used the new data to re‐evaluate measurements made up to 64 years ago. We show that new measurements can validate and enhance the original analyses performed on the cores, and that even cores from the dawn of scientific ocean drilling retain valuable information waiting to be retrieved.
  5. In pursuit of scientific discovery, vast collections of unstructured structural and functional images are acquired; however, only an infinitesimally small fraction of this data is rigorously analyzed, with an even smaller fraction ever being published. One method to accelerate scientific discovery is to extract more insight from costly scientific experiments already conducted. Unfortunately, data from scientific experiments tend only to be accessible by the originator who knows the experiments and directives. Moreover, there are no robust methods to search unstructured databases of images to deduce correlations and insight. Here, we develop a machine learning approach to create image similarity projections to search unstructured image databases. To improve these projections, we develop and train a model to include symmetry-aware features. As an exemplar, we use a set of 25,133 piezoresponse force microscopy images collected on diverse materials systems over five years. We demonstrate how this tool can be used for interactive recursive image searching and exploration, highlighting structural similarities at various length scales. This tool justifies continued investment in federated scientific databases with standardized metadata schemas where the combination of filtering and recursive interactive searching can uncover synthesis-structure-property relations. We provide a customizable open-source package (https://github.com/m3-learning/Recursive_Symmetry_Aware_Materials_Microstructure_Explorer) of this interactive tool for researchers to use with their data.