NSF PAR Search | NSF Public Access Repository

Opening doors to physical sample tracking and attribution in Earth and environmental sciences

https://doi.org/10.1038/s41597-025-05295-z

Damerow, Joan E; Raia, Natalie H; Stanley, Val; Choe, Saebyul; Borton, Mikayla A; Byers, Neil; Cassidy, Ellen R; Cholia, Shreyas; Edmunds, Rorie; Forbes, Brieanne; et al (June 2025, Scientific Data)
Opening Doors to Physical Sample Data Discovery, Integration, and Credit

https://doi.org/10.31223/X5ST2K

Damerow, Joan; Raia, Natalie; Stanley, Val; Choe, Saebyul; Borton, Mikayla; Byers, Neil; Cassidy, Ellen; Cholia, Shreyas; Edmunds, Rorie; Forbes, Brieanne; et al (June 2024, Nature Scientific Data)

Physical samples and their associated (meta)data underpin scientific discoveries across disciplines and can enable new science when appropriately archived. However, significant gaps in community practices and infrastructure currently prevent accurate provenance tracking, reproducibility, and attribution. For the vast majority of samples, descriptive metadata is sparse, inaccessible, or absent. Samples and associated (meta)data may also be scattered across numerous physical collections, data repositories, laboratories, data files, and papers, with no clear linkages or provenance tracking as new information is generated over time. The Physical Samples Curation Cluster has therefore developed ‘A Scientific Author Guide for Publishing Open Research Using Physical Samples.’ This involved synthesizing existing practices and community feedback and assessing real-world examples to identify community and infrastructure needs. We identified areas of work needed to enable authors to efficiently reference samples and related data, link related samples and data, and track their use. Our goal is to help improve the discoverability, interoperability, and use of physical samples and associated (meta)data into the future.
Full Text Available
Towards Interactive, Reproducible Analytics at Scale on HPC Systems

https://doi.org/10.1109/UrgentHPC51945.2020.00011

Cholia, Shreyas; Heagy, Lindsey; Henderson, Matthew; Paine, Drew; Hays, Jon; Bianchi, Ludovico; Ghoshal, Devarshi; Perez, Fernando; Ramakrishnan, Lavanya (November 2020, 2020 IEEE/ACM HPC for Urgent Decision Making (UrgentHPC))
The growth in scientific data volumes has resulted in a need to scale up processing and analysis pipelines using High Performance Computing (HPC) systems. These workflows need interactive, reproducible analytics at scale. The Jupyter platform provides core capabilities for interactivity but was not designed for HPC systems. In this paper, we outline our efforts to bring together core technologies based on the Jupyter platform to create interactive, reproducible analytics at scale on HPC systems. Our work is grounded in a real-world science use case: applying geophysical simulations and inversions for imaging the subsurface. Our core platform addresses three key areas of the scientific analysis workflow: reproducibility, scalability, and interactivity. We describe our implementation of a system using Binder, Science Capsule, and Dask. We demonstrate the use of this software to run our use case and interactively visualize real-time streams of HDF5 data.
Full Text Available