This content will become publicly available on January 1, 2026

Title: Deep Ocean Early-Career Researchers Dive into Data Management Training
Open access to scientific data is increasingly recognized as critical to fostering scientific progress, trustworthy and reproducible science, global information equity, and evidence-based policymaking. It requires scientists to not only share their data, but to share in such a way that the data have high utility for later users. The FAIR data principles define a set of characteristics for making data “findable, accessible, interoperable, and reusable” (Wilkinson et al., 2016). Training scientists, particularly early-career scientists, on these principles can improve the volume and quality of open science data.
Award ID(s):
2318309 2114717
PAR ID:
10599688
Author(s) / Creator(s):
; ;
Corporate Creator(s):
Publisher / Repository:
The Oceanography Society
Date Published:
Journal Name:
Oceanography
ISSN:
1042-8275
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. SciTokens introduces a capabilities-based authorization infrastructure for distributed scientific computing, to help scientists manage their security credentials more reliably and securely. SciTokens uses IETF-standard OAuth JSON Web Tokens for capability-based secure access to remote scientific data. These access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems. In this extended abstract, we present the results over the past year of our open source implementation of the SciTokens model and its deployment in the Open Science Grid, including new OAuth support added in the HTCondor 8.8 release series. 
  2. With the growing availability and accessibility of big data in ecology, we face an urgent need to train the next generation of scientists in data science practices and tools. One of the biggest barriers for implementing a data-driven curriculum in undergraduate classrooms is the lack of training and support for educators to develop their own skills and time to incorporate these principles into existing courses or develop new ones. Alongside the research goals of the National Ecological Observatory Network (NEON), providing education and training are key components for building a community of scientists and users equipped to utilize large-scale ecological and environmental data. To address this need, the NEON Data Education Fellows program formed as a collaborative Faculty Mentoring Network (FMN) between scientists from NEON and university faculty interested in using NEON data and resources in their ecology classrooms. Like other FMNs, this group has two main goals: (1) to provide tools, resources, and support for faculty interested in developing data-driven curricula, and (2) to make teaching materials that have been implemented and tested in the classroom available as open educational resources for other educators. We hosted this program using an open education and collaboration platform from the Quantitative Undergraduate Biology Education and Synthesis (QUBES) project. Here, we share lessons learned from facilitating five FMN cohorts and emphasize the successes, pitfalls, and opportunities for developing open education resources through community-driven collaborations. 
  3. Students lose interest in science as they progress from elementary to high school. There is a need for authentic, place‐based science learning experiences that can increase students' interest in science. Scientists have unique skillsets that can complement the work of educators to create exciting experiences that are grounded in pedagogy and science practices. As scientists and educators, we co‐developed a lesson plan for high school students on the Eastern Shore of Virginia, a historically underserved coastal area, that demonstrated realistic scientific practices in students' local estuaries. After implementation of the lesson plan, we observed that students had a deeper understanding of ecosystem processes compared to their peers who had not been involved, were enthusiastic about sharing their experiences, and had a more well‐rounded ability to think like a scientist than before the lesson plan. We share our experiences and five best practices that can serve as a framework for scientists and educators who are motivated to do similar work. Through collaboration, scientists and educators have the potential to bolster student science identities and increase student participation in future scientific endeavors. 
  4. The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. In this paper, we introduce SciTokens, open source software to help scientists manage their security credentials more reliably and securely. We describe the SciTokens system architecture, design, and implementation addressing use cases from the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey Telescope (LSST) projects. We also present our integration with widely-used software that supports distributed scientific computing, including HTCondor, CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for capability-based secure access to remote scientific data. The access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems. 
  5. Advances in agricultural genetic, genomic, and breeding (GGB) technologies generate increasingly large and complex datasets that need to be adequately managed and shared. While several agricultural biological databases maintain and curate GGB data, not all scientists are aware of them and how they can be used to access and share data. In addition, there is the need to increase scientists’ awareness that appropriate data archiving and curation increases data longevity and value and bolsters scientific discoveries’ reproducibility and transparency. The AgBioData Education working group aims to address these unmet needs and developed a modular curriculum for educators teaching the basics of biological databases and the findable, accessible, interoperable, and reusable (FAIR) principles to undergraduate and graduate students (https://www.agbiodata.org/). The present paper provides an overview of the topics covered within the curriculum, called ‘AgBioData Curriculum for Ag FAIR Data,’ its audience and modalities, and how it will positively impact all the different stakeholders of the agricultural database ecosystem. We hope the modular curriculum presented here can help scientists and students understand and support database use in all aspects of improving our global food system. Database URL: https://zenodo.org/records/14278084 