This perspective article presents the vision of combining findable, accessible, interoperable, and reusable (FAIR) Digital Objects with the National Science Data Fabric (NSDF) to enhance data accessibility, scientific discovery, and education. Integrating FAIR Digital Objects into the NSDF overcomes data access barriers and facilitates the extraction of machine-actionable metadata in alignment with FAIR principles. The article discusses examples of climate simulations and materials science workflows and establishes the groundwork for a dataflow design that prioritizes inclusivity, web-centricity, and a network-first approach to democratize data access and create opportunities for research and collaboration in the scientific community.
Centralized project-specific metadata platforms: toolkit provides new perspectives on open data management within multi-institution and multidisciplinary research projects
Abstract: Open science and open data within scholarly research programs are growing both in popularity and by requirement from grant funding agencies and journal publishers. A central component of open data management, especially on collaborative, multidisciplinary, and multi-institutional science projects, is documentation of complete and accurate metadata, workflow, and source code, in addition to access to raw data and data products, to uphold FAIR (Findable, Accessible, Interoperable, Reusable) principles. Although best practice in data/metadata management is to use established, internationally accepted metadata schemata, many of these standards are discipline-specific, making it difficult to catalog multidisciplinary data and data products in a way that is easily findable and accessible. Consequently, scattered and incompatible metadata records create a barrier to scientific innovation, as researchers are burdened with finding and linking multidisciplinary datasets. One possible solution to increase data findability, accessibility, interoperability, reproducibility, and integrity within multi-institutional and interdisciplinary projects is a centralized and integrated data management platform. Overall, this type of interoperable framework supports reproducible open science and its dissemination to various stakeholders and the public in a FAIR manner by providing direct access to raw data and linking protocols, metadata, and supporting workflow materials.
- Award ID(s): 1757324
- PAR ID: 10363872
- Publisher / Repository: Springer Science + Business Media
- Date Published:
- Journal Name: BMC Research Notes
- Volume: 15
- Issue: 1
- ISSN: 1756-0500
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract: Reference anatomies of the brain ('templates') and corresponding atlases are the foundation for reporting standardized neuroimaging results. Currently, there is no registry of templates and atlases; therefore, the redistribution of these resources occurs either bundled within existing software or in ad hoc ways such as downloads from institutional sites and general-purpose data repositories. We introduce TemplateFlow as a publicly available framework for human and non-human brain models. The framework combines an open database with software for access, management, and vetting, allowing scientists to share their resources under FAIR (findable, accessible, interoperable, and reusable) principles. TemplateFlow enables multifaceted insights into brains across species and supports multiverse analyses testing whether results generalize across standard references, scales, and, in the long term, species.
-
Open access to scientific data is increasingly recognized as critical to fostering scientific progress, trustworthy and reproducible science, global information equity, and evidence-based policymaking. It requires scientists not only to share their data but to share them in a way that gives the data high utility for later users. The FAIR data principles define a set of characteristics for making data "findable, accessible, interoperable, and reusable" (Wilkinson et al., 2016). Training scientists, particularly early-career scientists, on these principles can improve the volume and quality of open science data.
-
Direct observations of the oceans acquired on oceanographic research ships operated across the international community support fundamental research into the many disciplines of ocean science and provide essential information for monitoring the health of the oceans. A comprehensive knowledge base is needed to support the responsible stewardship of the oceans with easy access to all data acquired globally. In the United States, the multidisciplinary shipboard sensor data routinely acquired each year on the fleet of coastal, regional and global ranging vessels supporting academic marine research are managed by the Rolling Deck to Repository (R2R, rvdata.us) program. With over a decade of operations, the R2R program has developed a robust routinized system to transform diverse data contributions from different marine data providers into a standardized and comprehensive collection of global-ranging observations of marine atmosphere, ocean, seafloor and subseafloor properties that is openly available to the international research community. In this article we describe the elements and framework of the R2R program and the services provided. To manage all expeditions conducted annually, a fleet-wide approach has been developed using data distributions submitted from marine operators with a data management workflow designed to maximize automation of data curation. Other design goals are to improve the completeness and consistency of the data and metadata archived, to support data citability, provenance tracking and interoperable data access aligned with FAIR (findable, accessible, interoperable, reusable) recommendations, and to facilitate delivery of data from the fleet for global data syntheses. Findings from a collection-level review of changes in data acquisition practices and quality over the past decade are presented. 
Lessons learned from R2R operations are also discussed, including the benefits of designing data curation around the routine practices of data providers, approaches for ensuring preservation of a more complete data collection with a high level of FAIRness, and the opportunities for homogenization of datasets from the fleet so that they can support the broadest re-use of data across a diverse user community.
-
Incomplete and inconsistent connections between institutional repository holdings and the global data infrastructure inhibit research data discovery and reusability. Preventing metadata loss on the path from institutional repositories to the global research infrastructure can substantially improve research data reusability. The Realities of Academic Data Sharing (RADS) Initiative, funded by the National Science Foundation, is investigating institutional processes for improving research data FAIRness. Focal points of the RADS inquiry are to understand where researchers are sharing their data and to assess metadata quality, i.e., completeness, at six Data Curation Network (DCN) academic institutions: Cornell University, Duke University, University of Michigan, University of Minnesota, Washington University in St. Louis, and Virginia Tech. RADS is examining where researchers are storing their data, considering local institutional repositories and other popular repositories, and analyzing the completeness of the research data metadata stored in these institutional and other repositories. Metadata FAIRness (Findable, Accessible, Interoperable, Reusable) is used as the metric to assess metadata quality as FAIR complete. Research findings show significant content loss when metadata from local institutional repositories are compared to metadata found in DataCite. After examining the factors contributing to this metadata loss, RADS investigators are developing a set of recommended best practices for institutions to increase the quality of their scholarly metadata. Further, documentation such as README files is of particular importance not only for data reuse but also as a source of valuable metadata such as Persistent Identifiers (PIDs). DOIs and related PIDs such as ORCID and ROR are still rarely used in institutional repositories.
More frequent use would have a positive effect on discoverability, interoperability and reusability, especially when transferring to global infrastructure.