Abstract The Deep Ocean Observing Strategy (DOOS) is an international, community-driven initiative that facilitates collaboration across disciplines and fields, elevates a diverse cohort of early career researchers into future leaders, and connects scientific advancements to societal needs. DOOS represents a global network of deep-ocean observing, mapping, and modeling experts, focusing community efforts in the support of strong science, policy, and planning for sustainable oceans. Its initiatives work to propose deep-sea Essential Ocean Variables; assess technology development; develop shared best practices, standards, and cross-calibration procedures; and transfer knowledge to policy makers and deep-ocean stakeholders. Several of these efforts align with the vision of the UN Ocean Decade to generate the science we need to create the deep ocean we want. DOOS works toward (1) a healthy and resilient deep ocean by informing science-based conservation actions, including optimizing data delivery, creating habitat and ecological maps of critical areas, and developing regional demonstration projects; (2) a predicted deep ocean by strengthening collaborations within the modeling community, determining needs for interdisciplinary modeling and observing system assessment in the deep ocean; (3) an accessible deep ocean by enhancing open access to innovative low-cost sensors and open-source plans, making deep-ocean data Findable, Accessible, Interoperable, and Reusable, and focusing on capacity development in developing countries; and finally (4) an inspiring and engaging deep ocean by translating science to stakeholders/end users and informing policy and management decisions, including in international waters.
Towards an open-source model for data and metadata standards
Progress in machine learning and artificial intelligence promises to advance research and understanding across a wide range of fields and activities. In tandem, increased awareness of the importance of open data for reproducibility and scientific transparency is making inroads in fields that have not traditionally produced large publicly available datasets. Data sharing requirements from publishers and funders, as well as from other stakeholders, have also created pressure to make datasets with research and/or public interest value available through digital repositories. However, to make the best use of existing data, and facilitate the creation of useful future datasets, robust, interoperable and usable standards need to evolve and adapt over time. The open-source development model provides significant potential benefits to the process of standard creation and adaptation. In particular, data and metadata standards can use long-standing technical and socio-technical processes that have been key to managing the development of software, and which allow incorporating broad community input into the formulation of these standards. On the other hand, open-source models carry unique risks that need to be considered. This report surveys existing open-source standards development, addressing these benefits and risks. It outlines recommendations for standards developers, funders and other stakeholders on the path to robust, interoperable and usable open-source data and metadata standards.
- Award ID(s): 2334483
- PAR ID: 10550347
- Publisher / Repository: Open Science Framework
- Date Published:
- Format(s): Medium: X
- Institution: University of Washington
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Open science and open data within scholarly research programs are growing both in popularity and by requirement from grant funding agencies and journal publishers. A central component of open data management, especially on collaborative, multidisciplinary, and multi-institutional science projects, is documentation of complete and accurate metadata, workflow, and source code in addition to access to raw data and data products to uphold FAIR (Findable, Accessible, Interoperable, Reusable) principles. Although best practice in data/metadata management is to use established internationally accepted metadata schemata, many of these standards are discipline-specific, making it difficult to catalog multidisciplinary data and data products in a way that is easily findable and accessible. Consequently, scattered and incompatible metadata records create a barrier to scientific innovation, as researchers bear the burden of finding and linking multidisciplinary datasets. One possible solution to increase data findability, accessibility, interoperability, reproducibility, and integrity within multi-institutional and interdisciplinary projects is a centralized and integrated data management platform. Overall, this type of interoperable framework supports reproducible open science and its dissemination to various stakeholders and the public in a FAIR manner by providing direct access to raw data and linking protocols, metadata and supporting workflow materials.
-
Abstract Persistent identifiers for research objects, researchers, organizations, and funders are the key to creating unambiguous and persistent connections across the global research infrastructure (GRI). Many repositories are implementing mechanisms to collect and integrate these identifiers into their submission and record curation processes. This bodes well for a well-connected future, but metadata for existing resources submitted in the past are missing these identifiers, thus missing the connections required for inclusion in the connected infrastructure. Re-curation of these metadata is required to make these connections. This paper introduces the global research infrastructure and demonstrates how repositories, and their user communities, can contribute to and benefit from connections to the global research infrastructure. The Dryad Data Repository has existed since 2008 and has successfully re-curated the repository metadata several times, adding identifiers for research organizations, funders, and researchers. Understanding and quantifying these successes depends on measuring repository and identifier connectivity. Metrics are described and applied to the entire repository here. Identifiers (Digital Object Identifiers, DOIs) for papers connected to datasets in Dryad have long been a critical part of the Dryad metadata creation and curation processes. Since 2019, the portion of datasets with connected papers has decreased from 100% to less than 40%. This decrease has significant ramifications for the re-curation efforts described above as connected papers have been an important source of metadata. In addition, missing connections to papers make understanding and re-using datasets more difficult. Connections between datasets and papers can be difficult to make because of time lags between submission and publication, lack of clear mechanisms for citing datasets and other research objects from papers, changing focus of researchers, and other obstacles. 
The Dryad community of members, i.e., users, research institutions, publishers, and funders, have vested interests in identifying these connections and critical roles in the curation and re-curation efforts. Their engagement will be critical in building on the successes Dryad has already achieved and ensuring sustainable connectivity in the future.
-
Abstract There is a concerted effort to ensure data used in scientific research are made available following “FAIR” standards: Findable, Accessible, Interoperable, and Reusable. With limits in Antarctic field work capacity due to the pandemic and budget pressures, a data repository to house current and future Antarctic meteorological research is crucial. Having broad access to a rich modern Antarctic meteorological data archive of past, present, and ongoing datasets offers new possibilities rather than lost opportunities. To meet the need for increased data access, the National Science Foundation (NSF) has funded the development of a meteorological discipline-based data repository: the Antarctic Meteorological Research and Data Center (AMRDC) Data Repository (ADR). The ADR aims to serve a variety of groups including but not limited to the research, education, and planning communities. The ADR provides metadata and issues digital object identifiers (DOIs) for each dataset in the repository. This is an important feature that gives authors the proper citations required for any data used in their peer-reviewed manuscripts. An element that is unique to the ADR is that it also offers links to other Antarctic meteorological data found in other data repositories, including those of other nations. The ADR accepts datasets from the community to be placed into the repository, allowing principal investigators to meet grant data curation and management requirements such as those outlined by the NSF. Significance Statement: In an era of limited field work opportunities and limited funding, science investigations require access to existing datasets to advance weather and climate research. A new repository for Antarctic weather and climate datasets provides the community with a critical resource to meet those needs following common practices and standards.
-
Abstract. There is a continuously increasing need for reliable feature detection and tracking tools based on objective analysis principles for use with meteorological data. Many tools have been developed over the previous 2 decades that attempt to address this need but most have limitations on the type of data they can be used with, feature computational and/or memory expenses that make them unwieldy with larger datasets, or require some form of data reduction prior to use that limits the tool's utility. The Tracking and Object-Based Analysis of Clouds (tobac) Python package is a modular, open-source tool that improves on the overall generality and utility of past tools. A number of scientific improvements (three spatial dimensions, splits and mergers of features, an internal spectral filtering tool) and procedural enhancements (increased computational efficiency, internal regridding of data, and treatments for periodic boundary conditions) have been included in tobac as a part of the tobac v1.5 update. These improvements have made tobac one of the most robust, powerful, and flexible identification and tracking tools in our field to date and expand its potential use in other fields. Future plans for tobac v2 are also discussed.

