Title: Fitting Round Pegs into a Square Hole: Curating Heterogeneous Oceanographic Data at BCO-DMO
Oceanography is inherently an interdisciplinary science capable of producing highly complex, heterogeneous data that pose unique challenges for data management and reuse. Evolving instrumentation and new research methodologies are increasingly taxing current strategies and technologies for management and reuse of data. Data-related publisher and funder requirements are relatively new demands that researchers must learn to navigate. These are just some of the stressors that repositories experience in their role of curating and publishing FAIR marine-related data. In response, oceanographic repositories are adapting by leveraging community data standards and by engaging in the development of new technologies and the use of novel tools to improve data discovery and interoperability. Additionally, they are collaborating with data-related stakeholders to help shape data-related policy, and filling an educational role to promote good data hygiene and raise awareness of concepts like FAIR in the oceanographic research community. This presentation will highlight some of the activities of the BCO-DMO repository that are aimed at advancing the availability and reuse of open oceanographic data.
Award ID(s):
1924618
PAR ID:
10539929
Author(s) / Creator(s):
Publisher / Repository:
MBL-WHOI Library
Date Published:
Edition / Version:
1
Subject(s) / Keyword(s):
Data curation, heterogeneous data, domain curation
Format(s):
Medium: X Size: 12.3MB Other: .pdf
Size(s):
12.3MB
Sponsoring Org:
National Science Foundation
More Like this
  1. The Non-Clinical Tomography Users Research Network (NoCTURN) was established in 2022 to advance Findability, Accessibility, Interoperability, and Reuse (FAIR) and Open Science (OS) practices in the computed tomographic (CT) imaging community. CT specialists utilize a shared pipeline to create digital representations of real-world objects for research, education, and outreach, and we face a shared set of challenges and limitations imposed by siloing of current workflows, best practices, and expertise. Mirroring the U.S. National Science Foundation’s “10 Big Ideas” of Convergence Research (2016), and in consideration of the White House Office of Science and Technology Policy's Nelson Memorandum (2020), NoCTURN is leveraging input from a broad community of more than 100 CT educators, researchers, curators, and industry stakeholders to propose improvements to data handling, management, and sharing that cut across scientific disciplines and extend beyond. Our primary goal is to develop practical recommendations and tools that link today's CT data to tomorrow's CT discoveries. NoCTURN is working toward this goal by providing a platform to: 1) engage the international scientific CT community via participant recruitment from imaging facilities, academic departments and museums, and data repositories across the globe; 2) stimulate improvements for CT imaging and data management standards that focus on FAIR and OS principles; and 3) work directly with private companies that manufacture the hardware and software used in CT imaging, visualization, and analysis to find common ground in documentation and interoperability that better reflects the OS standards championed by federal funding agencies. The planned deliverables from this three-year grant include a ‘Rosetta Stone’ for CT terminology, an interactive world map of CT facilities, a guide to CT repositories, and ‘Good, Better, Best’ guidelines for metadata and long-term data management. 
We aim to reduce the barriers to entry that isolate individuals and research labs, and we anticipate that developing community standards and improving methodological reporting will enable long-term, systemic changes necessary to aid those at all levels of experience in furthering their access to and use of CT imaging. 
  2. There is great value embedded in reusing scientific data for secondary discoveries. However, it is challenging to find and reuse the large amount of existing scientific data distributed across the web and data repositories. Some of the challenges reside in the volume and complexity of scientific data; others pertain to the current practices and workflows of research data management. AIDR 2019 (Artificial Intelligence for Data Discovery and Reuse) is a new conference that brings together researchers across a broad range of disciplines, computer scientists, tool developers, data providers, and data curators to share innovative solutions that apply artificial intelligence to scientific data discovery and reuse, and to discuss how various stakeholders can work together to create a healthy data ecosystem. This editorial summarizes the main themes and takeaways from the inaugural AIDR '19 conference.
  3. Over the last decade, significant changes have affected the work that data repositories of all kinds do. First, the emergence of globally unique and persistent identifiers (PIDs) has created new opportunities for repositories to engage with the global research community by connecting existing repository resources to the global research infrastructure. Second, repository use cases have evolved from data discovery to data discovery and reuse, significantly increasing metadata requirements. To respond to these evolving requirements, we need retrospective and on-going curation, i.e. re-curation, processes that 1) find identifiers and add them to existing metadata to connect datasets to a wider range of communities, and 2) add elements that support reuse to globally connected metadata. The goal of this work is to introduce the concept of re-curation with representative examples that are generally applicable to many repositories: 1) increasing completeness of affiliations and identifiers for organizations and funders in the Dryad Repository and 2) measuring and increasing FAIRness of DataCite metadata beyond required fields for institutional repositories. These re-curation efforts are a critical part of reshaping existing metadata and repository processes so they can take advantage of new connections, engage with global research communities, and facilitate data reuse.
  4. Biodiversity science is in a pivotal period when diverse groups of actors – including researchers, businesses, national governments, and Indigenous Peoples – are negotiating wide-ranging norms for governing and managing biodiversity data in digital repositories. The management of these repositories, often called biodiversity data portals, can serve either to redress or to perpetuate the colonial history of biodiversity science and current inequities. Both researchers and Indigenous Peoples are implementing new strategies to influence whom biodiversity data portals recognise as salient participants in data management and use. Two notable efforts are the FAIR (Findable, Accessible, Interoperable, Reusable) and CARE (Collective benefit, Authority, Responsibility, Ethics) Data Principles. Actors use these principles to influence the governance of biodiversity data portals. ‘Fit-for-use’ data is a social status provided by groups of actors who approve whether the data meets specific purposes. Advocates for the FAIR and CARE Principles use them in a similar way to institutionalise the authority of different groups of actors. However, the FAIR Principles prioritise the ability of machine agents to understand the meanings of data, while the CARE Principles prioritise Indigenous Peoples and their data sovereignty. Together, FAIR and CARE illustrate a broader emerging strategy for institutionalising international norms for digital repositories about who they should recognise as having a formal role in determinations of the fitness-for-use of data. 
  5. Introduced in 2016, the FAIR Guiding Principles endeavour to significantly improve the process of today's data‐driven research. The Principles present a concise set of fundamental concepts that can facilitate the findability, accessibility, interoperability and reuse (FAIR) of digital research objects by both machines and human beings. The emergence of FAIR has initiated a flurry of activity within the broader data publication community, yet the principles are still not fully understood by many community stakeholders. This has led to challenges such as misinterpretation and co‐opted use, along with persistent gaps in current data publication culture, practices and infrastructure that need to be addressed to achieve a FAIR data end‐state. This paper presents an overview of the practices and perspectives related to the FAIR Principles within the Geosciences and offers discussion on the value of the principles in the larger context of what they are trying to achieve. The authors of this article recommend using the principles as a tool to bring awareness to the types of actions that can improve the practice of data publication to meet the needs of all data consumers. FAIR Guiding Principles should be interpreted as an aspirational guide to focus behaviours that lead towards a more FAIR data environment. The intentional discussions and incremental changes that bring us closer to these aspirations provide the best value to our community as we build the capacity that will support and facilitate new discovery of earth systems.