Title: FAIR Data and Services in Biodiversity Science and Geoscience
We examine the intersection of the FAIR principles (Findable, Accessible, Interoperable and Reusable), the challenges and opportunities presented by the aggregation of widely distributed and heterogeneous data about biological and geological specimens, and the use of the Digital Object Architecture (DOA) data model and components as an approach to solving those challenges that offers adherence to the FAIR principles as an integral characteristic. This approach will be prototyped in the Distributed System of Scientific Collections (DiSSCo) project, the pan-European Research Infrastructure which aims to unify over 110 natural science collections across 21 countries. We take each of the FAIR principles, discuss them as requirements in the creation of a seamless virtual collection of bio/geo specimen data, and map those requirements to Digital Object components and facilities such as persistent identification, extended data typing, and the use of an additional level of abstraction to normalize existing heterogeneous data structures. The FAIR principles inform and motivate the work and the DO Architecture provides the technical vision to create the seamless virtual collection vitally needed to address scientific questions of societal importance.
Award ID(s):
1839013
PAR ID:
10198095
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Data Intelligence
Volume:
2
Issue:
1-2
ISSN:
2641-435X
Page Range / eLocation ID:
122 to 130
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Garoufallou, E. (Ed.)
    Flexible metadata pipelines are crucial for supporting the FAIR data principles. Despite this need, researchers seldom report their approaches for identifying metadata standards and protocols that support optimal flexibility. This paper reports on an initiative targeting the development of a flexible metadata pipeline for a collection containing over 300,000 digital fish specimen images, harvested from multiple data repositories and fish collections. The images and their associated metadata are being used for AI-related scientific research involving automated species identification, segmentation and trait extraction. The paper provides contextual background, followed by the presentation of a four-phased approach involving: 1. Assessment of the Problem, 2. Investigation of Solutions, 3. Implementation, and 4. Refinement. The work is part of the NSF Harnessing the Data Revolution, Biology Guided Neural Networks (NSF/HDR-BGNN) project and the HDR Imageomics Institute. An RDF graph prototype pipeline is presented, followed by a discussion of research implications and conclusion summarizing the results.
  2. Goldfarb, Keith (Ed.)
    Natural history collections are important depositories of biodiversity data. Digital photography of natural history collection specimens and subsequent dissemination of the resulting images on the web allow for the virtual discovery of these specimens, enhancing their accessibility to the target audience and the public in general. This presentation discusses digital photography of marine mollusks in collections, including some of the latest techniques for imaging of very small specimens, photography of specimens preserved in liquid, haptobionts, problems of color retention, transparency, 3-D photography, equipment, and other current areas of interest. Despite the focus on mollusks, the discussions can be extrapolated as generalities applicable to invertebrates from other phyla. The presentation also includes a discussion on equipment and the ideal digital parameters for imaging of natural history collection specimens, including image policies on acceptable file-format requirements for data hosts and aggregators such as iDigBio and others. (The presentation includes work funded in part by the NSF Thematic Collections Network grant award 2001528 “Mobilizing Millions of Mollusks from the Eastern Seaboard”). 
  3. Scientific workflow management systems (WfMS) provide a systematic way to streamline necessary processes in scientific research. The demand for FAIR (Findable, Accessible, Interoperable, and Reusable) workflows is increasing in the scientific community, particularly in GIScience, where data is not just an output but an integral part of iterative advanced processes. Traditional WfMS often lack the capability to ensure geospatial data and process transparency, leading to challenges in reproducibility and replicability of research findings. This paper proposes the conceptualization and development of FAIR-oriented GIScience WfMS, aiming to incorporate the FAIR principles into the entire lifecycle of geospatial data processing and analysis. To enhance the findability and accessibility of workflows, the WfMS utilizes Harvard Dataverse to share all workflow-related digital resources, organized into workflow datasets, nodes, and case studies. Each resource is assigned a unique DOI (Digital Object Identifier), ensuring easy access and discovery. More importantly, the WfMS complies with the Common Workflow Language (CWL) standard to guarantee interoperability and reproducibility of workflows. It also enables the integration of diverse tools and software, supporting complex analyses that require multiple processing steps. This paper demonstrates the prototype of the GIScience WfMS and illustrates two geospatial science case studies, reflecting its flexibility in selecting appropriate techniques for various datasets and research goals. The user-friendly workflow designer makes it accessible to users with different levels of technical expertise, promoting reusable, reproducible, and replicable GIScience studies. 
  4. This perspective article presents the vision of combining findable, accessible, interoperable, and reusable (FAIR) Digital Objects with the National Science Data Fabric (NSDF) to enhance data accessibility, scientific discovery, and education. Integrating FAIR Digital Objects into the NSDF overcomes data access barriers and facilitates the extraction of machine-actionable metadata in alignment with FAIR principles. The article discusses examples of climate simulations and materials science workflows and establishes the groundwork for a dataflow design that prioritizes inclusivity, web-centricity, and a network-first approach to democratize data access and create opportunities for research and collaboration in the scientific community. 
  5. The relationship between people, place, and data presents challenges and opportunities for science and society. While there has been general enthusiasm for and work toward Findable, Accessible, Interoperable, and Reusable (FAIR) data for open science, only more recently have these data-centric principles been extended into dimensions important to people and place—notably, the CARE Principles for Indigenous Data Governance, which affect collective benefit, authority to control, responsibility, and ethics. The FAIR Island project seeks to translate these ideals into practice, leveraging the institutional infrastructure provided by scientific field stations. Starting with field stations in French Polynesia as key use cases that are exceptionally well connected to international research networks, FAIR Island builds interoperability between different components of critical research infrastructure, helping connect these to societal benefit areas. The goal is not only to increase reuse of scientific data and the awareness of work happening at the field stations but more generally to accelerate place-based research for sustainable development. FAIR Island works reflexively, aiming to scale horizontally through networks of field stations and to serve as a model for other sites of intensive long-term scientific study.