NSF PAR Search | NSF Public Access Repository


An Information Ecosystem Map of Resources Supporting the Mobilization and Discovery of Paleontological Specimen Data

https://doi.org/10.3897/biss.8.139040

Little, Holly; Karim, Talia; Krimmel, Erica; Walker, Lindsay (October 2024, Biodiversity Information Science and Standards)

Over the last decade, the United States paleontological collections community has invested heavily in the digitization of specimen-based data, including over 10 million USD funded through the National Science Foundation’s Advancing Digitization of Biodiversity Collections program. Fossil specimen data—9.0 million records and counting (Global Biodiversity Information Facility 2024)—are now accessible on open science platforms such as the Global Biodiversity Information Facility (GBIF). However, the full potential of this data is far from realized due to fundamental challenges associated with mobilization, discoverability, and interoperability of paleontological information within the existing cyberinfrastructure landscape and data pipelines. Additionally, it can be difficult for individuals with varying expertise to develop a comprehensive understanding of the existing landscape due to its breadth and complexity. Here, we present preliminary results from a project aiming to explore how we might address these problems. Funding from the US National Science Foundation (NSF) to the University of Colorado Museum of Natural History, Smithsonian National Museum of Natural History, and Arizona State University will result in, among other products, an “ecosystem map” for the paleontological collections community. This map will be an information-rich visualization of entities (e.g. concepts, systems, platforms, mechanisms, drivers, tools, documentation, data, standards, people, organizations) operating in, intersecting with, or existing in parallel to our domain. We are inspired and informed by similar efforts to map the biodiversity informatics landscape (Bingham et al. 2017) and the research infrastructure landscape (Distributed System of Scientific Collections 2024), as well as by many ongoing metadata cataloging projects, e.g. re3data and the Global Registry of Scientific Collections (GRSciColl). 
Our strategy for developing this ecosystem map is to model the existing information and systems landscape by characterizing entities, e.g. potentially in a graph database as nodes with relationships to other nodes. The ecosystem map will enable us to provide guidance for communities working across different sectors of the landscape, promoting a shared understanding of the ecosystem that everyone works in together. We can also use the map to identify points of entry and engagement at various stages of the paleontological data process, and to engage diverse members within the paleontological community. We see three primary user types for this map: people new(er) to the community, people with expertise in a subset of the community, and people working to integrate initiatives and systems across communities. Each of these user types needs tailored access to the ecosystem map and its community knowledge. By promoting shared knowledge with the map, users will be able to identify their own space within the ecosystem and the connections or partnerships that they can utilize to expand their knowledge or resources, relieving the burden on any single individual to hold a comprehensive understanding. For example, the flow of taxonomic information between publications, collections, digital resources, and biodiversity aggregators is not straightforward or easy to understand. A person with expertise in collections care may want to use the ecosystem map to understand why taxonomic identifications associated with their specimen occurrence records are showing up incorrectly when published to GBIF. We envision that our final ecosystem map will visualize the flow of taxonomic information and how it is used to interpret specimen occurrence data, thereby highlighting to this user where problems may be happening and whom to ask for help in addressing them (Fig. 1).
Ultimately, development of this map will allow us to identify mobilization pathways for paleontological data, highlight core cyberinfrastructure resources, define cyberinfrastructure gaps, strategize future partnerships, promote shared knowledge, and engage a broader array of expertise in the process. Contributing domain-based evidence FAIRly requires expertise that bridges the content (e.g. paleontology) and the mechanics (e.g. informatics). By centering the role of humans in open science cyberinfrastructure throughout our process, we hope to develop systems that create and sustain such expertise.
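The abstract above describes modeling entities as nodes with relationships to other nodes, and using the map to follow the flow of taxonomic information. A minimal sketch of that idea follows, using a plain in-memory adjacency map rather than a graph database. The entity names, relationship labels, and the functions `build_graph` and `trace_flow` are all hypothetical and chosen only for illustration; none of them come from the project itself.

```python
from collections import deque

# Hypothetical entities and relationships, for illustration only;
# they are NOT taken from the project's actual ecosystem map.
EDGES = [
    ("taxonomic publication", "informs", "collection database"),
    ("collection database", "publishes to", "aggregator"),
    ("taxonomic backbone", "interprets names for", "aggregator"),
]


def build_graph(edges):
    """Return an adjacency map: node -> list of (relationship, neighbor)."""
    graph = {}
    for source, relation, target in edges:
        graph.setdefault(source, []).append((relation, target))
        graph.setdefault(target, [])  # make sure sink nodes are present too
    return graph


def trace_flow(graph, start, goal):
    """Breadth-first search for a shortest directed path from start to goal.

    Returns a list of (source, relationship, target) steps,
    or None when either node is unknown or no path exists.
    """
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


graph = build_graph(EDGES)
for step in trace_flow(graph, "taxonomic publication", "aggregator"):
    print(" -> ".join(step))
```

Under these assumptions, a user puzzled by how a name reached an aggregator could list each hop along the path and see which system handles the data at that point. A real map would likely carry richer node and edge attributes than this sketch does.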
Full Text Available
Community-driven enhancement of information ecosystems for the discovery and use of paleontological specimen data: Stakeholder engagement workshop

https://doi.org/10.3897/rio.10.e134840

Karim, Talia; Krimmel, Erica; Little, Holly; Walker, Lindsay (August 2024, Research Ideas and Outcomes)

A stakeholder engagement workshop was held in May 2024 as part of the Community-driven enhancement of information ecosystems for the discovery and use of paleontological specimen data project, which is funded under the United States National Science Foundation (NSF) Geosciences Open Science Ecosystem (GEO OSE) program. This report describes the activities and outcomes of the workshop.
Full Text Available
Finding biotic anomalies described in specimen label text is a challenge that artificial intelligence can address

https://doi.org/10.12685/bauhinia.1374

Mast, Austin; Tian, Shubo; He, Zhe; Krimmel, Erica; Pichardo-Marcano, Fritz; Buckley, Mikayla; Gomez, Sophia; Hennessey, Ashley; Horn, Allyson; Howell, Olivia (February 2024, BAUHINIA – Zeitschrift der Basler Botanischen Gesellschaft)

Biodiversity specimen collectors are on the front lines of observing biotic anomalies, some of which herald early stages of significant changes (e.g., the arrival of a new disease; Pearson and Mast 2019). Online data sharing has opened new possibilities for the discovery of anomaly descriptions on collectors’ labels, but it remains a challenge to find these needles in the haystack of many millions of specimen records available at aggregators like iDigBio and Global Biodiversity Information Facility. In a recent community survey, over 200 collectors identified 170 unique words and phrases (e.g., atypical) that they would use to describe six types of anomaly (Pearson and Mast 2019). Left unanswered was the relative efficiency with which anomaly descriptions can be found using the simple presence of these words. Here, we address that question with a focus on one type of anomaly (phenological; related to the timing of life history events) and ask a second question: can we further improve the efficiency of anomaly description discovery by engaging artificial intelligence (AI)?
Full Text Available
Collections Education: The Extended Specimen and Data Acumen

https://doi.org/10.1093/biosci/biab109

Monfils, Anna K; Krimmel, Erica R; Linton, Debra L; Marsico, Travis D; Morris, Ashley B; Ruhfel, Brad R (October 2021, BioScience)

Biodiversity scientists must be fluent across disciplines; they must possess the quantitative, computational, and data skills necessary for working with large, complex data sets, and they must have foundational skills and content knowledge from ecology, evolution, taxonomy, and systematics. To effectively train the emerging workforce, we must teach science as we conduct science and embrace emerging concepts of data acumen alongside the knowledge, tools, and techniques foundational to organismal biology. We present an open education resource that updates the traditional plant collection exercise to incorporate best practices in twenty-first century collecting and to contextualize the activities that build data acumen. Students exposed to this resource gained skills and content knowledge in plant taxonomy and systematics, as well as a nuanced understanding of collections-based data resources. We discuss the importance of the extended specimen in fostering scientific discovery and reinforcing foundational concepts in biodiversity science, taxonomy, and systematics.
Full Text Available
Rapid Creation of a Data Product for the World's Specimens of Horseshoe Bats and Relatives, a Known Reservoir for Coronaviruses.

https://doi.org/10.3897/biss.4.59067

Krimmel, Erica; Mast, Austin; Paul, Deborah; Bruhn, Robert; Rios, Nelson; Shorthouse, David (October 2020, Biodiversity Information Science and Standards)
Full Text Available
Biodiversity Science and the Twenty-First Century Workforce

https://doi.org/10.1093/biosci/biz147

Ellwood, Elizabeth R; Sessa, Jocelyn Anne; Abraham, Joel K; Budden, Amber E; Douglas, Natalie; Guralnick, Robert; Krimmel, Erica; Langen, Tom; Linton, Debra; Phillips, Molly; et al (December 2019, BioScience)

Full Text Available