A stakeholder engagement workshop was held in May 2024 as part of the Community-driven enhancement of information ecosystems for the discovery and use of paleontological specimen data project, which is funded under the United States National Science Foundation (NSF) Geosciences Open Science Ecosystem (GEO OSE) program. This report describes the activities and outcomes of the workshop.
This content will become publicly available on September 4, 2026
Community-driven enhancement of information ecosystems for the discovery and use of palaeontological specimen data: Cyberinfrastructure alignment workshop
A two-day cyberinfrastructure alignment workshop was held in May 2025 as part of the Community-driven enhancement of information ecosystems for the discovery and use of paleontological specimen data project, which is funded under the United States National Science Foundation (NSF) Geosciences Open Science Ecosystem (GEO OSE) programme. Participants with expertise in informatics, technical cyberinfrastructure development and management, and the geological and biological sciences were invited to foster a shared and deeper understanding, across this broad community, of the needs for palaeontological specimen data. This report describes the activities and outcomes of the workshop and how they will contribute to the final deliverables of the grant-funded project.
- PAR ID:
- 10647540
- Publisher / Repository:
- Research Ideas and Outcomes
- Date Published:
- Journal Name:
- Research Ideas and Outcomes
- Volume:
- 11
- ISSN:
- 2367-7163
- Subject(s) / Keyword(s):
- paleontology palaeontology fossil geology biodiversity collection natural history collection specimen cyberinfrastructure
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Over the last decade, the United States paleontological collections community has invested heavily in the digitization of specimen-based data, including over 10 million USD funded through the National Science Foundation’s Advancing Digitization of Biodiversity Collections program. Fossil specimen data—9.0 million records and counting (Global Biodiversity Information Facility 2024)—are now accessible on open science platforms such as the Global Biodiversity Information Facility (GBIF). However, the full potential of this data is far from realized due to fundamental challenges associated with mobilization, discoverability, and interoperability of paleontological information within the existing cyberinfrastructure landscape and data pipelines. Additionally, it can be difficult for individuals with varying expertise to develop a comprehensive understanding of the existing landscape due to its breadth and complexity. Here, we present preliminary results from a project aiming to explore how we might address these problems. Funding from the US National Science Foundation (NSF) to the University of Colorado Museum of Natural History, Smithsonian National Museum of Natural History, and Arizona State University will result in, among other products, an “ecosystem map” for the paleontological collections community. This map will be an information-rich visualization of entities (e.g. concepts, systems, platforms, mechanisms, drivers, tools, documentation, data, standards, people, organizations) operating in, intersecting with, or existing in parallel to our domain. We are inspired and informed by similar efforts to map the biodiversity informatics landscape (Bingham et al. 2017) and the research infrastructure landscape (Distributed System of Scientific Collections 2024), as well as by many ongoing metadata cataloging projects, e.g. re3data and the Global Registry of Scientific Collections (GRSciColl). 
Our strategy for developing this ecosystem map is to model the existing information and systems landscape by characterizing entities, e.g. potentially in a graph database as nodes with relationships to other nodes. The ecosystem map will enable us to provide guidance for communities working across different sectors of the landscape, promoting a shared understanding of the ecosystem that everyone works in together. We can also use the map to identify points of entry and engagement at various stages of the paleontological data process, and to engage diverse members within the paleontological community. We see three primary user types for this map: people new(er) to the community, people with expertise in a subset of the community, and people working to integrate initiatives and systems across communities. Each of these user types needs tailored access to the ecosystem map and its community knowledge. By promoting shared knowledge with the map, users will be able to identify their own space within the ecosystem and the connections or partnerships that they can utilize to expand their knowledge or resources, relieving the burden on any single individual to hold a comprehensive understanding. For example, the flow of taxonomic information between publications, collections, digital resources, and biodiversity aggregators is not straightforward or easy to understand. A person with expertise in collections care may want to use the ecosystem map to understand why taxonomic identifications associated with their specimen occurrence records are showing up incorrectly when published to GBIF. We envision that our final ecosystem map will visualize the flow of taxonomic information and how it is used to interpret specimen occurrence data, thereby highlighting to this user where problems may be happening and whom to ask for help in addressing them (Fig. 1).
Ultimately, development of this map will allow us to identify mobilization pathways for paleontological data, highlight core cyberinfrastructure resources, define cyberinfrastructure gaps, strategize future partnerships, promote shared knowledge, and engage a broader array of expertise in the process. Contributing domain-based evidence FAIRly requires expertise that bridges the content (e.g. paleontology) and the mechanics (e.g. informatics). By centering the role of humans in open science cyberinfrastructure throughout our process, we hope to develop systems that create and sustain such expertise.
-
The CSSI 2019 workshop was held on October 28-29, 2019, in Austin, Texas. The main objectives of this workshop were to (1) understand the impact of the CSSI program on the community over the last 9 years, (2) engage workshop participants in identifying gaps and opportunities in the current CSSI landscape, (3) gather ideas on the cyberinfrastructure needs and expectations of the community with respect to the CSSI program, and (4) prepare a report summarizing the feedback gathered from the community that can inform future solicitations of the CSSI program. The workshop participants included a diverse mix of researchers and practitioners from academia, industry, and national laboratories. The participants belonged to diverse domains such as quantum physics, computational biology, High Performance Computing (HPC), and library science. Almost 50% of the participants were from the computer science domain and roughly 50% were from non-computer science domains. According to self-reported statistics, roughly 27% of the participants were from underrepresented groups as defined by the National Science Foundation (NSF). The workshop brought together different stakeholders interested in provisioning sustainable cyberinfrastructure that can power discoveries impacting various fields of science and technology and maintain the nation's competitiveness in areas such as scientific software, HPC, networking, cybersecurity, and data/information science. The workshop served as a venue for gathering community feedback on the current state of the CSSI program and its future directions. Before they arrived at the workshop, the participants were encouraged to take an online survey on the challenges that they face in using the current cyberinfrastructure and the importance of the CSSI program in enabling cutting-edge research. The workshop included 16 brainstorming sessions of one hour each.
Additionally, the workshop program included 16 lightning talks and an extempore session. The information collected from the survey, brainstorming sessions, lightning talks, and the extempore session is summarized in this report and can potentially be useful for the NSF in formulating future CSSI solicitations. The workshop fostered an environment in which the participants were encouraged to identify gaps and opportunities in the current cyberinfrastructure landscape and to develop ideas for proposing new projects.
-
The National Science Foundation Office of Advanced Cyberinfrastructure (NSF-OAC) funded a workshop in March 2019 focused on advancing the sharing of machine-readable chemical structures and spectra. Around 40 stakeholders from the chemistry, chemical information, and software communities took part in the two-day workshop entitled “FAIR Chemical Data Publishing Guidelines for Chemical Structures and Spectra.” Major topics discussed included publishing data workflows and guidelines, FAIR criteria/metadata profiles, value propositions, a publisher implementation pilot, and community support and engagement. This report summarizes the workshop conversations, major outcomes, and target areas for further activities. Primary outcomes from the workshop include identification of key metadata elements for sharing machine-readable structures and spectra, a sample of concise author guidelines, and a publisher proposal to accept enhanced supporting information files including these data types and associated metadata alongside articles. Selected target areas for further activities include the creation of author file and metadata packaging tools to facilitate easy compilation of data, and increased training for stakeholders specifically in the generation and handling of machine-readable file formats. We conclude this report with our outlooks and highlight several related community efforts initiated after the workshop.
-
We studied 11 long-term data infrastructure projects, most of which focused on the Earth Sciences, to understand characteristics that contributed to their project sustainability. Among our sample group, we noted the existence of three different types of project groupings: Database, Framework, and Middleware. Most efforts started as federally funded research projects, and our results show that nearly all became organizations in order to become sustainable. Projects were often funded for short time scales but had the long-term burden of sustaining and supporting open science, interoperability, and community building, activities that are difficult to fund directly. This transition from ‘project’ to ‘organization’ was challenging for most efforts, especially in regard to leadership change and funding issues. Some common approaches to sustainability were identified within each project grouping. Framework and Database projects both relied heavily on the commitment to, and contribution from, a disciplinary community. Framework projects often used bottom-up governance approaches to maintain the active participation and interest of their community. Database projects succeeded when they were able to position themselves as part of the core workflow for discipline-specific scientific research. Middleware projects borrowed heavily from sustainability models used by software companies, while maintaining strong scientific partnerships. Cyberinfrastructure for science requires considerable resources to develop and sustain itself, and many of these resources are provided through in-kind support from academics, researchers, and their institutes. It is imperative that more work is done to find appropriate models that help sustain key data infrastructure for Earth Science over the long term.
