This content will become publicly available on December 1, 2025

Title: Persistent Identifiers for Instruments and Facilities: Current State, Challenges, and Opportunities
Objective: Persistent Identifiers (PIDs) are central to the vision of open science described in the FAIR Principles. However, the use of PIDs for scientific instruments and facilities is decentralized and fragmented. This project aims to develop community-based standards, guidelines, and best practices for how and why PIDs can be assigned to facilities and instruments. Methods: We hosted several online and in-person focus groups and discussions, culminating in a two-day in-person workshop featuring stakeholders from a variety of organizations and disciplines, such as instrument and facilities operators, PID infrastructure providers, researchers who use instruments and facilities, journal publishers, university administrators, federal funding agencies, and information and data professionals. Results: Our first-year efforts resulted in four main areas of interest: developing a better understanding of the current PID ecosystem; clarifying how and when PIDs could be assigned to scientific instruments and facilities; challenges and barriers involved with assigning PIDs; and incentives for researchers, facility managers, and other stakeholders to encourage the use of PIDs. Conclusions: The potential for PIDs to facilitate the discovery, connection, and attribution of research instruments and facilities indicates an obvious value in their use. The lack of standards for how and when they are created, assigned, updated, and used is a major barrier to their widespread use. Data and information professionals can work to create relationships with stakeholders, provide relevant education and outreach activities, and integrate PIDs for instruments and facilities into their data curation and publication workflows.
Award ID(s):
2226396 2226397 2226398
PAR ID:
10638801
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  ;  
Publisher / Repository:
Lamar Soutter Library, UMass Chan Medical School
Date Published:
Journal Name:
Journal of eScience Librarianship
Volume:
13
Issue:
3
ISSN:
2161-3974
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The growing prevalence of data-rich networked information technologies (such as social media platforms, smartphones, wearable devices, and the internet of things) brings an increase in the flow of rich, deep, and often identifiable personal information available for researchers. More than just “big data,” these datasets reflect people’s lives and activities, bridge multiple dimensions of a person’s life, and are often collected, aggregated, exchanged, and mined without their knowledge. We call this data “pervasive data,” and the increased scale, scope, speed, and depth of pervasive data available to researchers require that we confront the ethical frameworks that guide such research activities. Multiple stakeholders are embroiled in the challenges of research ethics in pervasive data research: researchers struggle with questions of privacy and consent, user communities may not even be aware of the widespread harvesting of their data for scientific study, platforms are increasingly restricting researchers’ access to data over fears of privacy and security, and ethical review boards face increasing difficulties in properly considering the complexities of research protocols relying on user data collected online. The results presented in this paper expand our understanding of how ethical review board members think about pervasive data research. They provide insights into how IRB professionals make decisions about the use of pervasive data in cases not obviously covered by traditional research ethics guidelines, and point to challenges for IRBs when reviewing research protocols relying on pervasive data.
  2. The Non-Clinical Tomography Users Research Network (NoCTURN) was established in 2022 to advance Findability, Accessibility, Interoperability, and Reuse (FAIR) and Open Science (OS) practices in the computed tomographic (CT) imaging community. CT specialists utilize a shared pipeline to create digital representations of real-world objects for research, education, and outreach, and we face a shared set of challenges and limitations imposed by siloing of current workflows, best practices, and expertise. Mirroring the U.S. National Science Foundation’s “10 Big Ideas” of Convergence Research (2016), and in consideration of the White House Office of Science and Technology Policy's Nelson Memorandum (2022), NoCTURN is leveraging input from a broad community of more than 100 CT educators, researchers, curators, and industry stakeholders to propose improvements to data handling, management, and sharing that cut across scientific disciplines and extend beyond. Our primary goal is to develop practical recommendations and tools that link today's CT data to tomorrow's CT discoveries. NoCTURN is working toward this goal by providing a platform to: 1) engage the international scientific CT community via participant recruitment from imaging facilities, academic departments and museums, and data repositories across the globe; 2) stimulate improvements for CT imaging and data management standards that focus on FAIR and OS principles; and 3) work directly with private companies that manufacture the hardware and software used in CT imaging, visualization, and analysis to find common ground in documentation and interoperability that better reflects the OS standards championed by federal funding agencies. The planned deliverables from this three-year grant include a ‘Rosetta Stone’ for CT terminology, an interactive world map of CT facilities, a guide to CT repositories, and ‘Good, Better, Best’ guidelines for metadata and long-term data management.
    We aim to reduce the barriers to entry that isolate individuals and research labs, and we anticipate that developing community standards and improving methodological reporting will enable long-term, systemic changes necessary to aid those at all levels of experience in furthering their access to and use of CT imaging.
  3. Summary: Large scientific facilities provide researchers with instrumentation, data, and data products that can accelerate scientific discovery. However, increasing data volumes coupled with limited local computational power prevent researchers from taking full advantage of what these facilities can offer. Many researchers have looked into using commercial and academic cyberinfrastructure (CI) to process these data. Nevertheless, there remains a disconnect between large facilities and CI that requires researchers to be actively part of the data processing cycle. The increasing complexity of CI and data scale necessitates new data delivery models that can autonomously integrate large-scale scientific facilities and CI to deliver real-time data and insights. In this paper, we present our initial efforts using the Ocean Observatories Initiative project as a use case. In particular, we present a subscription-based data streaming service for data delivery that leverages the Apache Kafka data streaming platform. We also show how our solution can automatically integrate large-scale facilities with CI services for automated data processing.
  4. Abstract: Managing, processing, and sharing research data and experimental context produced on modern scientific instrumentation all present challenges to the materials research community. To address these issues, two MaRDA Working Groups on FAIR Data in Materials Microscopy Metadata and Materials Laboratory Information Management Systems (LIMS) convened and generated recommended best practices regarding data handling in the materials research community. Overall, the Microscopy Metadata Group recommends (1) instruments should capture comprehensive metadata about operators, specimens/samples, instrument conditions, and data formation; and (2) microscopy data and metadata should use standardized vocabularies and community standard identifiers. The LIMS Group produced the following guides and recommendations: (1) a cost and benefit comparison when implementing LIMS; (2) summaries of prerequisite requirements, capabilities, and roles of LIMS stakeholders; and (3) a review of metadata schemas and information-storage best practices in LIMS. Together, the groups hope these recommendations will accelerate breakthrough scientific discoveries via FAIR data. Impact statement: With the deluge of data produced in today’s materials research laboratories, it is critical that researchers stay abreast of developments in modern research data management, particularly as it relates to the international effort to make data more FAIR (findable, accessible, interoperable, and reusable). Most crucially, being able to responsibly share research data is a foundational means to increase progress on the materials research problems of high importance to science and society. Operational data management and accessibility are pivotal in accelerating innovation in materials science and engineering and in addressing mounting challenges facing our world, but the materials research community generally lags behind its cognate disciplines in these areas.
    To address this issue, the Materials Research Coordination Network (MaRCN) convened two working groups composed of experts from across the materials data landscape in order to make recommendations to the community related to improvements in materials microscopy metadata standards and the use of Laboratory Information Management Systems (LIMS) in materials research. This manuscript contains a set of recommendations from the working groups and reflects the culmination of their 18-month efforts, with the hope of promoting discussion and reflection within the broader materials research community in these areas.
  5. Scientific collections have been built by people. For hundreds of years, people have collected, studied, identified, preserved, documented and curated collection specimens. Understanding who those people are is of interest to historians, but much more can be made of these data by other stakeholders once they have been linked to the people’s identities and their biographies. Knowing who people are helps us attribute work correctly, validate data and understand the scientific contribution of people and institutions. We can evaluate the work they have done, the interests they have, the places they have worked and what they have created from the specimens they have collected. The problem is that all we know about most of the people associated with collections are their names written on specimens. Disambiguating these people is the challenge that this paper addresses. Disambiguation of people often proves difficult in isolation and can result in staff or researchers independently trying to determine the identity of specific individuals over and over again. By sharing biographical data and building an open, collectively maintained dataset with shared knowledge, expertise and resources, it is possible to collectively deduce the identities of individuals, aggregate biographical information for each person, reduce duplication of effort and share the information locally and globally. The authors of this paper aspire to disambiguate all person names efficiently and fully in all their variations across the entirety of the biological sciences, starting with collections. 
    Towards that vision, this paper has three key aims: to improve the linking, validation, enhancement and valorisation of person-related information within and between collections, databases and publications; to suggest good practice for identifying people involved in biological collections; and to promote coordination amongst all stakeholders, including individuals, natural history collections, institutions, learned societies, government agencies and data aggregators.