Title: Persistent Identifiers for Instruments and Facilities: Current State, Challenges, and Opportunities
Objective: Persistent Identifiers (PIDs) are central to the vision of open science described in the FAIR Principles. However, the use of PIDs for scientific instruments and facilities is decentralized and fragmented. This project aims to develop community-based standards, guidelines, and best practices for how and why PIDs can be assigned to facilities and instruments. Methods: We hosted several online and in-person focus groups and discussions, culminating in a two-day in-person workshop featuring stakeholders from a variety of organizations and disciplines, such as instrument and facilities operators, PID infrastructure providers, researchers who use instruments and facilities, journal publishers, university administrators, federal funding agencies, and information and data professionals. Results: Our first-year efforts resulted in four main areas of interest: developing a better understanding of the current PID ecosystem; clarifying how and when PIDs could be assigned to scientific instruments and facilities; challenges and barriers involved with assigning PIDs; and incentives for researchers, facility managers, and other stakeholders to encourage the use of PIDs. Conclusions: The potential for PIDs to facilitate the discovery, connection, and attribution of research instruments and facilities indicates an obvious value in their use. The lack of standards for how and when they are created, assigned, updated, and used is a major barrier to their widespread use. Data and information professionals can work to create relationships with stakeholders, provide relevant education and outreach activities, and integrate PIDs for instruments and facilities into their data curation and publication workflows.
Award ID(s):
2226396 2226397 2226398
PAR ID:
10638801
Publisher / Repository:
Lamar Soutter Library, UMass Chan Medical School
Journal Name:
Journal of eScience Librarianship
Volume:
13
Issue:
3
ISSN:
2161-3974
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The growing prevalence of data-rich networked information technologies, such as social media platforms, smartphones, wearable devices, and the internet of things, brings an increase in the flow of rich, deep, and often identifiable personal information available for researchers. More than just “big data,” these datasets reflect people’s lives and activities, bridge multiple dimensions of a person’s life, and are often collected, aggregated, exchanged, and mined without their knowledge. We call this data “pervasive data,” and the increased scale, scope, speed, and depth of pervasive data available to researchers require that we confront the ethical frameworks that guide such research activities. Multiple stakeholders are embroiled in the challenges of research ethics in pervasive data research: researchers struggle with questions of privacy and consent, user communities may not even be aware of the widespread harvesting of their data for scientific study, platforms are increasingly restricting researchers’ access to data over fears of privacy and security, and ethical review boards face increasing difficulties in properly considering the complexities of research protocols relying on user data collected online. The results presented in this paper expand our understanding of how ethical review board members think about pervasive data research. The paper provides insights into how IRB professionals make decisions about the use of pervasive data in cases not obviously covered by traditional research ethics guidelines, and points to challenges for IRBs when reviewing research protocols relying on pervasive data.
  2. Large scientific facilities provide researchers with instrumentation, data, and data products that can accelerate scientific discovery. However, increasing data volumes coupled with limited local computational power prevent researchers from taking full advantage of what these facilities can offer. Many researchers have looked into using commercial and academic cyberinfrastructure (CI) to process these data. Nevertheless, there remains a disconnect between large facilities and CI that requires researchers to be actively part of the data processing cycle. The increasing complexity of CI and data scale necessitates new data delivery models, those that can autonomously integrate large-scale scientific facilities and CI to deliver real-time data and insights. In this paper, we present our initial efforts using the Ocean Observatories Initiative project as a use case. In particular, we present a subscription-based data streaming service for data delivery that leverages the Apache Kafka data streaming platform. We also show how our solution can automatically integrate large-scale facilities with CI services for automated data processing.
  3. Scientific collections have been built by people. For hundreds of years, people have collected, studied, identified, preserved, documented and curated collection specimens. Understanding who those people are is of interest to historians, but much more can be made of these data by other stakeholders once they have been linked to the people’s identities and their biographies. Knowing who people are helps us attribute work correctly, validate data and understand the scientific contribution of people and institutions. We can evaluate the work they have done, the interests they have, the places they have worked and what they have created from the specimens they have collected. The problem is that all we know about most of the people associated with collections are their names written on specimens. Disambiguating these people is the challenge that this paper addresses. Disambiguation of people often proves difficult in isolation and can result in staff or researchers independently trying to determine the identity of specific individuals over and over again. By sharing biographical data and building an open, collectively maintained dataset with shared knowledge, expertise and resources, it is possible to collectively deduce the identities of individuals, aggregate biographical information for each person, reduce duplication of effort and share the information locally and globally. The authors of this paper aspire to disambiguate all person names efficiently and fully in all their variations across the entirety of the biological sciences, starting with collections. 
Towards that vision, this paper has three key aims: to improve the linking, validation, enhancement and valorisation of person-related information within and between collections, databases and publications; to suggest good practice for identifying people involved in biological collections; and to promote coordination amongst all stakeholders, including individuals, natural history collections, institutions, learned societies, government agencies and data aggregators. 
  4. The Non-Clinical Tomography Users Research Network (NoCTURN) was established in 2022 to advance Findability, Accessibility, Interoperability, and Reuse (FAIR) and Open Science (OS) practices in the computed tomographic (CT) imaging community. CT specialists utilize a shared pipeline to create digital representations of real-world objects for research, education, and outreach, and we face a shared set of challenges and limitations imposed by siloing of current workflows, best practices, and expertise. Mirroring the U.S. National Science Foundation’s “10 Big Ideas” of Convergence Research (2016), and in consideration of the White House Office of Science and Technology Policy's Nelson Memorandum (2022), NoCTURN is leveraging input from a broad community of more than 100 CT educators, researchers, curators, and industry stakeholders to propose improvements to data handling, management, and sharing that cut across scientific disciplines and extend beyond. Our primary goal is to develop practical recommendations and tools that link today's CT data to tomorrow's CT discoveries. NoCTURN is working toward this goal by providing a platform to: 1) engage the international scientific CT community via participant recruitment from imaging facilities, academic departments and museums, and data repositories across the globe; 2) stimulate improvements for CT imaging and data management standards that focus on FAIR and OS principles; and 3) work directly with private companies that manufacture the hardware and software used in CT imaging, visualization, and analysis to find common ground in documentation and interoperability that better reflects the OS standards championed by federal funding agencies. The planned deliverables from this three-year grant include a ‘Rosetta Stone’ for CT terminology, an interactive world map of CT facilities, a guide to CT repositories, and ‘Good, Better, Best’ guidelines for metadata and long-term data management.
We aim to reduce the barriers to entry that isolate individuals and research labs, and we anticipate that developing community standards and improving methodological reporting will enable long-term, systemic changes necessary to aid those at all levels of experience in furthering their access to and use of CT imaging. 
  5. As institutions have struggled to chart a path forward through the current pandemic environment, a greater emphasis has been placed on online and hybrid delivery modes. In first-year programs in particular, instructors are scrambling to identify how best to deliver foundational concepts of engineering design in a remote or socially-distanced in-person environment and still retain the high-interactivity and community building aspects that have become so central to their programs. To this end, two asynchronous, interactive modules have been developed introducing the foundational design concepts of stakeholders, need statements, information gathering, and design specifications. The modules are developed in such a way that student responses to each interaction, such as identifying stakeholders or matching need statements, are captured for later analysis. The modules were deployed with first-semester engineering students enrolled in a Foundations of Design course. In this work the modules are introduced and student responses analyzed to answer the question: What are typical standards of performance on these modules for first-year engineering students? Basic descriptive statistics and trends are presented to define these standards. This includes quantitative measures, such as how many stakeholders are identified when prompted, as well as more subjective measures, such as how well the student identified the need in a given problem, and attitudinal measures, such as how confident they are in their answers.