Over the last decade, significant changes have affected the work that data repositories of all kinds do. First, the emergence of globally unique and persistent identifiers (PIDs) has created new opportunities for repositories to engage with the global research community by connecting existing repository resources to the global research infrastructure. Second, repository use cases have evolved from data discovery to data discovery and reuse, significantly increasing metadata requirements. To respond to these evolving requirements, we need retrospective and ongoing curation, i.e., re-curation, processes that 1) find identifiers and add them to existing metadata to connect datasets to a wider range of communities, and 2) add elements that support reuse to globally connected metadata. The goal of this work is to introduce the concept of re-curation with representative examples that are generally applicable to many repositories: 1) increasing completeness of affiliations and identifiers for organizations and funders in the Dryad Repository and 2) measuring and increasing FAIRness of DataCite metadata beyond required fields for institutional repositories. These re-curation efforts are a critical part of reshaping existing metadata and repository processes so they can take advantage of new connections, engage with global research communities, and facilitate data reuse.
- Award ID(s):
- 2129268
- PAR ID:
- 10410505
- Date Published:
- Journal Name:
- Proceedings of the National Academy of Sciences
- Volume:
- 119
- Issue:
- 43
- ISSN:
- 0027-8424
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Spatial data, under the broader umbrella of digital data, is becoming increasingly integral to all stages of archaeological research design and dissemination. As archaeologists lean toward reuse and interoperability, with ethics on their minds, how to treat spatial data is of particular importance. This is because of the complexities involved at every life-cycle stage, from collection to publication, including black box issues that may be taken for granted, and because the size of spatial data can lead to archiving difficulties. Here, the “DIY” momentum of increasingly accessible spatial methods such as photogrammetry and handheld lidar is examined alongside forthcoming changes in publication policies that will impact the United States in particular, framed around a conversation about best practices and a call for more comprehensive training for the archaeological community. At its heart, this special issue seeks to realize the potential of increasingly digitized, and increasingly large amounts of, archaeological data. Within cultural resource management, this means anticipating utilization of data through widespread standardization, among many interrelated activities. A desire to enhance the utility of archaeological data has distinct resonances with the use of spatial data in archaeology, as do some wider challenges that the archaeological community faces moving forward.
-
Data archives are an important source of high-quality data in many fields, making them ideal sites to study data reuse. By studying data reuse through citation networks, we are able to learn how hidden research communities (those that use the same scientific data sets) are organized. This paper analyzes the community structure of an authoritative network of data sets cited in academic publications, which have been collected by a large, social science data archive: the Interuniversity Consortium for Political and Social Research (ICPSR). Through network analysis, we identified communities of social science data sets and fields of research connected through shared data use. We argue that communities of exclusive data reuse form “subdivisions” that contain valuable disciplinary resources, while data sets at a “crossroads” broadly connect research communities. Our research reveals the hidden structure of data reuse and demonstrates how interdisciplinary research communities organize around data sets as shared scientific inputs. These findings contribute new ways of describing scientific communities to understand the impacts of research data reuse.
-
Social media data (SMD) offer researchers new opportunities to leverage those data for their work in broad areas such as public opinion, digital culture, labor trends, and public health. The success of efforts to save SMD for reuse by researchers will depend on aligning data management and archiving practices with evolving norms around the capture, use, sharing, and security of datasets. This paper presents an initial foray into understanding how established practices for managing and preserving data should adapt to demands from researchers who use and reuse SMD, and from people who are subjects in SMD. We examine the data management practices of researchers who use SMD through a survey, and we analyze published articles that used data from Twitter. We discuss how researchers describe their data management practices and how these practices may differ from the management of conventional data types. We explore conceptual, technical, and ethical challenges for data archives based on the similarities and differences between SMD and other types of research data, focusing on the social sciences. Finally, we suggest areas where archives may need to revise policies, practices, and services in order to create secure, persistent, and usable collections of SMD.
-
Drawing on previous research into the value of developing and sharing data stories on social media, we use this paper to examine how practitioners address a spectrum of interests and concerns in relation to their own data literacies within this media form. To do so, we analyzed 107 data story videos from TikTok and Instagram to explore what practices and communication techniques are apparent in social media data stories that exhibit features of data literacy. Through our analysis, we uncovered a series of digital storytelling techniques (e.g., speaking to the camera, using a green screen) that supported the creators’ data science practices and communicative goals. This study contributes to the discourse on social media's role in data storytelling and literacy, providing guidance for future research and implications for the design of new data literacy learning experiences.