Data sharing and reuse are becoming the norm in quantitative research. At the same time, significant skepticism still accompanies the sharing and reuse of qualitative research data on both ethical and epistemological grounds. Nevertheless, there is growing interest in the reuse of qualitative data, as demonstrated by the range of contributions in this special issue. In this research note, we address epistemological critiques of reusing qualitative data and argue that careful curation of data can enable what we term “epistemologically responsible reuse” of qualitative data. We begin by briefly defining qualitative data and summarizing common epistemological objections to their shareability or usefulness for secondary analysis. We then introduce the concept of curation as enabling epistemologically responsible reuse and as a potential way to address such objections. We discuss three recent trends that we believe are enhancing curatorial practices and thus expanding the opportunities for responsible reuse: improvements in data management practices among researchers, the development of collaborative curation practices at repositories focused on qualitative data, and technological advances that support sharing rich qualitative data. Using three examples of successful reuse of qualitative data, we illustrate the potential of these three trends to further improve the availability of reusable data projects.
                            Curating for Convergence: Data Stewardship for Interdisciplinary Inquiry
                        
                    
    
            Advances in data infrastructure are often led by disciplinary initiatives aimed at innovation in federation and sharing of data and related research materials. In library and information science (LIS), the data services area has focused on data curation and stewardship to support description and deposit of data for access, reuse, and preservation. At the same time, solutions to societal grand challenges are thought to lie in convergence research, characterized by a problem-focused orientation and deep cross-disciplinary integration, requiring access to highly varied data sources with differing resolutions or scales. We argue that data curation and stewardship work in LIS should expand to foster convergence research based on a robust understanding of the dynamics of disciplinary and interdisciplinary research methods and practices. Highlighting unique contributions by Dr. Linda C. Smith to the field of LIS, we outline how her work illuminates problems that are core to current directions in convergence research. Drawing on advances in data infrastructure in the earth and geosciences and trends in qualitative domains, we emphasize the importance of metastructures and the necessary influence of disciplinary practice on principles, standards, and provisions for ethical use across the evolving data ecosystem. 
        
    
- Award ID(s):
- 1928208
- PAR ID:
- 10470846
- Publisher / Repository:
- Johns Hopkins University Press
- Date Published:
- Journal Name:
- Library Trends
- Volume:
- 71
- Issue:
- 1
- ISSN:
- 1559-0682
- Page Range / eLocation ID:
- 113 to 131
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            OpenMSIStream provides seamless connection of scientific data stores with streaming infrastructure to allow researchers to leverage the power of decoupled, real-time data streaming architectures. Data streaming is the process of transmitting, ingesting, and processing data continuously rather than in batches. Access to streaming data has revolutionized many industries in the past decade and created entirely new standards of practice and types of analytics. While not yet commonly used in scientific research, data streaming has the potential to become a key technology to drive rapid advances in scientific data collection (e.g., Brookhaven National Lab (2022)). This paucity of streaming infrastructures linking complex scientific systems is due to a lack of tools that facilitate streaming in the diverse and distributed systems common in modern research. OpenMSIStream closes this gap between underlying streaming systems and common scientific infrastructure. Closing this gap empowers novel streaming applications for scientific data including automation of data curation, reduction, and analysis; real-time experiment monitoring and control; and flexible deployment of AI/ML to guide autonomous research. Streaming data generally refers to data continuously generated from multiple sources and passed in small packets (termed messages). Streaming data messages are typically organized in groups called topics and persist for periods of time conducive to processing for multiple uses either sequentially or in small groups. The resulting flows of raw data, metadata, and processing results form “ecosystems” that automate varied data-driven tasks. A strength of data streaming ecosystems is the use of publish-subscribe (“pub/sub”) messaging backbones that decouple data senders (publishers) and recipients (subscribers). 
Popular message-focused middleware solutions such as RabbitMQ (VMware, 2022), Apache Pulsar (Apache Software Foundation, 2022b), and Apache Kafka (Apache Software Foundation, 2022a) all provide differing capabilities as backbones. OpenMSIStream provides robust and efficient, yet easy, access to the rich data streaming systems of Apache Kafka.
- 
            Video data are uniquely suited for research reuse and for documenting research methods and findings. However, curation of video data is a serious hurdle for researchers in the social and behavioral sciences, where behavioral video data are obtained session by session and data sharing is not the norm. To eliminate the onerous burden of post hoc curation at the time of publication (or later), we describe best practices in active data curation, where data are curated and uploaded immediately after each data collection to allow instantaneous sharing with one button press at any time. Indeed, we recommend that researchers adopt “hyperactive” data curation, where they openly share every step of their research process. The necessary infrastructure and tools are provided by Databrary, a secure, web-based data library designed for active curation and sharing of personally identifiable video data and associated metadata. We provide a case study of hyperactive curation of video data from the Play and Learning Across a Year (PLAY) project, where dozens of researchers developed a common protocol to collect, annotate, and actively curate video data of infants and mothers during natural activity in their homes at research sites across North America. PLAY relies on scalable standardized workflows to facilitate collaborative research, assure data quality, and prepare the corpus for sharing and reuse throughout the entire research process.
- 
            Abstract. Global change research demands a convergence among academic disciplines to understand complex changes in Earth system function. Limitations related to data usability and computing infrastructure, however, present barriers to effective use of the research tools needed for this cross-disciplinary collaboration. To address these barriers, we created a computational platform that pairs meteorological data and site-level ecosystem characterizations from the National Ecological Observatory Network (NEON) with the Community Terrestrial System Model (CTSM) that is developed with university partners at the National Center for Atmospheric Research (NCAR). This NCAR–NEON system features a simplified user interface that facilitates access to and use of NEON observations and NCAR models. We present preliminary results that compare observed NEON fluxes with CTSM simulations and describe how the collaboration between NCAR and NEON, which can be used by the global change research community, improves both the data and the model. Beyond datasets and computing, the NCAR–NEON system includes tutorials and visualization tools that facilitate interaction with observational and model datasets and further enable opportunities for teaching and research. By expanding access to data, models, and computing, cyberinfrastructure tools like the NCAR–NEON system will accelerate integration across ecology and climate science disciplines to advance understanding in Earth system science and global change.
- 
            Cyber Physical Systems (CPS) are characterized by their ability to integrate the physical and information or cyber worlds. Their deployment in critical infrastructure has demonstrated a potential to transform the world. However, harnessing this potential is limited by their critical nature and the far-reaching effects of cyber attacks on humans, infrastructure, and the environment. Cyber concerns in CPS arise from the process of sending information from sensors to actuators over the wireless communication medium, which widens the attack surface. Traditionally, CPS security has been investigated from the perspective of preventing intruders from gaining access to the system using cryptography and other access control techniques. Most research work has therefore focused on the detection of attacks in CPS. However, in a world of increasing adversaries, it is becoming more difficult to totally prevent adversarial attacks on CPS, hence the need to focus on making CPS resilient. Resilient CPS are designed to withstand disruptions and remain functional despite the operation of adversaries. One of the dominant methodologies explored for building resilient CPS is dependent on machine learning (ML) algorithms. However, drawing on recent research in adversarial ML, we posit that ML algorithms for securing CPS must themselves be resilient. This article is therefore aimed at comprehensively surveying the interactions between resilient CPS using ML and resilient ML when applied in CPS. The paper concludes with a number of research trends and promising future research directions. Furthermore, with this article, readers can gain a thorough understanding of recent advances in ML-based security and securing ML for CPS and countermeasures, as well as research trends in this active research area.
 An official website of the United States government