Social media data (SMD) offer researchers new opportunities in broad areas such as public opinion, digital culture, labor trends, and public health. The success of efforts to save SMD for reuse by researchers will depend on aligning data management and archiving practices with evolving norms around the capture, use, sharing, and security of datasets. This paper presents an initial foray into understanding how established practices for managing and preserving data should adapt to demands from researchers who use and reuse SMD, and from people who are subjects in SMD. We examine the data management practices of researchers who use SMD through a survey, and we analyze published articles that used data from Twitter. We discuss how researchers describe their data management practices and how these practices may differ from the management of conventional data types. We explore conceptual, technical, and ethical challenges for data archives based on the similarities and differences between SMD and other types of research data, focusing on the social sciences. Finally, we suggest areas where archives may need to revise policies, practices, and services in order to create secure, persistent, and usable collections of SMD.
How do researchers in fieldwork-intensive disciplines protect sensitive data in the field, how do they assess their own practices, and how do they arrive at them? This article reports the results of a qualitative study with 36 semi-structured interviews with qualitative and multi-method researchers in political science and humanitarian aid/migration studies. We find that researchers frequently feel ill-prepared to handle the management of sensitive data in the field and find that formal institutions provide little support. Instead, they use a patchwork of sources to devise strategies for protecting their informants and their data. We argue that this carries substantial risks for the security of the data as well as their potential for later sharing and re-use. We conclude with some suggestions for effectively supporting data management in fieldwork-intensive research without unduly adding to the burden on researchers conducting it.
- PAR ID:
- 10522956
- Publisher / Repository:
- Cornell Labor Dynamics Institute
- Date Published:
- Journal Name:
- Journal of Privacy and Confidentiality
- Volume:
- 13
- Issue:
- 2
- ISSN:
- 2575-8527
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
This essay draws on qualitative social science to propose a critical intellectual infrastructure for data science of social phenomena. Qualitative sensibilities (interpretivism, abductive reasoning, and reflexivity in particular) could address methodological problems that have emerged in data science and help extend the frontiers of social knowledge. First, an interpretivist lens (one concerned with the construction of meaning in a given context) can enable the deeper insights that are requisite to understanding high-level behavioral patterns from digital trace data. Without such contextual insights, researchers often misinterpret what they find in large-scale analysis. Second, abductive reasoning (the process of using observations to generate a new explanation, grounded in prior assumptions about the world) is common in data science, but its application often is not systematized. Incorporating norms and practices from qualitative traditions for executing, describing, and evaluating the application of abduction would allow for greater transparency and accountability. Finally, data scientists would benefit from increased reflexivity (the process of evaluating how researchers’ own assumptions, experiences, and relationships influence their research). Studies demonstrate that aspects of a researcher’s experience that typically go unmentioned in quantitative traditions can influence research findings. Qualitative researchers have long faced these same concerns, and their training in how to deconstruct and document personal and intellectual starting points could prove instructive for data scientists. We believe these and other qualitative sensibilities have tremendous potential to facilitate the production of data science research that is more meaningful, reliable, and ethical.
-
The criminogenic dimensions of conservation are highly relevant to contemporary protected area management. Research on crime target suitability in the field of criminology has built new understanding of how the characteristics of crime targets affect their suitability for being targeted by offenders. In the last decade, criminologists have sought to apply and adapt target suitability frameworks to explain wildlife-related crimes. This study seeks to build upon the extant knowledge base and advance the adaptation and application of target suitability research. First, we drew on research, fieldwork, and empirical evidence from conservation science to develop a poaching-stage model with a focus on live specimens or wild animals, rather than a market-stage, wildlife-product-focused target suitability model. Second, we collected data in the Intensive Protection Zone of Bukit Barisan Selatan National Park (BBSNP), Sumatra, Indonesia, through surveys with local community members (n=400) and a three-day focus group with conservation practitioners (n=25). Our target suitability model, IPOACHED, predicts that species that are in-demand, passive, obtainable, all-purpose, conflict-prone, hideable, extractable, and disposable are more suitable species for poaching and therefore more vulnerable. When applying our IPOACHED model, we find that the most common response to species characteristics that drive poaching in BBSNP was that they are in-demand, with support for cultural or symbolic value (n=101 of respondents, 25%), ecological value (n=164, 35%), and economic value (n=234, 59%). There was moderate support for the conflict-prone dimension of the IPOACHED model (n=70, 18%). Other factors, such as a species' lack of passiveness, obtainability, and extractability, hamper poaching regardless of value.
Our model serves as an explanatory or predictive tool for understanding poaching within a conservation-based management unit (e.g., a protected area) rather than for a specific use market (e.g., pets). Conservation researchers and practitioners can use and adapt our model and survey instruments to help explain and predict poaching of species through the integration of knowledge and opinions from local communities and conservation professionals, with the ultimate goal of preventing wildlife poaching.
-
Data sharing and reuse are becoming the norm in quantitative research. At the same time, significant skepticism still accompanies the sharing and reuse of qualitative research data on both ethical and epistemological grounds. Nevertheless, there is growing interest in the reuse of qualitative data, as demonstrated by the range of contributions in this special issue. In this research note, we address epistemological critiques of reusing qualitative data and argue that careful curation of data can enable what we term “epistemologically responsible reuse” of qualitative data. We begin by briefly defining qualitative data and summarizing common epistemological objections to their shareability or usefulness for secondary analysis. We then introduce the concept of curation as enabling epistemologically responsible reuse and a potential way to address such objections. We discuss three recent trends that we believe are enhancing curatorial practices and thus expanding the opportunities for responsible reuse: improvements in data management practices among researchers, the development of collaborative curation practices at repositories focused on qualitative data, and technological advances that support sharing rich qualitative data. Using three examples of successful reuse of qualitative data, we illustrate the potential of these three trends to further improve the availability of reusable data projects.
-
Data sharing is increasingly an expectation in health research as part of a general move toward more open science. In the United States, in particular, the implementation of the 2023 National Institutes of Health Data Management and Sharing Policy has made it clear that qualitative studies are not exempt from this data sharing requirement. Recognizing this trend, the Palliative Care Research Cooperative Group (PCRC) realized the value of creating a de-identified qualitative data repository to complement its existing de-identified quantitative data repository. The PCRC Data Informatics and Statistics Core leadership partnered with the Qualitative Data Repository (QDR) to establish the first serious illness and palliative care qualitative data repository in the U.S. We describe the processes used to develop this repository, called the PCRC-QDR, as well as our outreach and education among the palliative care researcher community, which led to the first ten projects to share data in the new repository. Specifically, we discuss how we co-designed the PCRC-QDR, created tailored guidelines for depositing and sharing qualitative data depending on the original research context, established uniform expectations for key components of relevant documentation, and adopted suitable access controls for sensitive data. We also describe how PCRC was able to leverage its existing community to recruit and guide early depositors and outline lessons learned in evaluating the experience. This work advances the establishment of best practices in qualitative data sharing.