Title: Saving social media data: Understanding data management practices among social media researchers and their implications for archives

Social media data (SMD) offer researchers new opportunities to leverage those data for their work in broad areas such as public opinion, digital culture, labor trends, and public health. The success of efforts to save SMD for reuse by researchers will depend on aligning data management and archiving practices with evolving norms around the capture, use, sharing, and security of datasets. This paper presents an initial foray into understanding how established practices for managing and preserving data should adapt to demands from researchers who use and reuse SMD, and from people who are subjects in SMD. We examine the data management practices of researchers who use SMD through a survey, and we analyze published articles that used data from Twitter. We discuss how researchers describe their data management practices and how these practices may differ from the management of conventional data types. We explore conceptual, technical, and ethical challenges for data archives based on the similarities and differences between SMD and other types of research data, focusing on the social sciences. Finally, we suggest areas where archives may need to revise policies, practices, and services in order to create secure, persistent, and usable collections of SMD.

Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Journal of the Association for Information Science and Technology
Page Range / eLocation ID:
p. 97-109
Sponsoring Org:
National Science Foundation
More Like this
  1. Social media provides unique opportunities for researchers to learn about a variety of phenomena—it is often publicly available, highly accessible, and affords more naturalistic observation. However, as research using social media data has increased, so too has public scrutiny, highlighting the need to develop ethical approaches to social media data use. Prior work in this area has explored users’ perceptions of researchers’ use of social media data in the context of a single platform. In this paper, we expand on that work, exploring how platforms and their affordances impact how users feel about social media data reuse. We present results from three factorial vignette surveys, each focusing on a different platform—dating apps, Instagram, and Reddit—to assess users’ comfort with research data use scenarios across a variety of contexts. Although our results highlight different expectations between platforms depending on the research domain, purpose of research, and content collected, we find that the factor with the greatest impact across all platforms is consent—a finding which presents challenges for big data researchers. We conclude by offering a sociotechnical approach to ethical decision-making. This approach provides recommendations on how researchers can interpret and respond to platform norms and affordances to predict potential data use sensitivities. The approach also recommends that researchers respond to the predominant expectation of notification and consent for research participation by bolstering awareness of data collection on digital platforms. 
  2. Abstract

    Understanding the societal impacts caused by community disruptions (e.g., power outages and road closures), particularly during the response stage, with timeliness and sufficient detail is an underexplored, yet important, consideration. It is critical for effective decision‐making and coordination in disaster response and relief activities as well as post‐disaster virtual reconnaissance activities. This study proposes a semiautomated social media analytics approach for social sensing of Disaster Impacts and Societal Considerations (SocialDISC). This approach addresses two limitations of existing social media analytics approaches: a lack of adaptability to the needs of different analyzers or different disasters, and the omission of information about people's subjective feelings, emotions, and opinions. SocialDISC labels and clusters social media posts in each disruption category to facilitate scanning by analyzers. Analyzers, in this paper, are persons who acquire social impact information from social media data (e.g., infrastructure management personnel, volunteers, researchers from academia, and some residents impacted by the disaster). Furthermore, SocialDISC enables analyzers to quickly parse topics and emotion signals of each subevent to assess the societal impacts caused by disruption events. To demonstrate the performance of SocialDISC, the authors conducted a case study based on Hurricane Harvey, one of the costliest disasters in U.S. history, and analyzed the disruptions and corresponding societal impacts in different aspects. The analysis shows that Houstonians suffered greatly from flooded houses, lack of access to food and water, and power outages. SocialDISC can foster an understanding of the relationship between disruptions of infrastructures and societal impacts, expectations of the public when facing disasters, and infrastructure interdependency and cascading failures. SocialDISC's provision of timely information about societal impacts on people may help disaster response decision‐making.

  3. Synopsis

    Interest in cephalopods as comparative models in neuroscience, cognition, behavior, and ecology is surging due to recent advances in culture and experimental techniques. Although cephalopods have a long history in research, their use had remained limited due to the challenges of funding work on comparative models, the lack of modern techniques applicable to them, and the small number of labs with the facilities to keep and house large numbers of healthy animals for long periods. Breakthroughs in each of these areas are now creating new interest in cephalopods from researchers who trained and worked in other models, as well as allowing established cephalopod labs to grow and collaborate more widely. This broadening of the field is essential to its long-term health, but also brings with it new and heightened scrutiny from animal rights organizations, federal regulatory agencies, and members of the public. As a community, it is critical that scientists working with cephalopods engage in discussions, studies, and communication that promote high standards for cephalopod welfare. The concept of “social license to operate,” more commonly encountered in industry, recreation, and agriculture, provides a useful lens through which to view proactive steps the cephalopod research community may take to ensure a strong future for our field. In this Perspective, I discuss recent progress in cephalopod ethics and welfare studies, and use the conceptual framework of Social License to Operate to propose a forward-looking, public-facing strategy for the parallel development of welfare-focused best practices and scientific breakthroughs.

  4. Abstract

    Social media data offer a rich resource for researchers interested in public health, labor economics, politics, social behaviors, and other topics. However, scale and anonymity mean that researchers often cannot directly get permission from users to collect and analyze their social media data. This article applies the basic ethical principle of respect for persons to consider individuals’ perceptions of acceptable uses of data. We compare individuals’ perceptions of acceptable uses of other types of sensitive data, such as health records and individual identifiers, with their perceptions of acceptable uses of social media data. Our survey of 1018 people shows that individuals think of their social media data as moderately sensitive and agree that it should be protected. Respondents are generally okay with researchers using their data in social research but prefer that researchers clearly articulate benefits and seek explicit consent before conducting research. We argue that researchers must ensure that their research provides social benefits worthy of individual risks and that they must address those risks throughout the research process.

  5. McNeill, Fiona; Zobel, Christopher (Eds.)
    Information is a critical need during disasters such as hurricanes. Increasingly, people are relying upon cellular and internet-based technology to communicate that information—modalities that are acutely vulnerable to the disruptions to telecommunication infrastructure that are common during disasters. Focusing on Hurricane Maria (2017) and its long-term impacts on Puerto Rico, this research examines how people affected by severe and sustained disruptions to telecommunications services adapt to those disruptions. Leveraging social media trace data as a window into the real-time activities of people who were actively adapting, we use a primarily qualitative approach to identify and characterize how people changed their telecommunications practices and routines—and especially how they changed their locations—to access Wi-Fi and cellular service in the weeks and months after the hurricane. These findings have implications for researchers seeking to better understand human responses to disasters and responders seeking to identify strategies to support affected populations. 