
Title: Students or Mechanical Turk: Who Are the More Reliable Social Media Data Labelers?
Award ID(s):
1934925 1934494
Journal Name:
Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA
Page Range / eLocation ID:
408 to 415
Sponsoring Org:
National Science Foundation
More Like this
  1. We present the results of a survey fielded in June of 2022 as a lens to examine recent data reliability issues on Amazon Mechanical Turk. We contrast bad data from this survey with bad data from the same survey fielded among US workers in October 2013, April 2018, and February 2019. Application of an established data cleaning scheme reveals that unusable data has risen from a little over 2% in 2013 to almost 90% in 2022. Through symptomatic diagnosis, we attribute the data reliability drop not to an increase in bad faith work, but rather to a continuum of English proficiency levels. A qualitative analysis of workers’ responses to open-ended questions allows us to distinguish between low fluency workers, ultra-low fluency workers, satisficers, and bad faith workers. We go on to show the effects of the new low fluency work on Likert scale data and on the study’s qualitative results. Attention checks are shown to be much less effective than they once were at identifying survey responses that should be discarded. 
  2. A considerable amount of laboratory and survey-based research finds that people show disproportional compassionate and affective response to the scope of human mortality risk. According to research on “psychic numbing,” it is often the case that the more who die, the less we care. In the present article, we examine the extent of this phenomenon in verbal behavior, using large corpora of natural language to quantify the affective reactions to loss of life. We analyze valence, arousal, and specific emotional content of over 100,000 mentions of death in news articles and social media posts, and find that language shows an increase in valence (i.e., decreased negative affect) and a decrease in arousal when describing mortality of larger numbers of people. These patterns are most clearly reflected in specific emotions of joy and (in a reverse fashion) of fear and anger. Our results showcase a novel methodology for studying affective decision making, and highlight the robustness and real-world relevance of psychic numbing. They also offer new insights regarding the psychological underpinnings of psychic numbing, as well as possible interventions for reducing psychic numbing and overcoming social and psychological barriers to action in the face of the world's most serious threats.
  3. Psychology is moving increasingly toward digital sources of data, with Amazon’s Mechanical Turk (MTurk) at the forefront of that charge. In 2015, up to an estimated 45% of articles published in the top behavioral and social science journals included at least one study conducted on MTurk. In this article, I summarize my own experience with MTurk and how I deduced that my sample was, at best, only 2.6% valid, by my estimate. I share these results as a warning and call for caution. Recently, I conducted an online study via Amazon’s MTurk, eager and excited to collect my own data for the first time as a doctoral student. What resulted has prompted me to write this as a warning: it is indeed too good to be true. This is a summary of how I determined that, at best, I had gathered valid data from 14 human beings, or 2.6% of my participant sample (N = 529).
  4. This paper studies conspiracy and debunking narratives about the origins of COVID-19 on a major Chinese social media platform, Weibo, from January to April 2020. Popular conspiracies about COVID-19 on Weibo, including that the virus is human-synthesized or a bioweapon, differ substantially from those in the United States. They attribute more responsibility to the United States than to China, especially following Sino-U.S. confrontations. Compared to conspiracy posts, debunking posts are associated with lower user participation but higher mobilization. Debunking narratives can be more engaging when they come from women and influencers and cite scientists. Our findings suggest that conspiracy narratives can carry highly cultural and political orientations. Correction efforts should consider political motives and identify important stakeholders to reconstruct international dialogues toward intercultural understanding.