Alas, coordinated hate attacks, or raids, are becoming increasingly common online. In a nutshell, these are perpetrated by a group of aggressors who organize and coordinate operations on a platform (e.g., 4chan) to target victims on another community (e.g., YouTube). In this paper, we focus on attributing raids to their source community, paving the way for moderation approaches that take the context (and potentially the motivation) of an attack into consideration.We present TUBERAIDER, an attribution system achieving over 75% accuracy in detecting and attributing coordinated hate attacks on YouTube videos. We instantiate it using links to YouTube videos shared on 4chan's /pol/ board, r/The_Donald, and 16 Incels-related subreddits. We use a peak detector to identify a rise in the comment activity of a YouTube video, which signals that an attack may be occurring. We then train a machine learning classifier based on the community language (i.e., TF-IDF scores of relevant keywords) to perform the attribution. We test TUBERAIDER in the wild and present a few case studies of actual aggression attacks identified by it to showcase its effectiveness.
more »
« less
Turning Stocks into Memes: A Dataset for Understanding How Social Communities Can Drive Wall Street
Who actually expresses an intent to buy shares of GameStop Corporation (GME) on Reddit? What convinces people to buy stocks? Are people convinced to support a coordinated plan to adversely impact Wall Street investors? Existing literature on understanding intent has mainly relied on surveys and self-reporting; however there are limitations to these methodologies. Hence, in this paper, we develop an annotated dataset of communications centered on the GameStop phenomenon to analyze the subscriber intention behaviors within the r/WallStreetBets community to buy (or not buy) stocks. Likewise, we curate a dataset to better understand how intent interacts with a user's general support towards the coordinated actions of the community for GameStop. Overall, our dataset can provide insight to social scientists on the persuasive power of social movements online by adopting common language and narrative. WARNING: This paper contains offensive language that commonly appears on Reddit's r/WallStreetBets subreddit.
more »
« less
- Award ID(s):
- 1947697
- PAR ID:
- 10412931
- Date Published:
- Journal Name:
- Proceedings of the International AAAI Conference on Web and Social Media
- Volume:
- 16
- ISSN:
- 2162-3449
- Page Range / eLocation ID:
- 1192 to 1200
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Understanding and characterizing how people interact in information-seeking conversations will be a crucial component in developing effective conversational search systems. In this paper, we introduce a new dataset designed for this purpose and use it to analyze information-seeking conversations by user intent distribution, co-occurrence, and flow patterns. The MSDialog dataset is a labeled conversation dataset of question answering (QA) interactions between information seekers and providers from an online forum on Microsoft products. The dataset contains more than 2,000 multi-turn QA dialogs with 10,000 utterances that are annotated with user intents on the utterance level. Annotations were done using crowdsourcing. With MSDialog, we find some highly recurring patterns in user intent during an information-seeking process. They could be useful for designing conversational search systems. We will make our dataset freely available to encourage exploration of information-seeking conversation models.more » « less
-
In this paper, we explore how robots can properly explain failures during navigation tasks with privacy concerns. We present an integrated robotics approach to generate visual failure explanations, by combining a language-capable cognitive architecture (for recognizing intent behind commands), an object- and location-based context recognition system (for identifying the locations of people and classifying the context in which those people are situated) and an infeasibility proof-based motion planner (for explaining planning failures on the basis of contextually mediated privacy concerns). The behavior of this integrated system is validated using a series of experiments in a simulated medical environment.more » « less
-
During the COVID-19 pandemic, most, if not all, animal rescues, sanctuaries, zoos, and aquariums experienced financial distress. This stress had an impact on the welfare of animals and their human caretakers, an issue important to ecological social work. We draw on a novel dataset (n = 2,060) to assess support for policies to extend emergency funding to animal support and conservation organizations in extreme events. We find that, on average women and nonbinary individuals, those with more education, people who have pets, people who are concerned about other humans (humanistic altruism), and those who have greater concern for animals report greater support.more » « less
-
null (Ed.)Intimacy is a fundamental aspect of how we relate to others in social settings. Language encodes the social information of intimacy through both topics and other more subtle cues (such as linguistic hedging and swearing). Here, we introduce a new computational framework for studying expressions of the intimacy in language with an accompanying dataset and deep learning model for accurately predicting the intimacy level of questions (Pearson r = 0.87). Through analyzing a dataset of 80.5M questions across social media, books, and films, we show that individuals employ interpersonal pragmatic moves in their language to align their intimacy with social settings. Then, in three studies, we further demonstrate how individuals modulate their intimacy to match social norms around gender, social distance, and audience, each validating key findings from studies in social psychology. Our work demonstrates that intimacy is a pervasive and impactful social dimension of language.more » « less
An official website of the United States government

