
Search results for Award ID: 1720268


  1. Abusive language is a massive problem in online social platforms. Existing abusive language detection techniques are particularly ill-suited to comments containing heterogeneous abusive language patterns, i.e., both abusive and non-abusive parts. This is due in part to the lack of datasets that explicitly annotate heterogeneity in abusive language. We tackle this challenge by providing an annotated dataset of abusive language in over 11,000 comments from YouTube. We account for heterogeneity in this dataset by separately annotating both the comment as a whole and the individual sentences that comprise each comment. We then propose an algorithm that uses a supervised attention mechanism to detect and categorize abusive content using multi-task learning. We empirically demonstrate the challenges of using traditional techniques on heterogeneous content and the comparative gains in performance of the proposed approach over state-of-the-art methods.
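A minimal PyTorch sketch of the kind of multi-task, supervised-attention classifier the abstract describes: a shared encoder over per-sentence embeddings, a sentence-level head, a comment-level head that pools sentences through attention, and an auxiliary loss that supervises the attention weights with the sentence labels. The layer sizes, loss weighting, attention target, and toy inputs are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class HeterogeneousAbuseModel(nn.Module):
    def __init__(self, sent_dim=128, hidden=64, n_classes=2):
        super().__init__()
        self.encoder = nn.GRU(sent_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)                  # attention score per sentence
        self.sent_head = nn.Linear(2 * hidden, n_classes)     # sentence-level task
        self.comment_head = nn.Linear(2 * hidden, n_classes)  # comment-level task

    def forward(self, sent_embs):
        # sent_embs: (batch, n_sentences, sent_dim) pre-computed sentence embeddings
        h, _ = self.encoder(sent_embs)                        # (batch, n_sent, 2*hidden)
        a = torch.softmax(self.attn(h).squeeze(-1), dim=1)    # (batch, n_sent)
        comment_repr = torch.bmm(a.unsqueeze(1), h).squeeze(1)
        return self.sent_head(h), self.comment_head(comment_repr), a

model = HeterogeneousAbuseModel()
sents = torch.randn(4, 6, 128)        # 4 comments, 6 sentences each (toy data)
sent_y = torch.randint(0, 2, (4, 6))  # per-sentence abuse labels
comm_y = torch.randint(0, 2, (4,))    # per-comment abuse labels

sent_logits, comm_logits, attn = model(sents)
ce = nn.CrossEntropyLoss()
# Multi-task loss: sentence task + comment task + attention supervision that
# pushes attention mass toward the sentences labeled abusive.
attn_target = sent_y.float() / sent_y.float().sum(dim=1, keepdim=True).clamp(min=1)
loss = (ce(sent_logits.reshape(-1, 2), sent_y.reshape(-1))
        + ce(comm_logits, comm_y)
        + nn.functional.mse_loss(attn, attn_target))
loss.backward()
```

Supervising the attention with sentence-level labels is what ties the two tasks together: the comment-level head is encouraged to build its representation from exactly the sentences annotated as abusive.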
  2. Fringe groups and organizations have a long history of using euphemisms, ordinary-sounding words with a secret meaning, to conceal what they are discussing. Nowadays, one common use of euphemisms is to evade content moderation policies enforced by social media platforms. Existing tools for enforcing policy automatically rely on keyword searches for words on a "ban list", but these are notoriously imprecise: even when limited to swearwords, they can still cause embarrassing false positives. When a commonly used ordinary word acquires a euphemistic meaning, adding it to a keyword-based ban list is hopeless: consider "pot" (storage container or marijuana?) or "heater" (household appliance or firearm?). The current generation of social media companies instead hire staff to check posts manually, but this is expensive, inhumane, and not much more effective. It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy. Also, when a euphemism is banned, the group that used it need only invent another one, leaving moderators one step behind. This paper will demonstrate unsupervised algorithms that, by analyzing words in their sentence-level context, can both detect words being used euphemistically, and identify the secret meaning of each word. Compared to the existing state of the art, which uses context-free word embeddings, our algorithm for detecting euphemisms achieves 30–400% higher detection accuracies of unlabeled euphemisms in a text corpus. Our algorithm for revealing euphemistic meanings of words is the first of its kind, as far as we are aware. In the arms race between content moderators and policy evaders, our algorithms may help shift the balance in the direction of the moderators.
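The core mechanism, reading a word through its sentence-level context rather than a context-free embedding, can be illustrated with a masked language model: mask the suspected euphemism and let the model propose context-appropriate fillers, which hint at the hidden meaning. The sketch below shows only this idea using Hugging Face transformers; the paper's full detection and identification algorithms involve additional steps, and the example sentences are invented.

```python
from collections import Counter
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# Toy corpus: sentences where "pot" is plausibly used euphemistically.
sentences = [
    "the dealer said he could get me some pot by friday",
    "they were caught smoking pot behind the school",
]

votes = Counter()
for s in sentences:
    # Mask the suspected euphemism and collect the model's candidate fillers.
    masked = s.replace("pot", fill.tokenizer.mask_token, 1)
    for cand in fill(masked, top_k=10):
        votes[cand["token_str"]] += cand["score"]

# Candidate interpretations ranked by aggregated MLM probability; in
# euphemistic usage, drug-related fillers tend to accumulate the most mass.
print(votes.most_common(5))
```

Because the ranking comes from each sentence's context rather than from the word itself, a term like "pot" yields different candidate meanings in cooking forums than in the sentences above, which is exactly the signal a keyword ban list cannot capture.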
  3. Countermeasures that effectively fight ever-increasing online hate speech without blocking freedom of speech are of great social interest. Natural Language Generation (NLG) is uniquely capable of developing scalable solutions. However, off-the-shelf NLG methods are primarily sequence-to-sequence neural models, and they are limited in that they generate commonplace, repetitive, and safe responses regardless of the hate speech (e.g., "Please refrain from using such language.") or irrelevant responses, making them ineffective for de-escalating hateful conversations. In this paper, we design a three-module pipeline approach to effectively improve the diversity and relevance of generated counterspeech. Our proposed pipeline first generates various counterspeech candidates with a generative model to promote diversity, then filters out the ungrammatical ones using a BERT model, and finally selects the most relevant counterspeech response using a novel retrieval-based method. Extensive experiments on three representative datasets demonstrate the efficacy of our approach in generating diverse and relevant counterspeech.
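The three-module structure (generate candidates, prune ungrammatical ones, select the most relevant) can be approximated with off-the-shelf components. In the sketch below, gpt2, a CoLA-finetuned BERT acceptability classifier, the MiniLM sentence encoder, and the prompt are stand-ins chosen for illustration, and the final module uses plain embedding similarity rather than the paper's retrieval-based selection.

```python
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

generator = pipeline("text-generation", model="gpt2")
grammar = pipeline("text-classification", model="textattack/bert-base-uncased-CoLA")
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def counterspeech(hate_speech, n_candidates=8):
    # Module 1: generate diverse candidates (sampling promotes diversity).
    prompt = f"Hateful comment: {hate_speech}\nRespectful response:"
    outs = generator(prompt, max_new_tokens=40, do_sample=True, top_p=0.9,
                     num_return_sequences=n_candidates)
    cands = [o["generated_text"][len(prompt):].strip() for o in outs]

    # Module 2: prune ungrammatical candidates with a BERT acceptability model
    # (LABEL_1 = linguistically acceptable for this CoLA-finetuned classifier).
    cands = [c for c in cands if c and grammar(c)[0]["label"] == "LABEL_1"]

    # Module 3: select the candidate most relevant to the hate speech, here
    # approximated by cosine similarity of sentence embeddings.
    if not cands:
        return None
    sims = util.cos_sim(embedder.encode(hate_speech), embedder.encode(cands))
    return cands[int(sims.argmax())]

print(counterspeech("I can't stand people like you."))
```

Separating generation from filtering and selection is the design point the abstract emphasizes: a sampled generator alone is diverse but unreliable, so the later modules trade some of that diversity back for grammaticality and relevance.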