This content will become publicly available on July 29, 2026

Title: The unappreciated role of intent in algorithmic moderation of abusive content on social media
A significant body of research is dedicated to developing language models that can detect various types of online abuse, such as hate speech and cyberbullying. However, there is a disconnect between platform policies, which often consider the author's intention as a criterion for content moderation, and the current capabilities of detection models, which typically make no attempt to capture intent. This paper examines the role of intent in the moderation of abusive content. Specifically, we review state-of-the-art detection models and benchmark training datasets to assess their ability to capture intent. We propose changes to the design and development of automated detection and moderation systems to improve alignment with ethical and policy conceptualizations of these abuses.
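To make the disconnect concrete, below is a minimal sketch (not from the paper) of how a typical surface-text abuse classifier is applied in automated moderation: it scores the message text alone, with no representation of the author's intent. The model identifier and threshold are illustrative assumptions, not details from this work.

```python
# A minimal sketch of text-only abuse detection. The classifier sees only
# the surface form of the message; nothing here models the author's intent
# (e.g., quoting, counter-speech, reclaimed slurs, satire).
# The model id and threshold are illustrative assumptions.
from transformers import pipeline

classifier = pipeline("text-classification", model="unitary/toxic-bert")

def moderate(text: str, threshold: float = 0.5) -> str:
    """Decide on a post from its surface form only; intent never enters."""
    result = classifier(text)[0]  # e.g. {'label': 'toxic', 'score': 0.97}
    if result["score"] >= threshold:
        return f"remove ({result['label']}, score={result['score']:.2f})"
    return "allow"

# A slur quoted in counter-speech and a sincere slur receive the same
# score here, because no intent signal distinguishes them.
print(moderate("Reporting this: someone just called my friend a terrible slur."))
```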
Award ID(s):
2318460
PAR ID:
10627848
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Harvard Kennedy School Misinformation Review
Date Published:
Journal Name:
Harvard Kennedy School Misinformation Review
ISSN:
2766-1652
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Past work has explored various ways for online platforms to leverage crowd wisdom for misinformation detection and moderation. Yet platforms often relegate governance to their communities, and limited research has been done from the perspective of these communities and their moderators. How is misinformation currently moderated in online communities that are heavily self-governed? What role does the crowd play in this process, and how can this process be improved? In this study, we answer these questions through semi-structured interviews with Reddit moderators, focusing on a case study of COVID-19 misinformation. First, our analysis identifies a general moderation workflow model encompassing the various processes participants use for handling COVID-19 misinformation. Further, we show that the moderation workflow revolves around three elements: content facticity, user intent, and perceived harm. Next, our interviews reveal that Reddit moderators rely on two types of crowd wisdom for misinformation detection. Almost all participants are heavily reliant on reports from crowds of ordinary users to identify potential misinformation. A second crowd, participants' own moderation teams and the expert moderators of other communities, provides support when participants encounter difficult, ambiguous cases. Finally, we use design probes to better understand how different types of crowd signals readily available on Reddit, from ordinary users and from moderators, can assist moderators with identifying misinformation. We observe that nearly half of all participants preferred these cues over labels from expert fact-checkers because the cues can help them discern user intent. Additionally, a quarter of the participants distrust professional fact-checkers, raising important concerns about misinformation moderation.
  2. Shortcomings of current models of moderation have driven policy makers, scholars, and technologists to speculate about alternative models of content moderation. While alternative models provide hope for the future of online spaces, they can fail without proper scaffolding. Community moderators are routinely confronted with similar issues and have found creative ways to navigate these challenges. Learning more about the decisions these moderators make, the challenges they face, and where they succeed can provide valuable insight into how to ensure that alternative moderation models are successful. In this study, I perform a collaborative ethnography with moderators of r/AskHistorians, a community that uses an alternative moderation model, highlighting the importance of accounting for power in moderation. Drawing from Black feminist theory, I call this approach intersectional moderation. I focus on three controversies emblematic of r/AskHistorians' alternative model of moderation: a disagreement over a moderation decision; a collaboration to fight racism on Reddit; and a period of intense turmoil and its impact on policy. Through this evidence, I show how volunteer moderators navigated multiple layers of power through care work. To ensure the successful implementation of intersectional moderation, I argue that designers should support decision-making processes and that policy makers should account for the impact of the sociotechnical systems in which moderators work.
  3. Mainstream platforms’ content moderation systems typically employ generalized “one-size-fits-all” approaches, intended to serve both general and marginalized users. Thus, transgender people must often create their own technologies and moderation systems to meet their specific needs. In our interview study of transgender technology creators (n=115), we found that creators face issues of transphobic abuse and disproportionate content moderation. Trans tech creators address these issues by carefully moderating and vetting their userbases, centering trans contexts in content moderation systems, and employing collective governance and community models. Based on these findings, we argue that trans tech creators’ approaches to moderation offer important insights into how to better design for trans users, and ultimately, marginalized users in the larger platform ecology. We introduce the concept of trans-centered moderation – content moderation that reviews and successfully vets transphobic users, appoints trans moderators to effectively moderate trans contexts, considers the limitations and constraints of technology for addressing social challenges, and employs collective governance and community models. Trans-centered moderation can help to improve platform design for trans users while reducing the harm faced by trans people and marginalized users more broadly. 
  4. How social media platforms can fairly conduct content moderation is gaining attention from society at large. Researchers in HCI and CSCW have investigated whether certain factors affect how users perceive moderation decisions as fair or unfair. However, little attention has been paid to how users' perceptions of (un)fairness form out of their moderation experiences, especially among users who monetize their content. By interviewing 21 for-profit YouTubers (i.e., video content creators), we found three primary ways in which participants assess moderation fairness: equality across their peers, consistency across moderation decisions and policies, and their voice in algorithmic visibility decision-making processes. Building on these findings, we discuss how our participants' fairness perceptions demonstrate a multi-dimensional notion of moderation fairness and how YouTube implements an algorithmic assemblage to moderate YouTubers. We derive translatable design considerations for a fairer moderation system on platforms affording creator monetization.
  5. Much of our modern digital infrastructure relies critically on open-source software. The communities responsible for building this cyberinfrastructure require maintenance and moderation, which is often supported by volunteer efforts. Moderation, as a non-technical form of labor, is a necessary but often overlooked task that maintainers undertake to sustain the community around an OSS project. This study examines the structures and norms that support community moderation, describes the strategies moderators use to mitigate conflicts, and assesses how bots can assist these processes. We interviewed 14 practitioners to uncover existing moderation practices and the ways automation can provide assistance. Our main contributions include a characterization of moderated content in OSS projects and of moderation techniques, as well as perceptions of and recommendations for improving the automation of moderation tasks. We hope that these findings will inform the implementation of more effective moderation practices in open-source communities.