Large Language Model Annotation Bias in Hate Speech Detection

Okpala, Ebuka; Cheng, Long

doi:10.1609/icwsm.v19i1.35879

This content will become publicly available on June 7, 2026

Large language models (LLMs) are fast becoming ubiquitous and have shown impressive performance in various natural language processing (NLP) tasks. Annotating data for downstream applications is a resource-intensive task in NLP. Recently, LLMs have been explored as cost-effective annotators of data used to train other models, or as assistive tools. Yet, little is known regarding the societal implications of using LLMs for data annotation. In this work, focusing on hate speech detection, we investigate how using LLMs such as GPT-4 and Llama-3 for hate speech detection can lead to performance differences across text dialects and to racial bias in online hate detection classifiers. We used LLMs to predict hate speech in seven hate speech datasets and trained classifiers on the LLM annotations of each dataset. Using tweets written in African-American English (AAE) and Standard American English (SAE), we show that classifiers trained on LLM annotations assign tweets written in AAE to negative classes (e.g., hate, offensive, abuse, racism) at a higher rate than tweets written in SAE, and that the classifiers have a higher false positive rate on AAE tweets. We also explore the effect of incorporating dialect priming in the prompting techniques used in prediction, showing that introducing dialect increases the rate at which AAE tweets are assigned to negative classes.
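The two quantities compared in the abstract, the rate at which tweets are assigned to a negative class and the false positive rate, can each be computed per dialect group. The following is a minimal sketch of that comparison, not the authors' evaluation code: the function name `group_rates`, the record layout, and the toy records are all assumptions made up for illustration, and the numbers they produce are not results from the paper.

```python
from collections import defaultdict


def group_rates(records):
    """Compute per-dialect flag rate and false positive rate.

    `records` is an iterable of (dialect, gold_flagged, pred_flagged) tuples,
    where "flagged" means assigned to a negative class such as hate or
    offensive. This layout is a hypothetical one chosen for the sketch.
    """
    counts = defaultdict(lambda: {"n": 0, "pred": 0, "benign": 0, "fp": 0})
    for dialect, gold_flagged, pred_flagged in records:
        c = counts[dialect]
        c["n"] += 1
        c["pred"] += int(pred_flagged)
        if not gold_flagged:
            # Only gold-benign tweets can become false positives.
            c["benign"] += 1
            c["fp"] += int(pred_flagged)

    rates = {}
    for dialect, c in counts.items():
        rates[dialect] = {
            # Share of all tweets in the group that the classifier flags.
            "flag_rate": c["pred"] / c["n"],
            # Share of gold-benign tweets in the group that are flagged.
            "false_positive_rate": (c["fp"] / c["benign"]) if c["benign"] else None,
        }
    return rates


# Invented toy records (dialect, gold_flagged, pred_flagged); not paper data.
toy = [
    ("AAE", False, True),
    ("AAE", False, False),
    ("AAE", True, True),
    ("AAE", False, True),
    ("SAE", False, False),
    ("SAE", False, True),
    ("SAE", True, True),
    ("SAE", False, False),
]

rates = group_rates(toy)
for dialect in sorted(rates):
    print(dialect, rates[dialect])
```

A gap between the groups in either rate is the kind of disparity the abstract describes. The flag rate needs only predictions, while the false positive rate also needs gold labels for the same tweets.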

Award ID(s): 2228616

PAR ID: 10612700

Author(s) / Creator(s): Okpala, Ebuka; Cheng, Long

Publisher / Repository: Proceedings of the International AAAI Conference on Web and Social Media

Date Published: 2025-06-07

Journal Name: Proceedings of the International AAAI Conference on Web and Social Media

Volume: 19

ISSN: 2162-3449

Page Range / eLocation ID: 1389 to 1418

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.1609/icwsm.v19i1.35879
