

This content will become publicly available on May 4, 2026

Title: A Linguistic Strategy to Measure Negative Affective Polarization Through Text Content
Three studies developed and validated a linguistic dictionary to measure negative affective polarization in English and Spanish political texts. The dictionary captures three dimensions: negative affect, delegitimization, and political context. In the first study, two independent judges evaluated the candidate words, and reliability indicators were calculated, showing acceptable values for short texts (.572 in English, .541 in Spanish) and higher values for larger corpora (.964 in English, .957 in Spanish). The second study tested discriminant validity by comparing negative affective polarization scores in social media comments on politics and entertainment. Polarization scores were significantly higher in political content, supporting the dictionary's validity. The third study compared the dictionary to an existing online polarization measure, finding greater coverage and alignment with the construct. Polarization scores were also higher in texts containing hate speech than in texts without it. The findings suggest that the dictionary has strong psychometric properties in both languages, making it a valuable tool for analyzing online content, particularly social media comments. It can be used as an independent measure or as input for machine and deep learning models.
Award ID(s):
2107524
PAR ID:
10587614
Author(s) / Creator(s):
 ;  
Publisher / Repository:
SAGE Publications
Date Published:
Journal Name:
Journal of Language and Social Psychology
ISSN:
0261-927X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The explosion of user-generated content (UGC), e.g. social media posts, comments, and reviews, has motivated the development of NLP applications tailored to these types of informal texts. Prevalent among these applications have been sentiment analysis and machine translation (MT). Grounded in the observation that UGC features highly idiomatic and sentiment-charged language, we propose a decoder-side approach that incorporates automatic sentiment scoring into the MT candidate selection process. We train monolingual sentiment classifiers in English and Spanish, in addition to a multilingual sentiment model, by fine-tuning BERT and XLM-RoBERTa. Using n-best candidates generated by a baseline MT model with beam search, we select the candidate that minimizes the absolute difference between the sentiment score of the source sentence and that of the translation, and perform two human evaluations to assess the produced translations. Unlike previous work, we select this minimally divergent translation by considering the sentiment scores of the source sentence and translation on a continuous interval, rather than using e.g. binary classification, allowing for more fine-grained selection of translation candidates. The results of human evaluations show that, in comparison to the open-source MT baseline model on top of which our sentiment-based pipeline is built, our pipeline produces more accurate translations of colloquial and sentiment-heavy source texts.
  2. Today, Spanish-speaking countries face widespread political crises. These political conflicts are reported in a large volume of Spanish news articles from Spanish agencies. Our goal is to create a fully functioning system that parses real-time Spanish texts and generates scalable event code. Rather than translating Spanish text into English text and using English event coders, we aim to create a tool that uses raw Spanish text and Spanish event coders for better flexibility, coverage, and cost. To accommodate the processing of a large number of Spanish articles, we adapt a distributed framework based on Apache Spark. We highlight how to extend the existing ontology to provide support for the automated coding process for Spanish texts. We also present experimental data to provide insight into the data collection process, including filtering unrelated articles, scaling the framework, and gathering basic statistics on the dataset.
  3. Public sentiment toward the COVID-19 vaccine as expressed on social media can interfere with communication by public health agencies on the importance of getting vaccinated. We investigated Twitter data to understand differences in sentiment, moral values, and language use between political ideologies on the COVID-19 vaccine. We estimated political ideology, conducted a sentiment analysis, and guided by the tenets of moral foundations theory (MFT), we analyzed 262,267 English language tweets from the United States containing COVID-19 vaccine-related keywords between May 2020 and October 2021. We applied the Moral Foundations Dictionary and used topic modeling and Word2Vec to understand moral values and the context of words central to the discussion of the vaccine debate. A quadratic trend showed that extreme ideologies of both Liberals and Conservatives expressed a higher negative sentiment than Moderates, with Conservatives expressing more negative sentiment than Liberals. Compared to Conservative tweets, we found the expression of Liberal tweets to be rooted in a wider set of moral values, associated with moral foundations of care (getting the vaccine for protection), fairness (having access to the vaccine), liberty (related to the vaccine mandate), and authority (trusting the vaccine mandate imposed by the government). Conservative tweets were found to be associated with harm (around safety of the vaccine) and oppression (around the government mandate). Furthermore, political ideology was associated with the expression of different meanings for the same words, e.g. “science” and “death.” Our results inform public health outreach communication strategies to best tailor vaccine information to different groups. 
  4. Candido, Silvio Eduardo Alvarez (Ed.)
    As social media becomes a key channel for news consumption and sharing, proliferating partisan and mainstream news sources must increasingly compete for users’ attention. While affective qualities of news content may promote engagement, it is not clear whether news source bias influences affective content production or virality, or whether any differences have changed over time. We analyzed the sentiment of ~30 million posts (on twitter.com) from 182 U.S. news sources that ranged from extreme left to right bias over the course of a decade (2011–2020). Biased news sources (on both left and right) produced more high arousal negative affective content than balanced sources. High arousal negative content also increased reposting for biased versus balanced sources. The combination of increased prevalence and virality for high arousal negative affective content was not evident for other types of affective content. Over a decade, the virality of high arousal negative affective content also increased, particularly in balanced news sources, and in posts about politics. Together, these findings reveal that high arousal negative affective content may promote the spread of news from biased sources, and conversely imply that sentiment analysis tools might help social media users to counteract these trends.
  5. Prejudice and hate directed toward Asian individuals have increased in prevalence and salience during the COVID-19 pandemic, with notable rises in physical violence. Concurrently, as many governments enacted stay-at-home mandates, the spread of anti-Asian content increased in online spaces, including social media. In the present study, we investigated temporal and geographical patterns in social media content relevant to anti-Asian prejudice during the COVID-19 pandemic. Using the Twitter Data Collection API, we queried over 13 million tweets posted between January 30, 2020, and April 30, 2021, for both negative (e.g., #kungflu) and positive (e.g., #stopAAPIhate) hashtags and keywords related to anti-Asian prejudice. In a series of descriptive analyses, we found differences in the frequency of negative and positive keywords based on geographic location. Using burst detection, we also identified distinct increases in negative and positive content in relation to key political tweets and events. These largely exploratory analyses shed light on the role of social media in the expression and proliferation of prejudice as well as positive responses online.
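
The candidate-selection rule described in item 1 of the list above (from the n-best translations, pick the one whose sentiment score is closest to that of the source sentence) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_min_divergence`, the scorer arguments, and the toy scores are all hypothetical. In the described work the scores would come from fine-tuned BERT or XLM-RoBERTa classifiers; here they are stubbed with a lookup table so the selection step can run on its own.

```python
from typing import Callable, Sequence


def select_min_divergence(
    source: str,
    candidates: Sequence[str],
    score_source: Callable[[str], float],
    score_target: Callable[[str], float],
) -> str:
    """Return the candidate whose sentiment score is closest to the source's.

    Scores are assumed to lie on a continuous interval (for example 0 to 1),
    so candidates are ranked by absolute difference, not by a binary label.
    """
    if not candidates:
        raise ValueError("at least one translation candidate is required")
    source_score = score_source(source)
    return min(candidates, key=lambda c: abs(score_target(c) - source_score))


# Hypothetical stand-in for trained sentiment models: a fixed lookup table.
TOY_SCORES = {
    "¡Qué película tan increíble!": 0.90,
    "What a movie.": 0.50,
    "What an incredible movie!": 0.85,
    "That film was fine.": 0.60,
}


def toy_score(text: str) -> float:
    return TOY_SCORES[text]


best = select_min_divergence(
    "¡Qué película tan increíble!",
    ["What a movie.", "What an incredible movie!", "That film was fine."],
    score_source=toy_score,
    score_target=toy_score,
)
print(best)  # What an incredible movie!
```

Passing separate source and target scorers leaves room for either two monolingual classifiers or one multilingual model used for both sides, the two setups the abstract mentions.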