This content will become publicly available on April 18, 2026

Title: The Insight-Inference Loop: Efficient Text Classification via Natural Language Inference and Threshold-Tuning
Modern computational text classification methods have brought social scientists tantalizingly close to the goal of unlocking vast insights buried in text data—from centuries of historical documents to streams of social media posts. Yet three barriers still stand in the way: the tedious labor of manual text annotation, the technical complexity that keeps these tools out of reach for many researchers, and, perhaps most critically, the challenge of bridging the gap between sophisticated algorithms and the deep theoretical understanding social scientists have already developed about human interactions, social structures, and institutions. To overcome these limitations, we propose an approach to large-scale text analysis that requires substantially less human-labeled data, demands no machine learning expertise, and efficiently integrates the social scientist into critical steps of the workflow. This approach, which allows the detection of statements in text, relies on large language models pre-trained for natural language inference and a "few-shot" threshold-tuning algorithm rooted in active learning principles. We describe and showcase our approach by analyzing tweets collected during the 2020 U.S. presidential election campaign, and benchmark it against various computational approaches across three datasets.
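The few-shot threshold-tuning idea can be illustrated with a minimal sketch (this is not the authors' implementation): given entailment probabilities that an NLI model would assign to (text, hypothesis) pairs, choose the decision cutoff that maximizes F1 on a small set of human-labeled examples. The scores and labels below are invented for illustration.

```python
# Hypothetical sketch of few-shot threshold tuning: the scores stand in
# for entailment probabilities from a pre-trained NLI model, and the
# labels are a handful of human judgments of whether the statement of
# interest is actually present in each text.

def f1(tp, fp, fn):
    """F1 score from counts; 0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def tune_threshold(scores, labels):
    """Return the candidate cutoff with the best F1 on the labeled set."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        score = f1(tp, fp, fn)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t

# A few labeled (entailment probability, statement present?) pairs:
scores = [0.95, 0.80, 0.62, 0.40, 0.33, 0.10]
labels = [True, True, True, False, False, False]
print(tune_threshold(scores, labels))  # 0.62 separates the classes cleanly
```

In practice the labeled examples would be selected iteratively, in the spirit of active learning, rather than fixed up front.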
Award ID(s):
2243822
PAR ID:
10646787
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
Sage
Date Published:
Journal Name:
Sociological Methods & Research
ISSN:
0049-1241
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Qualitative coding, or content analysis, is more than just labeling text: it is a reflexive interpretive practice that shapes research questions, refines theoretical insights, and illuminates subtle social dynamics. As large language models (LLMs) become increasingly adept at nuanced language tasks, questions arise about whether—and how—they can assist in large-scale coding without eroding the interpretive depth that distinguishes qualitative analysis from traditional machine learning and other quantitative approaches to natural language processing. In this paper, we present a hybrid approach that preserves hermeneutic value while incorporating LLMs to scale the application of codes to large data sets that are impractical for manual coding. Our workflow retains the traditional cycle of codebook development and refinement, adding an iterative step to adapt definitions for machine comprehension, before ultimately replacing manual with automated text categorization. We demonstrate how to rewrite code descriptions for LLM-interpretation, as well as how structured prompts and prompting the model to explain its coding decisions (chain-of-thought) can substantially improve fidelity. Empirically, our case study of socio-historical codes highlights the promise of frontier AI language models to reliably interpret paragraph-long passages representative of a humanistic study. Throughout, we emphasize ethical and practical considerations, preserving space for critical reflection, and the ongoing need for human researchers’ interpretive leadership. These strategies can guide both traditional and computational scholars aiming to harness automation effectively and responsibly—maintaining the creative, reflexive rigor of qualitative coding while capitalizing on the efficiency afforded by LLMs. 
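The structured-prompt idea described above can be sketched as follows; the codebook entries, field names, and wording here are illustrative assumptions, not taken from the paper. The prompt renders machine-adapted code definitions and asks the model to explain its reasoning (chain-of-thought) before committing to a single code.

```python
# Hypothetical sketch: format a machine-adapted codebook and a passage
# into a structured coding prompt that elicits an explanation before
# the final label. Codebook contents are invented for illustration.

def build_coding_prompt(codebook, passage):
    """Render codebook definitions and a passage into one prompt string."""
    lines = ["You are applying a qualitative codebook to a passage.",
             "Codes:"]
    for code, definition in codebook.items():
        lines.append(f"- {code}: {definition}")
    lines += [
        "Passage:",
        passage,
        "First explain, step by step, which definitions the passage",
        "does or does not satisfy; then answer with exactly one code.",
    ]
    return "\n".join(lines)

codebook = {
    "SOLIDARITY": "Expressions of mutual aid or collective identity.",
    "GRIEVANCE": "Complaints about unfair treatment by institutions.",
}
prompt = build_coding_prompt(codebook, "They stood together at the gates.")
print(prompt)
```

The same prompt template can be re-run as the codebook is refined, preserving the iterative cycle the abstract describes.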
  2. Upon encountering this publication, one might ask the obvious question, "Why do we need another deep learning and natural language processing book?" Several excellent ones have been published, covering both theoretical and practical aspects of deep learning and its application to language processing. However, from our experience teaching courses on natural language processing, we argue that, despite their excellent quality, most of these books do not target their most likely readers. The intended reader of this book is one who is skilled in a domain other than machine learning and natural language processing and whose work relies, at least partially, on the automated analysis of large amounts of data, especially textual data. Such experts may include social scientists, political scientists, biomedical scientists, and even computer scientists and computational linguists with limited exposure to machine learning. Existing deep learning and natural language processing books generally fall into two camps. The first camp focuses on the theoretical foundations of deep learning. This is certainly useful to the aforementioned readers, as one should understand the theoretical aspects of a tool before using it. However, these books tend to assume the typical background of a machine learning researcher and, as a consequence, we have often seen students who do not have this background rapidly get lost in such material. To mitigate this issue, the second type of book that exists today focuses on the machine learning practitioner; that is, on how to use deep learning software, with minimal attention paid to the theoretical aspects. We argue that focusing on practical aspects is similarly necessary but not sufficient. Considering that deep learning frameworks and libraries have gotten fairly complex, the chance of misusing them due to theoretical misunderstandings is high. We have commonly seen this problem in our courses, too. 
This book, therefore, aims to bridge the theoretical and practical aspects of deep learning for natural language processing. We cover the necessary theoretical background and assume minimal machine learning background from the reader. Our aim is that anyone who took introductory linear algebra and calculus courses will be able to follow the theoretical material. To address practical aspects, this book includes pseudo code for the simpler algorithms discussed and actual Python code for the more complicated architectures. The code should be understandable by anyone who has taken a Python programming course. After reading this book, we expect that the reader will have the necessary foundation to immediately begin building real-world, practical natural language processing systems, and to expand their knowledge by reading research publications on these topics. https://doi.org/10.1017/9781009026222 
  3. In the past decade, a number of sophisticated AI-powered systems and tools have been developed and released to the scientific community and the public. These technical developments have occurred against a backdrop of political and social upheaval that is both magnifying and magnified by public health and macroeconomic crises. These technical and socio-political changes offer multiple lenses to contextualize (or distort) scientific reflexivity. Further, to computational social scientists who study computer-mediated human behavior, they have implications on what we study and how we study it. How should the ICWSM community engage with this changing world? Which disruptions should we embrace, and which ones should we resist? Whom do we ally with, and for what purpose? In this workshop co-located with ICWSM, we invited experience-based perspectives on these questions with the intent of drafting a collective research agenda for the computational social science community. We did so via the facilitation of collaborative position papers and the discussion of imminent challenges we face in the context of, for example, proprietary large language models, an increasingly unwieldy peer review process, and growing issues in data collection and access. This document presents a summary of the contributions and discussions in the workshop. 
  4. Recently, there have been significant advances and wide-scale use of generative AI in natural language generation. Models such as OpenAI’s GPT3 and Meta’s LLaMA are widely used in chatbots, to summarize documents, and to generate creative content. These advances raise concerns about abuses of these models, especially in social media settings, such as large-scale generation of disinformation, manipulation campaigns that use AI-generated content, and personalized scams. We used stylometry (the analysis of style in natural language text) to analyze the style of AI-generated text. Specifically, we applied an existing authorship verification (AV) model that can predict if two documents are written by the same author on texts generated by GPT2, GPT3, ChatGPT and LLaMA. Our AV model was trained only on human-written text and was effectively used in social media settings to analyze cases of abuse. We generated texts by providing the language models with fanfiction snippets and prompting them to complete the rest of it in the same writing style as the original snippet. We then applied the AV model across the texts generated by the language models and the human written texts to analyze the similarity of the writing styles between these texts. We found that texts generated with GPT2 had the highest similarity to the human texts. Texts generated by GPT3 and ChatGPT were very different from the human snippet, and were similar to each other. LLaMA-generated texts had some similarity to the original snippet but also had similarities with other LLaMA-generated texts and with texts from other models. We then conducted a feature analysis to identify the features that drive these similarity scores. This analysis helped us answer questions such as which features distinguish the language style of language models and humans, which features are different across different models, and how these linguistic features change over different language model versions. 
The dataset and the source code used in this analysis have been made public to allow for further analysis of new language models. 
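The stylometric core of this kind of analysis can be sketched in miniature. The paper uses a trained authorship-verification model; the toy version below, with invented feature choices and texts, only illustrates the underlying idea of representing each text as a vector of style features and comparing the vectors.

```python
# Illustrative sketch only (not the paper's AV model): represent each
# text by two crude style features -- average word length and the rate
# of common function words -- and compare texts by cosine similarity.

import math

FUNCTION_WORDS = {"the", "of", "and", "a", "to", "in", "is", "it"}

def style_vector(text):
    """Return (average word length, function-word rate) for a text."""
    words = text.lower().split()
    avg_len = sum(len(w) for w in words) / len(words)
    fw_rate = sum(1 for w in words if w in FUNCTION_WORDS) / len(words)
    return (avg_len, fw_rate)

def cosine(u, v):
    """Cosine similarity between two 2-D feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

a = style_vector("the cat sat on the mat and it purred")
b = style_vector("a dog ran to the park in the morning")
print(cosine(a, b))  # near 1.0: two similarly plain styles
```

Real stylometric systems use far richer features (character n-grams, syntax, punctuation habits), but the comparison machinery is analogous.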
  5. Abstract Large language models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the computational social science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools. Towards this end, we contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 25 representative English CSS benchmarks. On taxonomic labeling tasks (classification), LLMs fail to outperform the best fine-tuned models but still achieve fair levels of agreement with humans. On free-form coding tasks (generation), LLMs produce explanations that often exceed the quality of crowdworkers’ gold references. We conclude that the performance of today’s LLMs can augment the CSS research pipeline in two ways: (1) serving as zero-shot data annotators on human annotation teams, and (2) bootstrapping challenging creative generation tasks (e.g., explaining the underlying attributes of a text). In summary, LLMs are poised to meaningfully participate in social science analysis in partnership with humans. 
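One common way to quantify "agreement with humans" when an LLM joins an annotation team is Cohen's kappa between the model's labels and a human coder's labels. A minimal sketch, with invented labels (this is not the paper's evaluation code):

```python
# Hedged sketch: chance-corrected agreement between two annotators,
# here a human coder and an LLM acting as a zero-shot annotator.
# The label sequences are invented for illustration.

def cohens_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    categories = set(a) | set(b)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    expected = sum(
        (a.count(c) / n) * (b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

human = ["pos", "pos", "neg", "neg", "pos", "neg"]
llm = ["pos", "pos", "neg", "pos", "pos", "neg"]
print(round(cohens_kappa(human, llm), 3))  # 0.667
```

Kappa corrects raw percent agreement for the agreement expected by chance, which matters when label distributions are skewed.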