"My Very Subjective Human Interpretation": Domain Expert Perspectives on Navigating the Text Analysis Loop for Topic Models

Schofield, Alexandra; Wu, Siqi; Bayard_de_Volo, Theo; Kuze, Tatsuki; Gomez, Alfredo; Sultana, Sharifa

doi:10.1145/3701201

Citation Details

"My Very Subjective Human Interpretation": Domain Expert Perspectives on Navigating the Text Analysis Loop for Topic Models

Practitioners dealing with large text collections frequently use topic models such as Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) in their projects to explore trends. Despite twenty years of accrued advancement in natural language processing tools, these models are found to be slow and challenging to apply to text exploration projects. In our work, we engaged with practitioners (n=15) who use topic modeling to explore trends in large text collections to understand their project workflows and investigate which factors often slow down the processes and how they deal with such errors and interruptions in automated topic modeling. Our findings show that practitioners are required to diagnose and resolve context-specific problems with preparing data and models and need control for these steps, especially for data cleaning and parameter selection. Our major findings resonate with existing work across CSCW, computational social science, machine learning, data science, and digital humanities. They also leave us questioning whether automation is actually a useful goal for tools designed for topic models and text exploration. more »

Award ID(s):: 2243941

PAR ID:: 10583990

Author(s) / Creator(s):: Schofield, Alexandra; Wu, Siqi; Bayard_de_Volo, Theo; Kuze, Tatsuki; Gomez, Alfredo; Sultana, Sharifa

Publisher / Repository:: Association of Computing Machinery

Date Published:: 2025-01-10

Journal Name:: Proceedings of the ACM on Human-Computer Interaction - GROUP

Volume:: 9

Issue:: 1

ISSN:: 2573-0142

Page Range / eLocation ID:: 1 to 30

Subject(s) / Keyword(s):: topic models cultural analytics digital humanities computational social science text pre-processing

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Journal Article:
https://doi.org/10.1145/3701201

More Like this