Search for: All records

Award ID contains: 2243941

Note: Clicking a Digital Object Identifier (DOI) number takes you to an external site maintained by the publisher. Some full-text articles may not be freely available until the publisher's embargo (administrative interval) ends.

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Practitioners working with large text collections frequently use topic models such as Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) to explore trends. Despite twenty years of advances in natural language processing tools, practitioners still find these models slow and difficult to apply to text exploration projects. We engaged with practitioners (n=15) who use topic modeling to explore trends in large text collections, aiming to understand their project workflows, identify the factors that most often slow the process down, and learn how they handle errors and interruptions in automated topic modeling. Our findings show that practitioners must diagnose and resolve context-specific problems when preparing data and models, and need control over these steps, especially data cleaning and parameter selection. Our major findings resonate with existing work across CSCW, computational social science, machine learning, data science, and digital humanities. They also leave us questioning whether automation is actually a useful goal for tools designed for topic modeling and text exploration.
    Free, publicly-accessible full text available January 10, 2026
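    The workflow this study examines can be made concrete with a minimal sketch, assuming scikit-learn as the tooling; the corpus, stop-word settings, and topic count below are illustrative assumptions rather than values from the study.

    ```python
    # A minimal sketch of the LDA/NMF exploration workflow described above.
    # The documents, stop-word list, and n_topics are invented for illustration.
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
    from sklearn.decomposition import LatentDirichletAllocation, NMF

    docs = [
        "topic models help explore trends in large text collections",
        "practitioners clean text data before fitting a model",
        "parameter selection strongly affects topic quality",
    ]

    # Data cleaning / vectorization: the step practitioners reported needing
    # fine-grained control over (stop words, frequency cutoffs).
    count_vec = CountVectorizer(stop_words="english", min_df=1, max_df=0.95)
    X_counts = count_vec.fit_transform(docs)

    tfidf_vec = TfidfVectorizer(stop_words="english", min_df=1, max_df=0.95)
    X_tfidf = tfidf_vec.fit_transform(docs)

    # Parameter selection: the number of topics is the key knob; this value
    # is an arbitrary illustrative choice.
    n_topics = 2
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(X_counts)
    nmf = NMF(n_components=n_topics, init="nndsvd", random_state=0).fit(X_tfidf)

    # Inspect the top words per topic to judge whether the model is usable.
    terms = count_vec.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top = [terms[i] for i in weights.argsort()[::-1][:5]]
        print(f"LDA topic {k}: {top}")
    ```

    Exposing the vectorizer cutoffs and the topic count directly, rather than hiding them behind automation, mirrors the control over data cleaning and parameter selection that the participants asked for.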
  2. Free, publicly-accessible full text available December 4, 2025
  3. Abductive reasoning is ubiquitous in artificial intelligence and everyday thinking. However, formal theories that provide probabilistic guarantees for abductive inference are lacking. We present a quantitative formalization of abductive logic that combines Bayesian probability with the interpretation of abduction as a search process within the Algorithmic Search Framework (ASF). By incorporating uncertainty in background knowledge, we establish two novel sets of probabilistic bounds on the success of abduction when (1) selecting the single most likely cause while assuming noiseless observations, and (2) selecting any cause above some probability threshold while accounting for noisy observations. To our knowledge, no existing abductive or general inference bounds account for noisy observations. Furthermore, while most existing abductive frameworks assume exact underlying prior and likelihood distributions, we assume only percentile-based confidence intervals for such values; these milder assumptions make the framework more flexible and more widely applicable. We also explore additional information-theoretic results from the ASF and provide mathematical justifications for everyday abductive intuitions.
    Free, publicly-accessible full text available November 20, 2025
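    As a toy illustration of the setting in item 3 (not the paper's bounds), the selection rule for case (1), picking the single most likely cause under noiseless observations, can be sketched as follows. The candidate causes, priors, and likelihoods are invented for illustration; the paper itself assumes only percentile-based confidence intervals for such values, not exact distributions.

    ```python
    # Bayesian abduction, case (1): select the maximum-a-posteriori cause of an
    # observation, assuming noiseless observations. All numbers are invented.
    priors = {"flu": 0.2, "cold": 0.5, "allergy": 0.3}        # P(cause)
    likelihoods = {"flu": 0.9, "cold": 0.4, "allergy": 0.2}   # P(obs | cause)

    # Unnormalized posterior: P(cause | obs) is proportional to
    # P(obs | cause) * P(cause), by Bayes' rule.
    unnormalized = {c: likelihoods[c] * priors[c] for c in priors}
    total = sum(unnormalized.values())
    posterior = {c: v / total for c, v in unnormalized.items()}

    # The abduced cause: the most likely explanation of the observation.
    best = max(posterior, key=posterior.get)
    print(posterior)  # e.g. {'flu': 0.409, 'cold': 0.455, 'allergy': 0.136}
    print(best)       # 'cold'
    ```

    The paper's contribution is to bound how often this kind of selection succeeds when the priors and likelihoods are themselves uncertain and, in case (2), when the observations are noisy.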
  4. Ana Paula Rocha, Luc Steels (Eds.)
    Many machine learning tasks have a measure of success that is naturally continuous, such as error under a loss function. We generalize the Algorithmic Search Framework (ASF), used for modeling machine learning domains as discrete search problems, to the continuous space. Moving from discrete target sets to a continuous measure of success extends the applicability of the ASF by allowing us to model fundamentally continuous notions like fuzzy membership. We generalize many results from the discrete ASF to the continuous space and prove novel results for a continuous measure of success. Additionally, we derive an upper bound for the expected performance of a search algorithm under arbitrary levels of quantization in the success measure, demonstrating a negative relationship between quantization and the performance upper bound. These results improve the fidelity of the ASF as a framework for modeling a range of machine learning and artificial intelligence tasks. 
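    The quantization result in item 4 concerns how coarsening a continuous success measure affects the achievable expected performance. The sketch below, with an assumed floor-based quantizer and uniformly random success values, is not the paper's construction; it only illustrates the direction of the effect: coarser quantization lowers the measured expected success, while finer quantization recovers the continuous value.

    ```python
    # Toy demonstration: quantizing a continuous success measure in [0, 1]
    # onto k discrete levels can only reduce the measured expected success.
    # The quantizer and the random "success values" are illustrative assumptions.
    import random

    def quantize(s: float, levels: int) -> float:
        """Map a success value in [0, 1] down to the nearest of `levels` steps."""
        step = 1.0 / levels
        return 1.0 if s >= 1.0 else int(s / step) * step

    random.seed(0)
    successes = [random.random() for _ in range(10_000)]  # continuous successes

    for levels in (2, 4, 16, 256):
        quantized = [quantize(s, levels) for s in successes]
        print(levels, sum(quantized) / len(quantized))
    # The mean quantized success rises toward the continuous mean (~0.5) as the
    # number of levels grows: less quantization, higher achievable expectation.
    ```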