skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Data analysis and modeling pipelines for controlled networked social science experiments
There is large interest in networked social science experiments for understanding human behavior at-scale. Significant effort is required to perform data analytics on experimental outputs and for computational modeling of custom experiments. Moreover, experiments and modeling are often performed in a cycle, enabling iterative experimental refinement and data modeling to uncover interesting insights and to generate/refute hypotheses about social behaviors. The current practice for social analysts is to develop tailor-made computer programs and analytical scripts for experiments and modeling. This often leads to inefficiencies and duplication of effort. In this work, we propose a pipeline framework to take a significant step towards overcoming these challenges. Our contribution is to describe the design and implementation of a software system to automate many of the steps involved in analyzing social science experimental data, building models to capture the behavior of human subjects, and providing data to test hypotheses. The proposed pipeline framework consists of formal models, formal algorithms, and theoretical models as the basis for the design and implementation. We propose a formal data model, such that if an experiment can be described in terms of this model, then our pipeline software can be used to analyze data efficiently. The merits of the proposed pipeline framework is elaborated by several case studies of networked social science experiments.  more » « less
Award ID(s):
1916670
PAR ID:
10203865
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
PloS one
Volume:
15
ISSN:
1932-6203
Page Range / eLocation ID:
1-58
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Agent-based models (ABMs) are used to simulate human-subject experiments. A comprehensive understanding of these human systems often requires executing large numbers of simulations, but these requirements are constrained by computational and other resources. In this work, we build a framework of digital twins for modeling human-subject experiments. The framework has three modules: ABMs of player behaviors built from game data; extensions of these models to represent virtual assistants (agents that are exogenously manipulated to create controlled environments for human agents); and an uncertainty quantification module composed of functional ANOVA and a Gaussian process-based emulator. The emulator is built from the extended ABM; we focus on emulator validation. By incorporating experimental data and agent-based simulation data, our proposed framework enhances the virtual representation of the dynamics in human-subject word formation experiments, which we consider a digital twin. Networked anagram experiments are used as an exemplar to demonstrate the methods. 
    more » « less
  2. In a networked anagram game, players are provided letters with possible actions of requesting letters from their neighbours, replying to letter requests, or forming words. The objective is to form as many words as possible as a team. The experimental data show that behaviours among players can vary significantly. However, simulations using agent-based models (ABM) in the literature often have not incorporated proper uncertainty quantification methods to characterise diverse behaviours of players. In this work, we propose an uncertainty quantification framework to build, exercise, and evaluate agent behaviour models and simulations for networked group anagram games. Specifically, using the data of game experiments, the proposed framework considers the clustering of game players based on their performance to reflect players’ heterogeneity. Moreover, we also quantify uncertainty within each cluster through statistical modelling and inference. Numerical studies of networked game configurations are conducted to demonstrate the merits of the proposed framework. 
    more » « less
  3. null (Ed.)
    In anagram games, players are provided with letters for forming as many words as possible over a specified time duration. Anagram games have been used in controlled experiments to study problems such as collective identity, effects of goal setting, internal-external attributions, test anxiety, and others. The majority of work on anagram games involves individual players. Recently, work has expanded to group anagram games where players cooperate by sharing letters. In this work, we analyze experimental data from online social networked experiments of group anagram games. We develop mechanistic and data driven models of human decision-making to predict detailed game player actions (e.g., what word to form next). With these results, we develop a composite agent-based modeling and simulation platform that incorporates the models from data analysis. We compare model predictions against experimental data, which enables us to provide explanations of human decision-making and behavior. Finally, we provide illustrative case studies using agent-based simulations to demonstrate the efficacy of models to provide insights that are beyond those from experiments alone. 
    more » « less
  4. The computational modeling of groups requires models that connect micro-level with macro-level processes and outcomes. Recent research in computational social science has started from simple models of human behavior, and attempted to link to social structures. However, these models make simplifying assumptions about human understanding of culture that are often not realistic and may be limiting in their generality. In this paper, we present work on Bayesian affect control theory as a more comprehensive, yet highly parsimonious model that integrates artificial intelligence, social psychology, and emotions into a single predictive model of human activities in groups. We illustrate these developments with examples from an ongoing research project aimed at computational analysis of virtual software development teams. 
    more » « less
  5. Cells interact as dynamically evolving ecosystems. While recent single-cell and spatial multi-omics technologies quantify individual cell characteristics, predicting their evolution requires mathematical modeling. We propose a conceptual framework—a cell behavior hypothesis grammar—that uses natural language statements (cell rules) to create mathematical models. This enables systematic integration of biological knowledge and multi-omics data to generate in silico models, enabling virtual “thought experiments” that test and expand our understanding of multicellular systems and generate new testable hypotheses. This paper motivates and describes the grammar, offers a reference implementation, and demonstrates its use in developing both de novo mechanistic models and those informed by multi-omics data. We show its potential through examples in cancer and its broader applicability in simulating brain development. This approach bridges biological, clinical, and systems biology research for mathematical modeling at scale, allowing the community to predict emergent multicellular behavior. 
    more » « less