

Search for: All records

Creators/Authors contains: "Deng, Xinwei"


  1. Agent-based models (ABMs) are used to simulate human-subject experiments. A comprehensive understanding of these human systems often requires executing large numbers of simulations, but these requirements are constrained by computational and other resources. In this work, we build a framework of digital twins for modeling human-subject experiments. The framework has three modules: ABMs of player behaviors built from game data; extensions of these models to represent virtual assistants (agents that are exogenously manipulated to create controlled environments for human agents); and an uncertainty quantification module composed of functional ANOVA and a Gaussian process-based emulator. The emulator is built from the extended ABM; we focus on emulator validation. By incorporating experimental data and agent-based simulation data, our proposed framework enhances the virtual representation of the dynamics in human-subject word formation experiments, which we consider a digital twin. Networked anagram experiments are used as an exemplar to demonstrate the methods. 
    Free, publicly-accessible full text available December 18, 2025
  2. Common knowledge (CK) is a phenomenon where a group of individuals each knows some collection of information, and, in essence, everyone knows that everyone knows the information. There are many applications involving CK, including business decision making, protests and rebellions, and online advertising. CK can lead to contagion and collective action but in ways that are fundamentally different from classic (e.g., Granovetter) threshold models used in the social sciences. Researchers developed CK models to enable the computation of contagion in networked populations. But these models have largely not been investigated using experiments with human subjects. In this work, we conduct a successive analysis of online CK experiments. We devise a flexible and interpretable statistical method to investigate the effects of significant factors, such as network structure and communication type. Among our findings, we demonstrate a phase change in group payout in the games that is caused by prohibiting player communication. 
    Free, publicly-accessible full text available September 3, 2025
  3. In a networked anagram game, players are given letters and can request letters from their neighbours, reply to letter requests, or form words. The objective is to form as many words as possible as a team. The experimental data show that behaviours among players can vary significantly. However, simulations using agent-based models (ABMs) in the literature have often not incorporated proper uncertainty quantification methods to characterise the diverse behaviours of players. In this work, we propose an uncertainty quantification framework to build, exercise, and evaluate agent behaviour models and simulations for networked group anagram games. Specifically, using data from game experiments, the proposed framework clusters game players based on their performance to reflect players’ heterogeneity. We also quantify uncertainty within each cluster through statistical modelling and inference. Numerical studies of networked game configurations are conducted to demonstrate the merits of the proposed framework.
  4. Corlu, C. G.; Hunter, S. R.; Lam, H.; Onggo, B. S.; Shortle, J.; Biller, B. (Ed.)
    Experiments that are games played among a network of players are widely used to study human behavior. Furthermore, bots or intelligent systems can be used in these games to produce contexts that elicit particular types of human responses. Bot behaviors could be specified solely based on experimental data. In this work, we take a different perspective, called the Probability Calibration (PC) approach, to simulate networked group anagram games with certain players having bot-like behaviors. The proposed method starts with data-driven models and calibrates in principled ways the parameters that alter player behaviors. It can alter the performance of each type of agent (e.g., bot) in group anagram games. Further, statistical methods are used to test whether the PC models produce results that are statistically different from those of the original models. Case studies demonstrate the merits of the proposed method. 
  5. Common knowledge (CK) is a phenomenon where each individual within a group knows the same information and everyone knows that everyone knows the information, infinitely recursively. CK spreads information as a contagion through social networks in ways that differ from other models, such as the susceptible-infectious-recovered (SIR) model. In a model of CK on Facebook, the biclique serves as the characterizing graph substructure for generating CK, as all nodes within a biclique share CK through their walls. To understand the effects of network structure on CK-based contagion, it is necessary to control the numbers and sizes of bicliques in networks. Thus, learning how to generate these CK networks (CKNs) is important. Consequently, we develop an exponential random graph model (ERGM) that constructs networks while controlling for bicliques. Our method offers powerful prediction and inference, significantly reduces computational costs, and demonstrates its merit for contagion dynamics in numerical experiments.
  6. Health care–associated infections due to multidrug-resistant organisms (MDROs), such as methicillin-resistant Staphylococcus aureus (MRSA) and Clostridioides difficile (CDI), place a significant burden on our health care infrastructure. Screening for MDROs is an important mechanism for preventing spread but is resource intensive. The objective of this study was to develop automated tools that can predict colonization or infection risk using electronic health record (EHR) data, provide useful information to aid infection control, and guide empiric antibiotic coverage. We retrospectively developed a machine learning model to detect MRSA colonization and infection in undifferentiated hospitalized patients at the time of sample collection at the University of Virginia Hospital. We used clinical and nonclinical features derived from on-admission and throughout-stay information from the patient’s EHR data to build the model. In addition, we used a class of features derived from contact networks in EHR data; these network features can capture patients’ contacts with providers and other patients, improving model interpretability and accuracy for predicting the outcome of surveillance tests for MRSA. Finally, we explored heterogeneous models for different patient subpopulations (for example, those admitted to an intensive care unit or emergency department, or those with specific testing histories), which perform better. We found that penalized logistic regression performs better than other methods, and this model’s performance, measured by the area under the receiver operating characteristic curve (ROC-AUC), improves by nearly 11% when we use a polynomial (second-degree) transformation of the features. Significant features in predicting MDRO risk include antibiotic use, surgery, use of devices, dialysis, the patient’s comorbidity conditions, and network features. Among these, network features add the most value and improve the model’s performance by at least 15%. The penalized logistic regression model with the same transformation of features also performs better than other models for specific patient subpopulations. Our study shows that MRSA risk prediction can be conducted quite effectively by machine learning methods using clinical and nonclinical features derived from EHR data. Network features are the most predictive and provide significant improvement over prior methods. Furthermore, heterogeneous prediction models for different patient subpopulations enhance the model’s performance.
  7. When modeling human behavior in multi-player games, it is important to understand heterogeneous aspects of player behaviors. By leveraging experimental data and agent-based simulations, various data-driven modeling methods can be applied. This provides a great opportunity to quantify and visualize the uncertainty associated with these methods, allowing for a more comprehensive understanding of the individual and collective behaviors among players. For networked anagram games, player behaviors can be heterogeneous in terms of the number of words formed and the amount of cooperation among networked neighbors. Based on game data, these games can be modeled as discrete dynamical systems characterized by probabilistic state transitions. In this work, we present both Frequentist and Bayesian approaches for visualizing uncertainty in networked anagram games. These approaches help to elaborate how players individually and collectively form words by sharing letters with their neighbors in a network. Both approaches provide valuable insights into inferring the worst, average, and best player performance within and between behavioral clusters. Moreover, interesting contrasts between the Frequentist and Bayesian approaches can be observed. The knowledge and inferences gained from these approaches are incorporated into an agent-based simulation framework to further demonstrate model uncertainty and players’ heterogeneous behaviors. 
  8. Systems with both quantitative and qualitative responses are encountered in many applications. Design-of-experiments methods are needed when experiments are conducted to study such systems. Classic experimental design methods are unsuitable here because they often focus on one type of response. In this paper, we develop a Bayesian D-optimal design method for experiments with one continuous and one binary response. Both noninformative and conjugate informative prior distributions on the unknown parameters are considered. The proposed design criterion has meaningful interpretations regarding the D-optimality for the models for both types of responses. An efficient point-exchange search algorithm is developed to construct the local D-optimal designs for given parameter values. Global D-optimal designs are obtained by accumulating the frequencies of the design points in local D-optimal designs, where the parameters are sampled from the prior distributions. The performance of the proposed methods is evaluated through two examples.
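
Record 5 above names the biclique as the graph substructure that generates common knowledge. As a rough illustration of that structure only (not the ERGM from that record), the Python sketch below checks whether two node sets form a biclique in an undirected graph, meaning every node on one side is adjacent to every node on the other. The function name `is_biclique`, the adjacency-set representation, and the toy graph are all assumptions made for this example. The sketch checks only the cross edges and does not require either side to be free of internal edges.

```python
def is_biclique(adj, left, right):
    """Return True if every node in `left` is adjacent to every node in `right`.

    `adj` maps each node to the set of its neighbors (undirected graph).
    Hypothetical helper for illustration; not taken from the listed papers.
    """
    left, right = set(left), set(right)
    # Both sides must be non-empty and must not share nodes.
    if not left or not right or left & right:
        return False
    # Every cross pair (u, v) must be an edge.
    return all(v in adj.get(u, set()) for u in left for v in right)


# Toy graph (assumed data): a and b are each linked to x and y; c only to x.
adj = {
    "a": {"x", "y"},
    "b": {"x", "y"},
    "c": {"x"},
    "x": {"a", "b", "c"},
    "y": {"a", "b"},
}

full = is_biclique(adj, {"a", "b"}, {"x", "y"})     # all four cross edges exist
partial = is_biclique(adj, {"a", "c"}, {"x", "y"})  # edge c-y is missing
```

Here `full` evaluates to `True` and `partial` to `False`, because `c` is not adjacent to `y`. Counting such subgraphs over all candidate node sets is combinatorially expensive for large networks, which is one reason a generative model that controls biclique counts directly could be attractive.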