An Infectious Disease Spread Simulation to Control Data Bias

Kong, Ruochen; Anderson, Taylor; Heslop, David; Zufle, Andreas

doi:10.1145/3678717.3691293

The increased availability of datasets during the COVID-19 pandemic enabled machine-learning approaches for modeling and forecasting infectious diseases. However, such approaches are known to amplify the bias in the data they are trained on. Bias in input data such as clinical case data for COVID-19 is difficult to measure due to disparities in testing availability, reporting standards, and healthcare access among different populations and regions. Furthermore, the way such biases may propagate through the modeling pipeline to decision-making is relatively unknown. Therefore, we present a system that leverages a highly detailed agent-based model (ABM) of infectious disease spread in a city to simulate the collection of biased clinical case data where the bias is known. Our system allows users to load a pre-selected region or select their own (using OpenStreetMap data for the environment and census data for the population), to specify population and infectious disease parameters, and to set the degree(s) to which different populations will be overrepresented or underrepresented in the case data. In addition to the system, we provide a large number of benchmark datasets that produce case data at different levels of bias for different regions. We hope that infectious disease modelers will use these datasets to investigate how robust their models are to data bias and whether their models are overfit to biased data.
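The idea of case data with a known bias can be illustrated with a minimal Python sketch. It is not the authors' ABM or their system: the group labels, the reporting probabilities, and the helper names `observe_cases` and `representation_ratio` are all hypothetical, chosen only to show how a ground-truth set of infections can be thinned per group so that the resulting over- or underrepresentation is measurable.

```python
import random


def observe_cases(true_cases, report_prob, seed=0):
    """Keep each true case with its group's reporting probability.

    `report_prob` maps a group label to a probability in [0, 1].
    Both the mapping and the group labels are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [c for c in true_cases if rng.random() < report_prob[c["group"]]]


def representation_ratio(true_cases, observed, group):
    """Group's share among observed cases divided by its share among true cases.

    A value above 1 means the group is overrepresented in the case data,
    below 1 means it is underrepresented. Assumes both lists are non-empty
    and the group appears among the true cases.
    """
    true_share = sum(c["group"] == group for c in true_cases) / len(true_cases)
    obs_share = sum(c["group"] == group for c in observed) / len(observed)
    return obs_share / true_share


# Hypothetical ground truth: 1,000 infections split evenly between two groups.
true_cases = [{"id": i, "group": "A" if i % 2 == 0 else "B"} for i in range(1000)]

# Assumed reporting probabilities: group A is tested and reported far more often.
report_prob = {"A": 0.9, "B": 0.3}

observed = observe_cases(true_cases, report_prob)
ratio_a = representation_ratio(true_cases, observed, "A")
ratio_b = representation_ratio(true_cases, observed, "B")
```

Because the ground truth and the reporting probabilities are both known here, a forecasting model trained on `observed` can be evaluated against `true_cases`, which is the kind of robustness check the abstract describes.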

Award ID(s): 2302968; 2302970; 2109647

PAR ID: 10578757

Author(s) / Creator(s): Kong, Ruochen; Anderson, Taylor; Heslop, David; Zufle, Andreas

Publisher / Repository: ACM

Date Published: 2024-10-29

ISBN: 9798400711077

Page Range / eLocation ID: 681 to 684

Subject(s) / Keyword(s): Data Simulation, Infectious Disease Data, Data Bias, Bias Simulation

Format(s): Medium: X

Location: Atlanta GA USA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3678717.3691293
