
Title: AICov: An Integrative Deep Learning Framework for COVID-19 Forecasting with Population Covariates
The COVID-19 (COrona VIrus Disease 2019) pandemic has had profound global consequences for health, the economy, society, behavior, and almost every other major aspect of human life. Therefore, it is of great importance to model COVID-19 and other pandemics in terms of the broader social contexts in which they take place. We present the architecture of an artificial intelligence enhanced COVID-19 analysis (in short AICov), which provides an integrative deep learning framework for COVID-19 forecasting with population covariates, some of which may serve as putative risk factors. We have integrated multiple different strategies into AICov, including the ability to use deep learning strategies based on Long Short-Term Memory (LSTM) and event modeling. To demonstrate our approach, we have introduced a framework that integrates population covariates from multiple sources. Thus, AICov not only includes data on COVID-19 cases and deaths but, more importantly, the population’s socioeconomic, health, and behavioral risk factors at their specific locations. The compiled data are fed into AICov, and we obtain improved prediction by integrating these data into our model, as compared to a model that uses only case and death data. Because we use deep learning, our models adapt over time as they learn from past data.
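As a rough illustration of what feeding case and death data together with population covariates into a sequence model can look like, the sketch below builds training windows for one location. It is not AICov's actual pipeline: the function name `make_windows`, the feature layout (cases, deaths, then static covariates repeated at every time step), and the `window` and `horizon` parameters are all assumptions made for this example. The output has the per-step layout that recurrent layers such as LSTMs typically consume.

```python
from typing import List, Sequence, Tuple

# Hypothetical helper, not part of AICov: one sample is (input window, target).
Sample = Tuple[List[List[float]], List[float]]


def make_windows(
    cases: Sequence[float],
    deaths: Sequence[float],
    covariates: Sequence[float],
    window: int,
    horizon: int = 1,
) -> List[Sample]:
    """Build (input, target) pairs for a single location.

    Each input holds `window` consecutive time steps. Every step carries
    [cases, deaths] followed by the location's static covariates.
    The target is [cases, deaths] `horizon` steps after the window ends.
    """
    if len(cases) != len(deaths):
        raise ValueError("cases and deaths must have the same length")

    static = list(covariates)  # assumed constant over time for this location
    samples: List[Sample] = []
    last_start = len(cases) - window - horizon
    for start in range(last_start + 1):
        steps = [
            [cases[t], deaths[t]] + static
            for t in range(start, start + window)
        ]
        target_t = start + window + horizon - 1
        samples.append((steps, [cases[target_t], deaths[target_t]]))
    return samples


if __name__ == "__main__":
    # Toy numbers, purely illustrative.
    demo = make_windows([1, 2, 3, 4, 5], [0, 0, 1, 1, 2], [0.3, 0.7], window=3)
    for steps, target in demo:
        print(steps, "->", target)
```

A baseline that uses only case and death data corresponds to passing an empty covariate list, which makes the two setups easy to compare on the same windows.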
Award ID(s):
1829704 1918626 1835631 1443054 2151597
Publication Date:
Journal Name:
Journal of Data Science
Page Range or eLocation-ID:
293 to 313
Sponsoring Org:
National Science Foundation
More Like this
  1. We examine the uneven social and spatial distributions of COVID-19 and their relationships with indicators of social vulnerability in the U.S. epicenter, New York City (NYC). As of July 17th, 2020, NYC, despite having only 2.5% of the U.S. population, has [Formula: see text]6% of all confirmed cases, and [Formula: see text]16% of all deaths, making it a key learning ground for the social dynamics of the disease. Our analysis focuses on the multiple potential social, economic, and demographic drivers of disproportionate impacts in COVID-19 cases and deaths, as well as population rates of testing. Findings show that immediate impacts of COVID-19 largely fall along lines of race and class. Indicators of poverty, race, disability, language isolation, rent burden, unemployment, lack of health insurance, and housing crowding all significantly drive spatial patterns in prevalence of COVID-19 testing, confirmed cases, death rates, and severity. Income in particular has a consistent negative relationship with rates of death and disease severity. The largest differences in social vulnerability indicators are also driven by populations of people of color, poverty, housing crowding, and rates of disability. Results highlight the need for targeted responses to address injustice of COVID-19 cases and deaths, importance of recovery strategies that account for differential vulnerability, and provide an analytical approach for advancing research to examine potential similar injustice of COVID-19 in other U.S. cities. Significance Statement Communities around the world have variable success in mitigating the social impacts of COVID-19, with many urban areas being hit particularly hard. Analysis of social vulnerability to COVID-19 in NYC, the U.S. national epicenter, shows strongly disproportionate impacts of the pandemic on low income populations and communities of color.
Results highlight the class and racial inequities of the coronavirus pandemic in NYC, and the need to unpack the drivers of social vulnerability. To that aim, we provide a replicable framework for examining patterns of uneven social vulnerability to COVID-19 using publicly available data which can be readily applied in other study regions, especially within the U.S.A. This study is important to inform public and policy debate over strategies for short- and long-term responses that address the injustice of disproportionate impacts of COVID-19. Although similar studies examining social vulnerability and equity dimensions of the COVID-19 outbreak in cities across the U.S. have been conducted (Cordes and Castro 2020, Kim and Bostwick 2002, Gaynor and Wilson 2020; Wang et al. 2020; Choi and Unwin 2020), this study provides a more comprehensive analysis in NYC that extends previous contributions to use the highest resolution spatial units for data aggregation (ZCTAs). We also include mortality and severity rates as key indicators and provide a replicable framework that draws from the Centers for Disease Control and Prevention’s Social Vulnerability indicators for communities in NYC.
  2. Abstract Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causal agent for COVID-19, is a communicable disease spread through close contact. It is known to disproportionately impact certain communities due to both biological susceptibility and inequitable exposure. In this study, we investigate the most important health, social, and environmental factors impacting the early phases (before July, 2020) of per capita COVID-19 transmission and per capita all-cause mortality in US counties. We aggregate county-level physical and mental health, environmental pollution, access to health care, demographic characteristics, vulnerable population scores, and other epidemiological data to create a large feature set to analyze per capita COVID-19 outcomes. Because of the high-dimensionality, multicollinearity, and unknown interactions of the data, we use ensemble machine learning and marginal prediction methods to identify the most salient factors associated with several COVID-19 outbreak measures. Our variable importance results show that measures of ethnicity, public transportation and preventable diseases are the strongest predictors for both per capita COVID-19 incidence and mortality. Specifically, the CDC measures for minority populations, CDC measures for limited English, and proportion of Black- and/or African-American individuals in a county were the most important features for per capita COVID-19 cases within a month after the pandemic started in a county and also at the latest date examined. For per capita all-cause mortality at day 100 and total to date, we find that public transportation use and proportion of Black- and/or African-American individuals in a county are the strongest predictors.
The methods predict that a 10% increase in public transportation use, all other factors remaining fixed at the observed values, is associated with an increase in mortality at day 100 of 2012 individuals (95% CI [1972, 2356]), and likewise a 10% increase in the proportion of Black- and/or African-American individuals in a county is associated with an increase in total deaths at end of study of 2067 (95% CI [1189, 2654]). Using data until the end of study, the same metric suggests ethnicity has double the association as the next most important factors, which are location, disease prevalence, and transit factors. Our findings shed light on societal patterns that have been reported and experienced in the U.S. by using robust methods to understand the features most responsible for transmission and sectors of society most vulnerable to infection and mortality. In particular, our results provide evidence of the disproportionate impact of the COVID-19 pandemic on minority populations. Our results suggest that mitigation measures, including how vaccines are distributed, could have the greatest impact if they are given with priority to the highest risk communities.
  3. Abstract Background The COVID-19 pandemic has caused more than 25 million cases and 800 thousand deaths worldwide to date. In the early days of the pandemic, neither vaccines nor therapeutic drugs were available for this novel coronavirus. All measures to prevent the spread of COVID-19 are thus based on reducing contact between infected and susceptible individuals. Most of these measures such as quarantine and self-isolation require voluntary compliance by the population. However, humans may act in their (perceived) self-interest only. Methods We construct a mathematical model of COVID-19 transmission with quarantine and hospitalization coupled with a dynamic game model of adaptive human behavior. Susceptible and infected individuals adopt various behavioral strategies based on perceived prevalence and burden of the disease and sensitivity to isolation measures, and they evolve their strategies using a social learning algorithm (imitation dynamics). Results This results in complex interplay between the epidemiological model, which affects success of different strategies, and the game-theoretic behavioral model, which in turn affects the spread of the disease. We found that the second wave of the pandemic, which has been observed in the US, can be attributed to rational behavior of susceptible individuals, and that multiple waves of the pandemic are possible if the rate of social learning of infected individuals is sufficiently high. Conclusions To reduce the burden of the disease on the society, it is necessary to incentivize such altruistic behavior by infected individuals as voluntary self-isolation.
  4. Abstract Optimizing the impact on the economy of control strategies aiming at containing the spread of COVID-19 is a critical challenge. We use daily new case counts of COVID-19 patients reported by local health administrations from different Metropolitan Statistical Areas (MSAs) within the US to parametrize a model that well describes the propagation of the disease in each area. We then introduce a time-varying control input that represents the level of social distancing imposed on the population of a given area and solve an optimal control problem with the goal of minimizing the impact of social distancing on the economy in the presence of relevant constraints, such as a desired level of suppression for the epidemics at a terminal time. We find that with the exception of the initial time and of the final time, the optimal control input is well approximated by a constant, specific to each area, which contrasts with the implemented system of reopening ‘in phases’. For all the areas considered, this optimal level corresponds to stricter social distancing than the level estimated from data. Proper selection of the time period for application of the control action optimally is important: depending on the particular MSA this period should be either short or long or intermediate. We also consider the case that the transmissibility increases in time (due e.g. to increasingly colder weather), for which we find that the optimal control solution yields progressively stricter measures of social distancing. We finally compute the optimal control solution for a model modified to incorporate the effects of vaccinations on the population and we see that depending on a number of factors, social distancing measures could be optimally reduced during the period over which vaccines are administered to the population.
  5. The COVID-19 preparedness plans by the Centers for Disease Control and Prevention strongly underscore the need for efficient and effective testing strategies. This, in turn, calls for the design and development of statistical sampling and testing strategies for COVID-19. However, the evaluation of operational details requires a detailed representation of human behaviors in epidemic simulation models. Traditional epidemic simulations are mainly based upon system dynamic models, which use differential equations to study macro-level and aggregated behaviors of population subgroups. As such, individual behaviors (e.g., personal protection, commute conditions, social patterns) can’t be adequately modeled and tracked for the evaluation of health policies and action strategies. Therefore, this paper presents a network-based simulation model to optimize COVID-19 testing strategies for effective identifications of virus carriers in a spatial area. Specifically, we design a data-driven risk scoring system for statistical sampling and testing of COVID-19. This system collects real-time data from simulated networked behaviors of individuals in the spatial network to support decision-making during the virus spread process. Experimental results showed that this framework has superior performance in optimizing COVID-19 testing decisions and effectively identifying virus carriers from the population.
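The coupling described in item 3 above, an epidemic model whose transmission depends on behavior that itself evolves by imitation dynamics, can be sketched in a much reduced form. The code below is not that paper's model: it uses a plain SIR model without quarantine or hospitalization, a single behavioral variable `x` (the fraction of people who self-isolate), a payoff gain of `risk * I - cost`, and forward Euler integration. All names and parameter values are assumptions chosen only for illustration.

```python
from typing import List, Tuple

State = Tuple[float, float, float, float]  # (S, I, R, x)


def simulate(
    beta: float = 0.4,    # transmission rate (assumed value)
    gamma: float = 0.1,   # recovery rate (assumed value)
    kappa: float = 1.0,   # social learning (imitation) rate
    risk: float = 10.0,   # perceived benefit of isolating per unit prevalence
    cost: float = 0.5,    # perceived cost of isolating
    i0: float = 0.001,    # initial infected fraction
    x0: float = 0.1,      # initial fraction of isolating individuals
    days: float = 200.0,
    dt: float = 0.1,
) -> List[State]:
    """Forward-Euler integration of SIR coupled with imitation dynamics."""
    s, i, r, x = 1.0 - i0, i0, 0.0, x0
    trajectory: List[State] = [(s, i, r, x)]
    for _ in range(round(days / dt)):
        new_infections = beta * (1.0 - x) * s * i  # isolation reduces contact
        recoveries = gamma * i
        payoff_gain = risk * i - cost              # isolating vs. not isolating
        dx = kappa * x * (1.0 - x) * payoff_gain   # imitation (replicator) form
        s -= new_infections * dt
        i += (new_infections - recoveries) * dt
        r += recoveries * dt
        x = min(1.0, max(0.0, x + dx * dt))        # keep x a valid fraction
        trajectory.append((s, i, r, x))
    return trajectory


if __name__ == "__main__":
    s, i, r, x = simulate()[-1]
    print(f"final S={s:.3f} I={i:.3f} R={r:.3f} isolating={x:.3f}")
```

Setting `kappa=0` freezes behavior at `x0`, which gives a simple baseline for seeing how much of the outcome is due to adaptive behavior rather than to the epidemic model alone.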