The COVID-19 pandemic represents the most significant public health disaster since the 1918 influenza pandemic. During pandemics such as COVID-19, timely and reliable spatiotemporal forecasting of epidemic dynamics is crucial. Deep learning-based time series models have recently gained popularity and have been used successfully for epidemic forecasting. Here we focus on the design and analysis of deep learning-based models for COVID-19 forecasting. We implement multiple recurrent neural network-based deep learning models and combine them using the stacking ensemble technique. To incorporate the effects of multiple factors in COVID-19 spread, we draw on multiple data sources, including COVID-19 confirmed case counts, death counts, and testing data. To overcome the sparsity of training data and to address the dynamic correlation of the disease, we propose clustering-based training for high-resolution forecasting. The method helps us identify similar trends among groups of regions that arise from various spatiotemporal effects. We evaluate the proposed method on forecasting weekly new confirmed COVID-19 cases at the county, state, and country levels. We conduct and analyze a comprehensive comparison of different time series models in the COVID-19 context. The results show that simple deep learning models can achieve comparable or better performance than more complicated models. We are currently integrating our methods into the weekly forecasts that we provide to state and federal authorities.
Using Demographic Pattern Analysis to Predict COVID-19 Fatalities on the US County Level
Unlike pandemics in the past, COVID-19 has hit us in the midst of the information age. We have built vast capabilities to collect and store data of any kind that can be analyzed in myriad ways to help us mitigate the impact of this catastrophic disease. Specifically for COVID-19, data analysis can help local governments to plan the allocation of testing kits, testing stations, and primary care units, and it can help them in setting guidelines for residents, such as the need for social distancing, the use of face masks, and when to open local businesses that enable human contact. Further, it can also lead to a better understanding of pandemics in general and so inform policy makers on the regional and national level. All of this can save both cost and lives. In this article, we show the results of an ongoing study we conducted using a prominent regularly updated dataset. We used a pattern mining engine we developed to find specific characteristics of US counties that appear to expose them to higher COVID-19 mortality. Furthermore, we also show that these characteristics can be used to predict future COVID-19 mortality.
- Award ID(s): 1926949
- PAR ID: 10298875
- Date Published:
- Journal Name: Digital Government: Research and Practice
- Volume: 2
- Issue: 1
- ISSN: 2691-199X
- Page Range / eLocation ID: 1 to 11
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Acharya, Binod (Ed.) This study compares pandemic experiences of Missouri’s 115 counties based on rurality and sociodemographic characteristics during the 1918–20 influenza and 2020–21 COVID-19 pandemics. The state’s counties and overall population distribution have remained relatively stable over the last century, which enables identification of long-lasting pandemic attributes. Sociodemographic data available at the county level for both time periods were taken from U.S. census data and used to create clusters of similar counties. Counties were also grouped by rural status (RSU), including fully (100%) rural, semirural (1–49% living in urban areas), and urban (>50% of the population living in urban areas). Deaths from 1918 through 1920 were collated from the Missouri Digital Heritage database, and COVID-19 cases and deaths were downloaded from the Missouri COVID-19 dashboard. Results from sociodemographic analyses indicate that, during both time periods, average farm value, proportion White, and literacy were the most important determinants of sociodemographic clusters. Furthermore, the Urban/Central and Southeastern regions experienced higher mortality during both pandemics than did the North and South. Analyses comparing county groups by rurality indicated that throughout the 1918–20 influenza pandemic, urban counties had the highest and rural counties the lowest mortality rates. Early in the 2020–21 COVID-19 pandemic, urban counties saw the most extensive epidemic spread and highest mortality, but as the epidemic progressed, cumulative mortality became highest in semirural counties. Additional results highlight the greater effects both pandemics had on county groups with lower rates of education and a lower proportion of Whites in the population. This was especially true for the far southeastern counties of Missouri (“the Bootheel”) during the COVID-19 pandemic.
These results indicate that rural-urban and socioeconomic differences in health outcomes are long-standing problems that continue to be of significant importance, even though the overall quality of health care is substantially better in the 21st century.
-
The early detection of the coronavirus disease 2019 (COVID-19) outbreak is important to save people’s lives and restart the economy quickly and safely. People’s social behavior, reflected in their mobility data, plays a major role in spreading the disease. Therefore, we used daily mobility data aggregated at the county level, alongside COVID-19 statistics and demographic information, for short-term forecasting of COVID-19 outbreaks in the United States. The daily data are fed to a deep learning model based on Long Short-Term Memory (LSTM) to predict the accumulated number of COVID-19 cases in the next two weeks. A significant average correlation (r = 0.83, p = 0.005) was achieved between the model-predicted and actual accumulated cases in the interval from August 1, 2020 until January 22, 2021. The model predictions had r > 0.7 for 87% of the counties across the United States. A lower correlation was reported for the counties with total cases of <1000 during the test interval. The average mean absolute error (MAE) was 605.4 and decreased with a decrease in the total number of cases during the testing interval. The model was able to capture the effect of government responses on COVID-19 cases. It was also able to capture the effect of age demographics on COVID-19 spread: the average daily cases decreased with a decrease in the retiree percentage and increased with an increase in the young percentage. Lessons learned from this study can help not only with managing the COVID-19 pandemic but also with early and effective management of possible future pandemics. The code used for this study was made publicly available at https://github.com/Murtadha44/covid-19-spread-risk.
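The two evaluation metrics quoted in this abstract, the Pearson correlation r and the mean absolute error (MAE), can be illustrated with a minimal sketch. This is not the study's code: the function names and the four-value series below are hypothetical and chosen only to show how each metric is computed for one county's predicted and actual accumulated case counts.

```python
import math


def pearson_r(predicted, actual):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    # Sum of products of deviations from the two means.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    # Square roots of the sums of squared deviations.
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    dev_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    return cov / (dev_p * dev_a)


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and observations."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical accumulated case counts for one county (illustrative only).
predicted_cases = [100, 220, 330, 480]
actual_cases = [110, 200, 350, 500]

r = pearson_r(predicted_cases, actual_cases)
mae = mean_absolute_error(predicted_cases, actual_cases)
print(f"r = {r:.3f}, MAE = {mae:.1f}")
```

Because MAE is expressed in case counts, it naturally shrinks for counties with fewer cases, which is consistent with the abstract's remark that MAE decreased as total cases decreased; r is scale-free and so is comparable across counties of different sizes.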
-
The United States struggled exceptionally during the COVID-19 pandemic. For researchers and policymakers, it is of great interest to understand the risk factors associated with COVID-19 when examining data aggregated at a regional level. We examined the county-level association between the reported COVID-19 case fatality rate (CFR) and various demographic, socioeconomic and health factors in two hard-hit US states: New York and Florida. In particular, we examined the changes over time in the association patterns. For each state, we divided the data into three seasonal phases based on observed waves of the COVID-19 outbreak. For each phase, we used tests of correlations to explore the marginal association between each potential covariate and the reported CFR. We used graphical models to further clarify direct or indirect associations in a multivariate setting. We found that during the early phase of the pandemic, the association patterns were complex: the reported CFRs were high, with great variation among counties. As the pandemic progressed, especially during the winter phase, socioeconomic factors such as median household income and health-related factors such as the prevalence of adult smokers and the mortality rate of respiratory diseases became more significantly associated with the CFR. It is remarkable that common risk factors were identified for both states.
-
Turner, Richard (Ed.) Background: With the availability of multiple Coronavirus Disease 2019 (COVID-19) vaccines and the predicted shortages in supply for the near future, it is necessary to allocate vaccines in a manner that minimizes severe outcomes, particularly deaths. To date, vaccination strategies in the United States have focused on individual characteristics such as age and occupation. Here, we assess the utility of population-level health and socioeconomic indicators as additional criteria for geographical allocation of vaccines. Methods and findings: County-level estimates of 14 indicators associated with COVID-19 mortality were extracted from public data sources. Effect estimates of the individual indicators were calculated with univariate models. Presence of spatial autocorrelation was established using Moran’s I statistic. Spatial simultaneous autoregressive (SAR) models that account for spatial autocorrelation in response and predictors were used to assess (i) the proportion of variance in county-level COVID-19 mortality that can be explained by identified health/socioeconomic indicators (R²); and (ii) effect estimates of each predictor. Adjusting for case rates, the selected indicators individually explain 24%–29% of the variability in mortality. Prevalence of chronic kidney disease and proportion of population residing in nursing homes have the highest R². Mortality is estimated to increase by 43 per thousand residents (95% CI: 37–49; p < 0.001) with a 1% increase in the prevalence of chronic kidney disease and by 39 deaths per thousand (95% CI: 34–44; p < 0.001) with a 1% increase in population living in nursing homes. SAR models using multiple health/socioeconomic indicators explain 43% of the variability in COVID-19 mortality in US counties, adjusting for case rates. R² was found not to be sensitive to the choice of SAR model form.
Study limitations include the use of mortality rates that are not age standardized, a spatial adjacency matrix that does not capture human flows among counties, and insufficient accounting for interaction among predictors. Conclusions: Significant spatial autocorrelation exists in COVID-19 mortality in the US, and population health/socioeconomic indicators account for considerable variability in county-level mortality. In the context of vaccine rollout in the US and globally, national and subnational estimates of burden of disease could inform optimal geographical allocation of vaccines.
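For readers unfamiliar with Moran’s I, the statistic this abstract uses to establish spatial autocorrelation, the following is a minimal sketch of the standard global formula, I = (n / S0) · Σᵢⱼ wᵢⱼ zᵢ zⱼ / Σᵢ zᵢ², where z are deviations from the mean and S0 is the sum of all weights. It is not the study's implementation: the four "counties", their adjacency matrix, and the mortality values are hypothetical, and a binary adjacency weighting is assumed.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n regions.

    weights[i][j] is the spatial weight between regions i and j
    (assumed here: 1 if the regions are adjacent, else 0).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    # Weighted cross-products of deviations between pairs of regions.
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    total = sum(d * d for d in z)  # sum of squared deviations
    return (n / s0) * (cross / total)


# Four hypothetical counties arranged in a line: 0 - 1 - 2 - 3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [1.0, 1.0, 5.0, 5.0]    # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]  # neighbors always differ

print(morans_i(clustered, adjacency))    # positive: spatial clustering
print(morans_i(alternating, adjacency))  # negative: spatial dispersion
```

A positive value indicates that neighboring regions tend to have similar values, which is the situation that motivates spatial models such as SAR over ordinary regression. As the abstract itself notes, an adjacency-based weight matrix like this one does not capture human flows among counties.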

