Abstract The early detection of the coronavirus disease 2019 (COVID-19) outbreak is important to save people’s lives and restart the economy quickly and safely. People’s social behavior, reflected in their mobility data, plays a major role in spreading the disease. Therefore, we used daily mobility data aggregated at the county level, alongside COVID-19 statistics and demographic information, for short-term forecasting of COVID-19 outbreaks in the United States. The daily data are fed to a deep learning model based on Long Short-Term Memory (LSTM) to predict the accumulated number of COVID-19 cases in the next two weeks. A significant average correlation (r = 0.83, p = 0.005) was achieved between the model-predicted and actual accumulated cases in the interval from August 1, 2020 until January 22, 2021. The model predictions had r > 0.7 for 87% of the counties across the United States. A lower correlation was reported for the counties with total cases of <1000 during the test interval. The average mean absolute error (MAE) was 605.4 and decreased with a decrease in the total number of cases during the testing interval. The model was able to capture the effect of government responses on COVID-19 cases. It was also able to capture the effect of age demographics on the COVID-19 spread: the average daily cases decreased with a decrease in the retiree percentage and increased with an increase in the young percentage. Lessons learned from this study can help not only with managing the COVID-19 pandemic but also with early and effective management of possible future pandemics. The code used for this study was made publicly available at https://github.com/Murtadha44/covid-19-spread-risk.
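The two evaluation metrics reported above, the Pearson correlation (r) and the mean absolute error (MAE), can be illustrated with a minimal Python sketch. The case counts below are hypothetical values chosen for illustration, not data from the study, and the helper names are assumptions rather than functions from the authors' repository.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical accumulated case counts for one county (illustration only).
actual_cases = [100, 150, 210, 280, 360]
predicted_cases = [110, 140, 220, 270, 380]

r = pearson_r(actual_cases, predicted_cases)
mae = mean_absolute_error(actual_cases, predicted_cases)
print(f"r = {r:.3f}, MAE = {mae:.1f}")
```

Because the MAE is expressed in case counts, it naturally shrinks for counties with fewer total cases, which matches the trend the abstract describes; the correlation is scale-free and so is the more comparable figure across counties.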
Predictions, Role of Interventions and Effects of a Historic National Lockdown in India's Response to the COVID-19 Pandemic: Data Science Call to Arms
With only 536 COVID-19 cases and 11 fatalities, India took the historic decision of a 21-day national lockdown on March 25, 2020. The lockdown was first extended to May 3 soon after the analysis of this article was completed, and then to May 18 while this article was being revised. In this article, we use a Bayesian extension of the susceptible-infected-removed (eSIR) model designed for intervention forecasting to study the short- and long-term impact of an initial 21-day lockdown on the total number of COVID-19 infections in India compared to other, less severe nonpharmaceutical interventions. We compare effects of hypothetical durations of lockdown on reducing the number of active and new infections. We find that the lockdown, if implemented correctly, can reduce the total number of cases in the short term, and buy India invaluable time to prepare its health care and disease-monitoring system. Our analysis shows we need to have some measures of suppression in place after the lockdown for increased benefit (as measured by reduction in the number of cases). A longer lockdown from 42–56 days is preferable to substantially ‘flatten the curve’ when compared to 21–28 days of lockdown. Our models focus solely on projecting the number of COVID-19 infections and thus inform policymakers about one aspect of this multifaceted decision-making problem. We conclude with a discussion on the pivotal role of increased testing, reliable and transparent data, proper uncertainty quantification, accurate interpretation of forecasting models, reproducible data science methods, and tools that can enable data-driven policymaking during a pandemic. Our software products are available at covind19.org.
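To illustrate how a lockdown can enter a compartmental model, the following is a minimal deterministic SIR sketch in which a time-varying modifier scales the transmission rate. It is not the Bayesian eSIR model used in the article: the parameter values (`BETA`, `GAMMA`, lockdown strength and timing) and the function names are hypothetical, chosen only to show the mechanism.

```python
def lockdown(start, length, strength):
    """Return a modifier: `strength` during the lockdown, 1.0 otherwise."""
    def modifier(day):
        return strength if start <= day < start + length else 1.0
    return modifier


def simulate_sir(beta, gamma, days, modifier, i0=1e-4):
    """Daily-step SIR on population fractions with a scaled transmission rate."""
    s, i, r = 1.0 - i0, i0, 0.0
    for day in range(days):
        new_infections = beta * modifier(day) * s * i
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
    return s, i, r


def cumulative_infected(state):
    """Fraction of the population ever infected (everyone no longer susceptible)."""
    s, _, _ = state
    return 1.0 - s


# Hypothetical parameters: basic reproduction number BETA / GAMMA = 3.
BETA, GAMMA, HORIZON = 0.3, 0.1, 60

no_lockdown = simulate_sir(BETA, GAMMA, HORIZON, lambda day: 1.0)
short_lockdown = simulate_sir(BETA, GAMMA, HORIZON, lockdown(10, 21, 0.4))
long_lockdown = simulate_sir(BETA, GAMMA, HORIZON, lockdown(10, 42, 0.4))

for label, state in [("none", no_lockdown),
                     ("21 days", short_lockdown),
                     ("42 days", long_lockdown)]:
    print(f"lockdown {label}: cumulative infected = {cumulative_infected(state):.3f}")
```

In this toy setting the longer lockdown leaves fewer cumulative infections at the 60-day horizon, but transmission resumes at the original rate once the modifier returns to 1.0, which is the qualitative reason the abstract stresses keeping some suppression measures in place afterwards.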
- Award ID(s): 1712933
- PAR ID: 10169282
- Date Published:
- Journal Name: Harvard Data Science Review
- Issue: Special Issue 1-COVID-19
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Background The natural history of disease in patients infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) remained obscure during the early pandemic. Aim Our objective was to estimate epidemiological parameters of coronavirus disease (COVID-19) and assess the relative infectivity of the incubation period. Methods We estimated the distributions of four epidemiological parameters of SARS-CoV-2 transmission using a large database of COVID-19 cases and potential transmission pairs of cases, and assessed their heterogeneity by demographics, epidemic phase and geographical region. We further calculated the time of peak infectivity and quantified the proportion of secondary infections during the incubation period. Results The median incubation period was 7.2 (95% confidence interval (CI): 6.9‒7.5) days. The median serial and generation intervals were similar, 4.7 (95% CI: 4.2‒5.3) and 4.6 (95% CI: 4.2‒5.1) days, respectively. Paediatric cases < 18 years had a longer incubation period than adult age groups (p = 0.007). The median incubation period increased from 4.4 days before 25 January to 11.5 days after 31 January (p < 0.001), whereas the median serial (generation) interval contracted from 5.9 (4.8) days before 25 January to 3.4 (3.7) days after. The median time from symptom onset to discharge was also shortened from 18.3 days before 22 January to 14.1 days after. Peak infectivity occurred 1 day before symptom onset on average, and the incubation period accounted for 70% of transmission. Conclusion The high infectivity during the incubation period led to short generation and serial intervals, necessitating aggressive control measures such as early case finding and quarantine of close contacts.
- Background: A key challenge in estimating epidemiological parameters for a pandemic such as the initial COVID-19 outbreak in Wuhan is the discrepancy between the officially reported number of infections and the true number of infections. A common approach to tackling the challenge is to use the number of infections exported from the originating city to infer the true number. This approach can only provide a static estimate of the epidemiological parameters before city lockdown because there are almost no exported cases thereafter. Methods: We propose a Bayesian estimation method that dynamically estimates the epidemiological parameters by recovering true numbers of infections from day-to-day official numbers. To illustrate the use of this method, we provide a comprehensive retrospection on how COVID-19 progressed in Wuhan from January 19 to March 5, 2020. In particular, we estimate that the outbreak sizes by January 23 and March 5 were 11,239 [95% CI 4,794–22,372] and 124,506 [95% CI 69,526–265,113], respectively. Results: The effective reproduction number attained its maximum on January 24 (3.42 [95% CI 3.34–3.50]) and became less than 1 from February 7 (0.76 [95% CI 0.65–0.92]). We also estimate the effects of two major government interventions on the spread of COVID-19 in Wuhan. Conclusions: This case study, using our proposed method, affirms the believed importance and effectiveness of tight non-essential travel restrictions and of government interventions (e.g., transportation suspension and large-scale hospitalization) for mitigating COVID-19 community spread.
- We propose a modified population-based susceptible-exposed-infectious-recovered (SEIR) compartmental model for a retrospective study of the COVID-19 transmission dynamics in India during the first wave. We extend the conventional SEIR methodology to account for the complexities of COVID-19 infection, its multiple symptoms, and transmission pathways. In particular, we consider a time-dependent transmission rate to account for governmental controls (e.g., national lockdown) and individual behavioral factors (e.g., social distancing, mask-wearing, personal hygiene, and self-quarantine). An essential feature of COVID-19 that is different from other infections is the significant contribution of asymptomatic and pre-symptomatic cases to the transmission cycle. A Bayesian method is used to calibrate the proposed SEIR model using publicly available data (daily new tested positive, death, and recovery cases) from several Indian states. The uncertainty of the parameters is naturally expressed as the posterior probability distribution. The calibrated model is used to estimate undetected cases and study different initial intervention policies, screening rates, and public behavior factors that can potentially strike a balance between disease control and the humanitarian crisis caused by a sudden strict lockdown.
- Abstract Background The COVID-19 outbreak in Wuhan started in December 2019 and was under control by the end of March 2020, with a total of 50,006 confirmed cases, through the implementation of a series of nonpharmaceutical interventions (NPIs) including an unprecedented lockdown of the city. This study analyzes the complete outbreak data from Wuhan, assesses the impact of these public health interventions, and estimates the asymptomatic, undetected and total cases for the COVID-19 outbreak in Wuhan. Methods By taking different stages of the outbreak into account, we developed a time-dependent compartmental model to describe the dynamics of disease transmission and case detection and reporting. Model coefficients were parameterized by using the reported cases and following key events and escalated control strategies. The model was then calibrated to the complete outbreak data using the Markov chain Monte Carlo (MCMC) method. Finally, we used the model to estimate asymptomatic and undetected cases and approximate the overall antibody prevalence level. Results We found that the transmission rate between Jan 24 and Feb 1, 2020, was twice as large as that before the lockdown on Jan 23, and 67.6% (95% CI [0.584, 0.759]) of detectable infections occurred during this period. Based on the reported estimates that around 20% of infections were asymptomatic and their transmission ability was about 70% of symptomatic ones, we estimated that there were about 14,448 asymptomatic and undetected cases (95% CI [12,364, 23,254]), which yields an estimate of a total of 64,454 infected cases (95% CI [62,370, 73,260]), and the overall antibody prevalence level in the population of Wuhan was 0.745% (95% CI [0.693%, 0.814%]) by March 31, 2020. Conclusions We conclude that the control of the COVID-19 outbreak in Wuhan was achieved via the enforcement of a combination of multiple NPIs: the lockdown on Jan 23, the stay-at-home order on Feb 2, the massive isolation of all symptomatic individuals via newly constructed special shelter hospitals on Feb 6, and the large-scale screening process on Feb 18. Our results indicate that the population in Wuhan is far away from establishing herd immunity and provide insights for other affected countries and regions in designing control strategies and planning vaccination programs.
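The total-case figures quoted in the Wuhan abstract above are consistent with adding the estimated asymptomatic and undetected cases to the 50,006 confirmed cases. The short Python check below reproduces that arithmetic; the variable names are illustrative, and the assumption that the interval bounds were obtained by the same addition is an inference from the numbers, not a statement from the authors.

```python
# Figures quoted in the abstract.
confirmed_cases = 50_006
undetected_cases = {"estimate": 14_448, "ci_low": 12_364, "ci_high": 23_254}

# Total infections = confirmed cases + asymptomatic and undetected cases.
total_cases = {key: confirmed_cases + value
               for key, value in undetected_cases.items()}

for key, value in total_cases.items():
    print(f"{key}: {value:,}")
```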