The ability to quickly learn fundamentals about a new infectious disease, such as how it is transmitted, the incubation period, and related symptoms, is crucial in any novel pandemic. For instance, rapid identification of symptoms can enable interventions that dampen the spread of the disease. Traditionally, symptoms are learned from research publications associated with clinical studies. However, clinical studies are often slow and time intensive, and the resulting delays can have dire consequences in a rapidly spreading pandemic, as we have seen with COVID-19. In this article, we introduce SymptomID, a modular artificial intelligence-based framework for rapid identification of symptoms associated with novel pandemics using publicly available news reports. SymptomID is built on a state-of-the-art natural language processing model (Bidirectional Encoder Representations from Transformers, BERT) to extract symptoms from publicly available news reports and to cluster related symptoms together to remove redundancy. Our proposed framework requires minimal training data because it builds on a pre-trained language model. In this study, we present a case study of SymptomID using news articles about the current COVID-19 pandemic. Our COVID-19 symptom extraction module, trained on 225 articles, achieves an F1 score of over 0.8. SymptomID can correctly identify well-established symptoms (e.g., “fever” and “cough”) and less-prevalent symptoms …
A semi-parametric, state-space compartmental model with time-dependent parameters for forecasting COVID-19 cases, hospitalizations and deaths
Short-term forecasting of the dynamics of coronavirus disease 2019 (COVID-19) in the period up to its decline following mass vaccination was a task that received much attention but proved difficult to do with high accuracy. However, the availability of standardized forecasts and versioned datasets from this period allows for continued work in this area. Here, we introduce the Gaussian infection state space with time dependence (GISST) forecasting model. We evaluate its performance in one- to four-week-ahead forecasts of COVID-19 cases, hospital admissions and deaths in the state of California, made with the official reports of COVID-19, Google’s mobility reports and vaccination data available each week. Evaluation of these forecasts with a weighted interval score shows that they consistently outperform a naive baseline forecast and often score close to or better than a high-performing ensemble forecaster. The GISST model also provides parameter estimates for a compartmental model of COVID-19 dynamics, includes a regression submodel for the transmission rate and allows parameters to vary over time according to a random walk. GISST provides a novel, balanced combination of computational efficiency, model interpretability and applicability to large multivariate datasets that may prove useful in improving the accuracy of infectious disease forecasts.
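The weighted interval score mentioned in the abstract can be illustrated with a short sketch. The code below follows the commonly used definition of the score for a forecast given as a median plus a set of central prediction intervals; the function names, the interval levels and the numbers in the usage lines are illustrative assumptions, not values taken from the GISST evaluation.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval for observation y."""
    width = upper - lower                                  # sharpness term
    penalty_below = (2.0 / alpha) * max(lower - y, 0.0)    # y fell below the interval
    penalty_above = (2.0 / alpha) * max(y - upper, 0.0)    # y fell above the interval
    return width + penalty_below + penalty_above


def weighted_interval_score(median, intervals, y):
    """Weighted interval score (lower is better).

    intervals maps alpha -> (lower, upper) for each central (1 - alpha) interval.
    The median gets weight 1/2 and each interval gets weight alpha / 2.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 100, a 50% interval and a 90% interval.
forecast_intervals = {0.5: (90.0, 110.0), 0.1: (70.0, 130.0)}
score_hit = weighted_interval_score(100.0, forecast_intervals, y=100.0)
score_miss = weighted_interval_score(100.0, forecast_intervals, y=140.0)
print(score_hit, score_miss)
```

A forecast whose intervals are narrow and contain the observation scores low; an observation outside the intervals is penalized in proportion to how far it misses, which is why the second hypothetical score is much larger than the first.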
- Journal Name: Journal of The Royal Society Interface
- Sponsoring Org: National Science Foundation
More Like this
The rapid rollout of the COVID-19 vaccine raises the question of whether and when the ongoing pandemic could be eliminated with vaccination and non-pharmaceutical interventions (NPIs). Despite advances in understanding the impact of NPIs and the conceptual belief that NPIs and vaccination control COVID-19 infections, we lack evidence to employ control theory in real-world social human dynamics in the context of disease spreading. We bridge the gap by developing a new analytical framework that treats COVID-19 as a feedback control system, with the NPIs and vaccination as the controllers, and a computational model that maps human social behaviors into input signals. This approach enables us to effectively predict the epidemic spreading in 381 metropolitan statistical areas (MSAs) in the US by learning our model parameters from time-series NPI data (i.e., the stay-at-home order, face-mask wearing, and testing). This model allows us to optimally identify three NPIs to predict infections accurately in 381 MSAs and avoid over-fitting. Our numerical results demonstrate our approach’s excellent predictive power, with R² > 0.9 for all the MSAs regardless of their sizes, locations, and demographic status. Our methodology allows us to estimate the needed vaccine coverage and NPIs for achieving R_e to …
Despite significant progress in the development of vaccines, the COVID-19 pandemic remains difficult to control because of obstacles such as the proper implementation of vaccination, public hesitancy towards vaccines, dropping out before the second dose, and varying levels of protection after the first and second doses. In this study, we develop a novel mathematical model of COVID-19 transmission that includes two separate vaccinated compartments (first dose and both doses). We parametrize and validate our model using data from Dougherty County, Georgia, USA, one of the most affected counties, where the transmission trend is clearly associated with various policies and public events. We analyze our model for stability of equilibria and persistence of the disease, and formulate expressions for the reproduction numbers. We estimate that the basic reproduction number in Dougherty County is 1.69, and that the effective reproduction number during the study period ranges from 0.26 to 6.36. The number of daily undiagnosed cases peaked at 310 per day, resulting in a maximum of 2471 active infectious individuals. Our model predicts that in a high transmission scenario, the vaccination strategies should be combined with other non-pharmaceutical prevention strategies to ensure transmission control. Moreover, our …
High-resolution mobility datasets have become increasingly available in the past few years and have enabled detailed models of infectious disease spread, including those for COVID-19. However, there are open questions about how such mobility data can be used effectively within epidemic models and for which tasks they are best suited. In this paper, we extract a number of graph-based proximity metrics from high-resolution cellphone trace data from X-Mode and use them to study COVID-19 epidemic spread in 50 land-grant university counties in the US. We present an approach to estimate the effect of mobility on cases by fitting an ODE-based model and performing multivariate linear regression to explain the estimated time-varying transmissibility. We find that, while mobility plays a significant role, the contribution is heterogeneous across the counties, as exemplified by a subsequent correlation analysis. We then evaluate the metrics’ utility for case surge prediction, defined as a supervised classification problem, and show that the learnt model can predict surges with 95% accuracy and an 87% F1-score.
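The regression step described above, explaining an estimated time-varying transmissibility with mobility metrics, can be sketched as an ordinary least-squares fit. Everything in this sketch is an assumption made for illustration: the data are synthetic, the three "metrics" are random stand-ins for the graph-based proximity metrics, and the helper name fit_linear is hypothetical. It is not the authors' pipeline, and the ODE-fitting stage that would produce the transmissibility series is replaced by a simulated series.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions): 40 weeks of 3 standardized mobility metrics.
n_weeks, n_metrics = 40, 3
metrics = rng.normal(size=(n_weeks, n_metrics))

# Simulated "estimated transmissibility": a baseline plus a linear mobility effect.
true_intercept = 0.8
true_coef = np.array([0.3, -0.1, 0.05])
transmissibility = (
    true_intercept + metrics @ true_coef + rng.normal(scale=0.01, size=n_weeks)
)


def fit_linear(X, y):
    """Least-squares fit of y on X with an intercept; returns (coefficients, R^2)."""
    design = np.column_stack([np.ones(len(X)), X])   # first column is the intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    r2 = 1.0 - np.sum(residual**2) / np.sum((y - y.mean()) ** 2)
    return coef, r2


coef, r2 = fit_linear(metrics, transmissibility)
print("intercept and metric coefficients:", coef)
print("R^2:", r2)
```

Fitting one such regression per county and comparing the coefficients is one simple way to look at the kind of heterogeneity across counties that the abstract reports.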
Estimating asymptomatic, undetected and total cases for the COVID-19 outbreak in Wuhan: a mathematical modeling study
Background: The COVID-19 outbreak in Wuhan started in December 2019 and was brought under control by the end of March 2020, with a total of 50,006 confirmed cases, through the implementation of a series of nonpharmaceutical interventions (NPIs), including an unprecedented lockdown of the city. This study analyzes the complete outbreak data from Wuhan, assesses the impact of these public health interventions, and estimates the asymptomatic, undetected and total cases for the COVID-19 outbreak in Wuhan.
Methods: By taking different stages of the outbreak into account, we developed a time-dependent compartmental model to describe the dynamics of disease transmission and case detection and reporting. Model coefficients were parameterized by using the reported cases and following key events and escalated control strategies. The model was then calibrated to the complete outbreak data by using the Markov chain Monte Carlo (MCMC) method. Finally, we used the model to estimate asymptomatic and undetected cases and to approximate the overall antibody prevalence level.
Results: We found that the transmission rate between Jan 24 and Feb 1, 2020, was twice as large as that before the lockdown on Jan 23, and that 67.6% (95% CI [58.4%, 75.9%]) of detectable infections occurred during this period. Based on the …