The ability to quickly learn fundamentals about a new infectious disease, such as how it is transmitted, its incubation period, and its associated symptoms, is crucial in any novel pandemic. For instance, rapid identification of symptoms can enable interventions that dampen the spread of the disease. Traditionally, symptoms are learned from research publications associated with clinical studies. However, clinical studies are often slow and time intensive, and the resulting delays can have dire consequences in a rapidly spreading pandemic, as seen with COVID-19. In this article, we introduce SymptomID, a modular artificial intelligence-based framework for rapid identification of symptoms associated with novel pandemics using publicly available news reports. SymptomID is built on a state-of-the-art natural language processing model, BERT (Bidirectional Encoder Representations from Transformers), to extract symptoms from publicly available news reports and to cluster related symptoms together to remove redundancy. Our proposed framework requires minimal training data because it builds on a pre-trained language model. In this study, we present a case study of SymptomID using news articles about the current COVID-19 pandemic. Our COVID-19 symptom extraction module, trained on 225 articles, achieves an F1 score of over 0.8. SymptomID can correctly identify well-established symptoms (e.g., “fever” and “cough”) and less-prevalent symptoms…
A semi-parametric, state-space compartmental model with time-dependent parameters for forecasting COVID-19 cases, hospitalizations and deaths
Short-term forecasting of the dynamics of coronavirus disease 2019 (COVID-19) in the period up to its decline following mass vaccination was a task that received much attention but proved difficult to do with high accuracy. However, the availability of standardized forecasts and versioned datasets from this period allows for continued work in this area. Here, we introduce the Gaussian infection state space with time dependence (GISST) forecasting model. We evaluate its performance in one- to four-week-ahead forecasts of COVID-19 cases, hospital admissions and deaths in the state of California, made with the official reports of COVID-19, Google’s mobility reports and vaccination data available each week. Evaluation of these forecasts with a weighted interval score shows that they consistently outperform a naive baseline forecast and often score close to or better than a high-performing ensemble forecaster. The GISST model also provides parameter estimates for a compartmental model of COVID-19 dynamics, includes a regression submodel for the transmission rate and allows parameters to vary over time according to a random walk. GISST provides a novel, balanced combination of computational efficiency, model interpretability and applicability to large multivariate datasets that may prove useful in improving the accuracy of infectious disease forecasts.
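The weighted interval score mentioned in the abstract can be illustrated with a minimal sketch. The code below uses the commonly cited definition of the score for a forecast given as a median plus a set of central prediction intervals; it is not taken from the GISST paper, and the function names and the example numbers are hypothetical.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval.

    Penalizes interval width plus a miss penalty scaled by 2 / alpha
    when the observation y falls outside [lower, upper].
    """
    width = upper - lower
    below = (2.0 / alpha) * max(lower - y, 0.0)
    above = (2.0 / alpha) * max(y - upper, 0.0)
    return width + below + above


def weighted_interval_score(median, intervals, y):
    """Weighted interval score for one forecast target.

    `intervals` maps alpha -> (lower, upper) for each central
    (1 - alpha) interval. The absolute error of the median gets
    weight 1/2, each interval gets weight alpha / 2, and the sum is
    normalized by (K + 1/2), where K is the number of intervals.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 100 cases with 50% and 90% intervals.
forecast_intervals = {0.5: (90.0, 110.0), 0.1: (70.0, 130.0)}
score = weighted_interval_score(100.0, forecast_intervals, y=120.0)
print(score)
```

Lower scores are better. Comparing a model's score with that of a naive baseline on the same targets gives a relative measure of skill.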
- Journal Name: Journal of The Royal Society Interface
- Sponsoring Org: National Science Foundation
More Like this
The rapid rollout of the COVID-19 vaccine raises the question of whether and when the ongoing pandemic could be eliminated with vaccination and non-pharmaceutical interventions (NPIs). Despite advances in understanding the impact of NPIs and the conceptual belief that NPIs and vaccination control COVID-19 infections, we lack evidence to employ control theory in real-world social human dynamics in the context of disease spreading. We bridge the gap by developing a new analytical framework that treats COVID-19 as a feedback control system with the NPIs and vaccination as the controllers and a computational model that maps human social behaviors into input signals. This approach enables us to effectively predict the epidemic spreading in 381 metropolitan statistical areas (MSAs) in the US by learning our model parameters from time-series data on NPIs (i.e., the stay-at-home order, face-mask wearing, and testing). This model allows us to optimally identify three NPIs to predict infections accurately in 381 MSAs and avoid over-fitting. Our numerical results demonstrate our approach’s excellent predictive power with R² > 0.9 for all the MSAs regardless of their sizes, locations, and demographic status. Our methodology allows us to estimate the needed vaccine coverage and NPIs for achieving R_e to…
Estimating asymptomatic, undetected and total cases for the COVID-19 outbreak in Wuhan: a mathematical modeling study
Background: The COVID-19 outbreak in Wuhan started in December 2019 and was under control by the end of March 2020, with a total of 50,006 confirmed cases, through the implementation of a series of nonpharmaceutical interventions (NPIs) including an unprecedented lockdown of the city. This study analyzes the complete outbreak data from Wuhan, assesses the impact of these public health interventions, and estimates the asymptomatic, undetected and total cases for the COVID-19 outbreak in Wuhan. Methods: By taking different stages of the outbreak into account, we developed a time-dependent compartmental model to describe the dynamics of disease transmission and case detection and reporting. Model coefficients were parameterized by using the reported cases and following key events and escalated control strategies. The model was then calibrated to the complete outbreak data using the Markov chain Monte Carlo (MCMC) method. Finally, we used the model to estimate asymptomatic and undetected cases and approximate the overall antibody prevalence level. Results: We found that the transmission rate between Jan 24 and Feb 1, 2020, was twice as large as that before the lockdown on Jan 23 and that 67.6% (95% CI [0.584, 0.759]) of detectable infections occurred during this period. Based on the…
High-resolution mobility datasets have become increasingly available in the past few years and have enabled detailed models for infectious disease spread, including those for COVID-19. However, there are open questions on how such mobility data can be used effectively within epidemic models and for which tasks they are best suited. In this paper, we extract a number of graph-based proximity metrics from high-resolution cellphone trace data from X-Mode and use them to study COVID-19 epidemic spread in 50 land grant university counties in the US. We present an approach to estimate the effect of mobility on cases by fitting an ODE-based model and performing multivariate linear regression to explain the estimated time-varying transmissibility. We find that, while mobility plays a significant role, the contribution is heterogeneous across the counties, as exemplified by a subsequent correlation analysis. We then evaluate the metrics’ utility for case surge prediction, defined as a supervised classification problem, and show that the learnt model can predict surges with 95% accuracy and an 87% F1-score.
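The gap between the reported accuracy and F1-score is typical of classification problems where the positive class (here, a surge) is rare. The sketch below shows how both metrics are computed from binary labels. The label vectors are hypothetical and are not the study's data, and the function name is an assumption made for illustration.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, with 1 = surge, 0 = no surge."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # surges caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # surges missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # quiet periods
    accuracy = (tp + tn) / len(pairs)
    # F1 ignores true negatives, so it is not inflated by the many
    # quiet periods in an imbalanced dataset.
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0
    return accuracy, f1


# Hypothetical example: 100 county-weeks, 10 of which are true surges.
# The classifier catches 8 surges, misses 2, and raises 3 false alarms.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 8 + [0] * 2 + [1] * 3 + [0] * 87
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

Because true negatives dominate the accuracy but do not enter the F1-score, a classifier can reach high accuracy while its F1-score is noticeably lower.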
Evaluating the Effect of a COVID-19 Predictive Model to Facilitate Discharge: A Randomized Controlled Trial
Background: We previously developed and validated a predictive model to help clinicians identify hospitalized adults with coronavirus disease 2019 (COVID-19) who may be ready for discharge given their low risk of adverse events. Whether this algorithm can prompt more timely discharge for stable patients in practice is unknown. Objectives: The aim of the study is to estimate the effect of displaying risk scores on length of stay (LOS). Methods: We integrated model output into the electronic health record (EHR) at four hospitals in one health system by displaying a green/orange/red score indicating low/moderate/high risk in a patient list column and a larger COVID-19 summary report visible for each patient. Display of the score was pseudo-randomized 1:1 into intervention and control arms using a patient identifier passed to the model execution code. The intervention effect was assessed by comparing LOS between intervention and control groups. Adverse safety outcomes of death, hospice, and re-presentation were tested separately and as a composite indicator. We tracked adoption and sustained use through daily counts of score displays. Results: Enrolling 1,010 patients from May 15, 2020 to December 7, 2020, the trial found no detectable difference in LOS. The intervention had no impact on safety indicators of death, hospice or re-presentation…