skip to main content

This content will become publicly available on December 1, 2023

Title: Analysis of individual-level data from 2018–2020 Ebola outbreak in Democratic Republic of the Congo
Abstract The 2018–2020 Ebola virus disease epidemic in Democratic Republic of the Congo (DRC) resulted in 3481 cases (probable and confirmed) and 2299 deaths. In this paper, we use a novel statistical method to analyze the individual-level incidence and hospitalization data on DRC Ebola victims. Our analysis suggests that an increase in the rate of quarantine and isolation that has shortened the infectiousness period by approximately one day during the epidemic’s third and final wave was likely responsible for the eventual containment of the outbreak. The analysis further reveals that the total effective population size or the average number of individuals at risk for the disease exposure in three epidemic waves over the period of 24 months was around 16,000–a much smaller number than previously estimated and likely an evidence of at least partial protection of the population at risk through ring vaccination and contact tracing as well as adherence to strict quarantine and isolation policies.
; ; ; ; ;
Award ID(s):
Publication Date:
Journal Name:
Scientific Reports
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract This project is funded by the US National Science Foundation (NSF) through their NSF RAPID program under the title “Modeling Corona Spread Using Big Data Analytics.” The project is a joint effort between the Department of Computer & Electrical Engineering and Computer Science at FAU and a research group from LexisNexis Risk Solutions. The novel coronavirus Covid-19 originated in China in early December 2019 and has rapidly spread to many countries around the globe, with the number of confirmed cases increasing every day. Covid-19 is officially a pandemic. It is a novel infection with serious clinical manifestations, including death, and it has reached at least 124 countries and territories. Although the ultimate course and impact of Covid-19 are uncertain, it is not merely possible but likely that the disease will produce enough severe illness to overwhelm the worldwide health care infrastructure. Emerging viral pandemics can place extraordinary and sustained demands on public health and health systems and on providers of essential community services. Modeling the Covid-19 pandemic spread is challenging. But there are data that can be used to project resource demands. Estimates of the reproductive number (R) of SARS-CoV-2 show that at the beginning of the epidemic, each infectedmore »person spreads the virus to at least two others, on average (Emanuel et al. in N Engl J Med. 2020, Livingston and Bucher in JAMA 323(14):1335, 2020). A conservatively low estimate is that 5 % of the population could become infected within 3 months. Preliminary data from China and Italy regarding the distribution of case severity and fatality vary widely (Wu and McGoogan in JAMA 323(13):1239–42, 2020). A recent large-scale analysis from China suggests that 80 % of those infected either are asymptomatic or have mild symptoms; a finding that implies that demand for advanced medical services might apply to only 20 % of the total infected. Of patients infected with Covid-19, about 15 % have severe illness and 5 % have critical illness (Emanuel et al. in N Engl J Med. 2020). Overall, mortality ranges from 0.25 % to as high as 3.0 % (Emanuel et al. in N Engl J Med. 2020, Wilson et al. in Emerg Infect Dis 26(6):1339, 2020). Case fatality rates are much higher for vulnerable populations, such as persons over the age of 80 years (> 14 %) and those with coexisting conditions (10 % for those with cardiovascular disease and 7 % for those with diabetes) (Emanuel et al. in N Engl J Med. 2020). Overall, Covid-19 is substantially deadlier than seasonal influenza, which has a mortality of roughly 0.1 %. Public health efforts depend heavily on predicting how diseases such as those caused by Covid-19 spread across the globe. During the early days of a new outbreak, when reliable data are still scarce, researchers turn to mathematical models that can predict where people who could be infected are going and how likely they are to bring the disease with them. These computational methods use known statistical equations that calculate the probability of individuals transmitting the illness. Modern computational power allows these models to quickly incorporate multiple inputs, such as a given disease’s ability to pass from person to person and the movement patterns of potentially infected people traveling by air and land. This process sometimes involves making assumptions about unknown factors, such as an individual’s exact travel pattern. By plugging in different possible versions of each input, however, researchers can update the models as new information becomes available and compare their results to observed patterns for the illness. In this paper we describe the development a model of Corona spread by using innovative big data analytics techniques and tools. We leveraged our experience from research in modeling Ebola spread (Shaw et al. Modeling Ebola Spread and Using HPCC/KEL System. In: Big Data Technologies and Applications 2016 (pp. 347-385). Springer, Cham) to successfully model Corona spread, we will obtain new results, and help in reducing the number of Corona patients. We closely collaborated with LexisNexis, which is a leading US data analytics company and a member of our NSF I/UCRC for Advanced Knowledge Enablement. The lack of a comprehensive view and informative analysis of the status of the pandemic can also cause panic and instability within society. Our work proposes the HPCC Systems Covid-19 tracker, which provides a multi-level view of the pandemic with the informative virus spreading indicators in a timely manner. The system embeds a classical epidemiological model known as SIR and spreading indicators based on causal model. The data solution of the tracker is built on top of the Big Data processing platform HPCC Systems, from ingesting and tracking of various data sources to fast delivery of the data to the public. The HPCC Systems Covid-19 tracker presents the Covid-19 data on a daily, weekly, and cumulative basis up to global-level and down to the county-level. It also provides statistical analysis for each level such as new cases per 100,000 population. The primary analysis such as Contagion Risk and Infection State is based on causal model with a seven-day sliding window. Our work has been released as a publicly available website to the world and attracted a great volume of traffic. The project is open-sourced and available on GitHub. The system was developed on the LexisNexis HPCC Systems, which is briefly described in the paper.« less
  2. Efficient contact tracing and isolation is an effective strategy to control epidemics. It was used effectively during the Ebola epidemic and successfully implemented in several parts of the world during the ongoing COVID-19 pandemic. An important consideration in contact tracing is the budget on the number of individuals asked to quarantine -- the budget is limited for socioeconomic reasons. In this paper, we present a Markov Decision Process (MDP) framework to formulate the problem of using contact tracing to reduce the size of an outbreak while asking a limited number of people to quarantine. We formulate each step of the MDP as a combinatorial problem, MinExposed, which we demonstrate is NP-Hard; as a result, we develop an LP-based approximation algorithm. Though this algorithm directly solves MinExposed, it is often impractical in the real world due to information constraints. To this end, we develop a greedy approach based on insights from the analysis of the previous algorithm, which we show is more interpretable. A key feature of the greedy algorithm is that it does not need complete information of the underlying social contact network. This makes the heuristic implementable in practice and is an important consideration. Finally, we carry out experimentsmore »on simulations of the MDP run on real-world networks, and show how the algorithms can help in bending the epidemic curve while limiting the number of isolated individuals. Our experimental results demonstrate that the greedy algorithm and its variants are especially effective, robust, and practical in a variety of realistic scenarios, such as when the contact graph and specific transmission probabilities are not known. All code can be found in our GitHub repository: this https URL.« less
  3. Abstract We employ individual-based Monte Carlo computer simulations of a stochastic SEIR model variant on a two-dimensional Newman–Watts small-world network to investigate the control of epidemic outbreaks through periodic testing and isolation of infectious individuals, and subsequent quarantine of their immediate contacts. Using disease parameters informed by the COVID-19 pandemic, we investigate the effects of various crucial mitigation features on the epidemic spreading: fraction of the infectious population that is identifiable through the tests; testing frequency; time delay between testing and isolation of positively tested individuals; and the further time delay until quarantining their contacts as well as the quarantine duration. We thus determine the required ranges for these intervention parameters to yield effective control of the disease through both considerable delaying the epidemic peak and massively reducing the total number of sustained infections.
  4. Althouse, Benjamin Muir (Ed.)
    Disease epidemic outbreaks on human metapopulation networks are often driven by a small number of superspreader nodes, which are primarily responsible for spreading the disease throughout the network. Superspreader nodes typically are characterized either by their locations within the network, by their degree of connectivity and centrality, or by their habitat suitability for the disease, described by their reproduction number ( R ). Here we introduce a model that considers simultaneously the effects of network properties and R on superspreaders, as opposed to previous research which considered each factor separately. This type of model is applicable to diseases for which habitat suitability varies by climate or land cover, and for direct transmitted diseases for which population density and mitigation practices influences R . We present analytical models that quantify the superspreader capacity of a population node by two measures: probability-dependent superspreader capacity, the expected number of neighboring nodes to which the node in consideration will randomly spread the disease per epidemic generation, and time-dependent superspreader capacity, the rate at which the node spreads the disease to each of its neighbors. We validate our analytical models with a Monte Carlo analysis of repeated stochastic Susceptible-Infected-Recovered (SIR) simulations on randomly generated humanmore »population networks, and we use a random forest statistical model to relate superspreader risk to connectivity, R , centrality, clustering, and diffusion. We demonstrate that either degree of connectivity or R above a certain threshold are sufficient conditions for a node to have a moderate superspreader risk factor, but both are necessary for a node to have a high-risk factor. The statistical model presented in this article can be used to predict the location of superspreader events in future epidemics, and to predict the effectiveness of mitigation strategies that seek to reduce the value of R , alter host movements, or both.« less
  5. Borri, Alessandro (Ed.)
    Ever since the outbreak of the COVID-19 epidemic, various public health control strategies have been proposed and tested against the coronavirus SARS-CoV-2. We study three specific COVID-19 epidemic control models: the susceptible, exposed, infectious, recovered (SEIR) model with vaccination control; the SEIR model with shield immunity control; and the susceptible, un-quarantined infected, quarantined infected, confirmed infected (SUQC) model with quarantine control. We express the control requirement in metric temporal logic (MTL) formulas (a type of formal specification languages) which can specify the expected control outcomes such as “ the deaths from the infection should never exceed one thousand per day within the next three months ” or “ the population immune from the disease should eventually exceed 200 thousand within the next 100 to 120 days ”. We then develop methods for synthesizing control strategies with MTL specifications. To the best of our knowledge, this is the first paper to systematically synthesize control strategies based on the COVID-19 epidemic models with formal specifications. We provide simulation results in three different case studies: vaccination control for the COVID-19 epidemic with model parameters estimated from data in Lombardy, Italy; shield immunity control for the COVID-19 epidemic with model parameters estimated from datamore »in Lombardy, Italy; and quarantine control for the COVID-19 epidemic with model parameters estimated from data in Wuhan, China. The results show that the proposed synthesis approach can generate control inputs such that the time-varying numbers of individuals in each category (e.g., infectious, immune) satisfy the MTL specifications. The results also show that early intervention is essential in mitigating the spread of COVID-19, and more control effort is needed for more stringent MTL specifications. For example, based on the model in Lombardy, Italy, achieving less than 100 deaths per day and 10000 total deaths within 100 days requires 441.7% more vaccination control effort than achieving less than 1000 deaths per day and 50000 total deaths within 100 days.« less