

Title: Examining the COVID-19 case growth rate due to visitor vs. local mobility in the United States using machine learning
Abstract

Travel patterns and mobility affect the spread of infectious diseases such as COVID-19. However, the extent to which local versus visitor mobility drives growth in the number of cases is not known. This study evaluates the impact of state-level local vs. visitor mobility on the growth in the number of COVID-19 cases in the United States between March 1, 2020, and December 31, 2020. Two metrics, local transmission risk and visitor transmission risk, were extracted from mobility data to capture the transmission potential of COVID-19 through mobility. Combinations of three factors (the current number of cases, the local transmission risk, and the visitor transmission risk) are used to model the future number of cases with various machine learning models. Factors that improve forecast performance are taken to be the ones that impact the number of cases. The statistical significance of the forecasts is also evaluated using the Diebold–Mariano test. Finally, model performance is compared for three waves across all 50 states. The results show that visitor mobility significantly impacts case growth, improving prediction accuracy by 33.78%. The impact of visitor mobility is also more pronounced during the first peak, i.e., March–June 2020.
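
The Diebold–Mariano comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes one-step-ahead forecasts, squared-error loss, and a normal approximation for the p-value; the function name `diebold_mariano` and the sample numbers are hypothetical and are not taken from the paper, whose exact test configuration is not given here.

```python
import math

def diebold_mariano(actual, forecast_a, forecast_b):
    """Illustrative one-step-ahead Diebold-Mariano statistic (squared-error loss).

    Returns (statistic, two_sided_p_value). A negative statistic means
    forecast_a has the lower average loss. Assumption: no autocorrelation
    correction is applied, which is only reasonable for horizon 1.
    """
    # Loss differential for each time step
    d = [(y - fa) ** 2 - (y - fb) ** 2
         for y, fa, fb in zip(actual, forecast_a, forecast_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value under the standard normal approximation
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value

# Made-up case counts and two competing forecasts (hypothetical data)
actual = [100, 120, 150, 190, 240, 300, 370, 450]
with_visitor = [102, 118, 153, 188, 243, 297, 372, 448]
without_visitor = [110, 105, 170, 170, 265, 270, 400, 410]

stat, p_value = diebold_mariano(actual, with_visitor, without_visitor)
print(f"DM statistic = {stat:.3f}, p-value = {p_value:.4f}")
```

For multi-step horizons the variance term would normally include autocovariances of the loss differential, which this sketch omits.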

 
Award ID(s):
1650551 2027688 1429526
NSF-PAR ID:
10381820
Author(s) / Creator(s):
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Scientific Reports
Volume:
12
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Abstract The objective of this study is to examine the transmission risk of COVID-19 based on cross-county population co-location data from Facebook. The rapid spread of COVID-19 in the United States has posed a major threat to public health, the real economy, and human well-being. In the absence of effective vaccines, the preventive actions of social distancing, travel reduction and stay-at-home orders are recognized as essential non-pharmacologic approaches to control the infection and spatial spread of COVID-19. Prior studies demonstrated that human movement and mobility drove the spatiotemporal distribution of COVID-19 in China. Little is known, however, about the patterns and effects of co-location reduction on cross-county transmission risk of COVID-19. This study utilizes Facebook co-location data for all counties in the United States from March to early May 2020 to conduct spatial network analysis in which nodes represent counties and edge weights are associated with the co-location probability of populations of the counties. The analysis examines the synchronicity and time lag between travel reduction and pandemic growth trajectory to evaluate the efficacy of social distancing in reducing population co-location probabilities and, subsequently, the growth in weekly new cases across counties. The results show that the mitigation effects of co-location reduction appear in the growth of weekly new confirmed cases with one week of delay. The analysis categorizes counties based on the number of confirmed COVID-19 cases and examines co-location patterns within and across groups. Significant segregation is found among different county groups. 
The results suggest that within-group co-location probabilities (e.g., co-location probabilities among counties with high numbers of cases) remain stable, and social distancing policies primarily resulted in reduced cross-group co-location probabilities (due to travel reduction from counties with large numbers of cases to counties with low numbers of cases). These findings could have important practical implications for local governments to inform their intervention measures for monitoring and reducing the spread of COVID-19, as well as for adoption in future pandemics. Public policy, economic forecasting, and epidemic modeling need to account for population co-location patterns in evaluating transmission risk of COVID-19 across counties. 
  2. Abstract

    The spatial distribution of population affects disease transmission, especially when shelter in place orders restrict mobility for a large fraction of the population. The spatial network structure of settlements therefore imposes a fundamental constraint on the spatial distribution of the population through which a communicable disease can spread. In this analysis we use the spatial network structure of lighted development as a proxy for the distribution of ambient population to compare the spatiotemporal evolution of COVID-19 confirmed cases in the USA and China. The Visible Infrared Imaging Radiometer Suite (VIIRS) Day/Night Band sensor on the NASA/NOAA Suomi satellite has been imaging night light at ~ 700 m resolution globally since 2012. Comparisons with sub-kilometer resolution census observations in different countries across different levels of development indicate that night light luminance scales with population density over ~ 3 orders of magnitude. However, VIIRS’ constant ~ 700 m resolution can provide a more detailed representation of population distribution in peri-urban and rural areas where aggregated census blocks lack comparable spatial detail. By varying the low luminance threshold of VIIRS-derived night light, we depict spatial networks of lighted development of varying degrees of connectivity within which populations are distributed. The resulting size distributions of spatial network components (connected clusters of nodes) vary with degree of connectivity, but maintain consistent scaling over a wide range (5 × to 10 × in area & number) of network sizes. At continental scales, spatial network rank-size distributions obtained from VIIRS night light brightness are well-described by power laws with exponents near −2 (slopes near −1) for a wide range of low luminance thresholds. 
The largest components (10⁴ to 10⁵ km²) represent spatially contiguous agglomerations of urban, suburban and periurban development, while the smallest components represent isolated rural settlements. Projecting county and city-level numbers of confirmed cases of COVID-19 for the USA and China (respectively) onto the corresponding spatial networks of lighted development allows the spatiotemporal evolution of the epidemic (infection and detection) to be quantified as propagation within networks of varying connectivity. Results for China show rapid nucleation and diffusion in January 2020 followed by rapid decreases in new cases in February. While most of the largest cities in China showed new confirmed cases approaching zero before the end of February, most of these cities also showed distinct second waves of cases in March or April. Whereas new cases in Wuhan did not approach zero until mid-March, as of December 2020 it has not yet experienced a second wave of cases. In contrast, the results for the USA show a wide range of trajectories, with an abrupt transition from slow increases in confirmed cases in a small number of network components in January and February, to rapid geographic dispersion to a larger number of components shortly before mobility reductions occurred in March. Results indicate that while most of the upper tail of the network had been exposed by the end of March, the lower tail of the component size distribution has only shown steep increases since mid-June.

     
  3. Abstract This project is funded by the US National Science Foundation (NSF) through their NSF RAPID program under the title “Modeling Corona Spread Using Big Data Analytics.” The project is a joint effort between the Department of Computer & Electrical Engineering and Computer Science at FAU and a research group from LexisNexis Risk Solutions. The novel coronavirus Covid-19 originated in China in early December 2019 and has rapidly spread to many countries around the globe, with the number of confirmed cases increasing every day. Covid-19 is officially a pandemic. It is a novel infection with serious clinical manifestations, including death, and it has reached at least 124 countries and territories. Although the ultimate course and impact of Covid-19 are uncertain, it is not merely possible but likely that the disease will produce enough severe illness to overwhelm the worldwide health care infrastructure. Emerging viral pandemics can place extraordinary and sustained demands on public health and health systems and on providers of essential community services. Modeling the Covid-19 pandemic spread is challenging. But there are data that can be used to project resource demands. Estimates of the reproductive number (R) of SARS-CoV-2 show that at the beginning of the epidemic, each infected person spreads the virus to at least two others, on average (Emanuel et al. in N Engl J Med. 2020, Livingston and Bucher in JAMA 323(14):1335, 2020). A conservatively low estimate is that 5 % of the population could become infected within 3 months. Preliminary data from China and Italy regarding the distribution of case severity and fatality vary widely (Wu and McGoogan in JAMA 323(13):1239–42, 2020). A recent large-scale analysis from China suggests that 80 % of those infected either are asymptomatic or have mild symptoms; a finding that implies that demand for advanced medical services might apply to only 20 % of the total infected. 
Of patients infected with Covid-19, about 15 % have severe illness and 5 % have critical illness (Emanuel et al. in N Engl J Med. 2020). Overall, mortality ranges from 0.25 % to as high as 3.0 % (Emanuel et al. in N Engl J Med. 2020, Wilson et al. in Emerg Infect Dis 26(6):1339, 2020). Case fatality rates are much higher for vulnerable populations, such as persons over the age of 80 years (> 14 %) and those with coexisting conditions (10 % for those with cardiovascular disease and 7 % for those with diabetes) (Emanuel et al. in N Engl J Med. 2020). Overall, Covid-19 is substantially deadlier than seasonal influenza, which has a mortality of roughly 0.1 %. Public health efforts depend heavily on predicting how diseases such as those caused by Covid-19 spread across the globe. During the early days of a new outbreak, when reliable data are still scarce, researchers turn to mathematical models that can predict where people who could be infected are going and how likely they are to bring the disease with them. These computational methods use known statistical equations that calculate the probability of individuals transmitting the illness. Modern computational power allows these models to quickly incorporate multiple inputs, such as a given disease’s ability to pass from person to person and the movement patterns of potentially infected people traveling by air and land. This process sometimes involves making assumptions about unknown factors, such as an individual’s exact travel pattern. By plugging in different possible versions of each input, however, researchers can update the models as new information becomes available and compare their results to observed patterns for the illness. In this paper we describe the development of a model of Corona spread by using innovative big data analytics techniques and tools. We leveraged our experience from research in modeling Ebola spread (Shaw et al. Modeling Ebola Spread and Using HPCC/KEL System. 
In: Big Data Technologies and Applications 2016 (pp. 347-385). Springer, Cham) to model Corona spread, with the aim of obtaining new results and helping to reduce the number of Corona patients. We closely collaborated with LexisNexis, which is a leading US data analytics company and a member of our NSF I/UCRC for Advanced Knowledge Enablement. The lack of a comprehensive view and informative analysis of the status of the pandemic can also cause panic and instability within society. Our work proposes the HPCC Systems Covid-19 tracker, which provides a multi-level view of the pandemic with informative virus spreading indicators in a timely manner. The system embeds a classical epidemiological model known as SIR and spreading indicators based on a causal model. The data solution of the tracker is built on top of the Big Data processing platform HPCC Systems, from ingesting and tracking of various data sources to fast delivery of the data to the public. The HPCC Systems Covid-19 tracker presents the Covid-19 data on a daily, weekly, and cumulative basis, from the global level down to the county level. It also provides statistical analysis for each level such as new cases per 100,000 population. The primary analysis such as Contagion Risk and Infection State is based on a causal model with a seven-day sliding window. Our work has been released as a publicly available website and has attracted a great volume of traffic. The project is open-sourced and available on GitHub. The system was developed on the LexisNexis HPCC Systems, which is briefly described in the paper. 
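
The classical SIR model named in the abstract above can be sketched as follows. This is a generic textbook formulation integrated with simple Euler steps, not the tracker's implementation; the function name `simulate_sir`, the parameter values, and the population size are all assumptions chosen for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Generic SIR sketch using Euler integration (hypothetical parameters).

    beta  : transmission rate per day (assumed)
    gamma : recovery rate per day (assumed)
    Returns a list of (susceptible, infected, recovered), one entry per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history

# beta / gamma = 2 gives a basic reproduction number of 2 (illustrative only)
history = simulate_sir(population=1_000_000, initial_infected=10,
                       beta=0.4, gamma=0.2, days=120)
peak_day = max(range(len(history)), key=lambda day: history[day][1])
print(f"Infections peak on day {peak_day}")
```

A production system would more likely fit the rates to reported case data and use a proper ODE solver; the fixed step size here is only for readability.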
  4. Wu, Joseph T. (Ed.)
    Colombia announced the first case of severe acute respiratory syndrome coronavirus 2 on March 6, 2020. Since then, the country has reported a total of 5,002,387 cases and 127,258 deaths as of October 31, 2021. The aggressive transmission dynamics of SARS-CoV-2 motivate an investigation of COVID-19 at the national and regional levels in Colombia. We utilize the case incidence and mortality data to estimate the transmission potential and generate short-term forecasts of the COVID-19 pandemic to inform public health policies using previously validated mathematical models. The analysis is augmented by the examination of geographic heterogeneity of COVID-19 at the departmental level along with the investigation of mobility and social media trends. Overall, the national and regional reproduction numbers show sustained disease transmission during the early phase of the pandemic, exhibiting sub-exponential growth dynamics, whereas the most recent estimates of the reproduction number indicate disease containment, with R_t < 1.0 as of October 31, 2021. On the forecasting front, the sub-epidemic model performs best at capturing the 30-day ahead COVID-19 trajectory compared to the Richards and generalized logistic growth models. Nevertheless, the spatial variability in the incidence rate patterns across different departments can be grouped into four distinct clusters. As the case incidence surged in July 2020, an increase in mobility patterns was also observed. In contrast, a spike in the number of tweets indicating the stay-at-home orders was observed in November 2020 when the case incidence had already plateaued, indicating pandemic fatigue in the country. 
  5. Abstract Background

    The Mexican Institute of Social Security (IMSS) is the largest health care provider in Mexico, covering about 48% of the Mexican population. In this report, we describe the epidemiological patterns related to confirmed cases, hospitalizations, intubations, and in-hospital mortality due to COVID-19 and associated factors, during five epidemic waves recorded in the IMSS surveillance system.

    Methods

    We analyzed COVID-19 laboratory-confirmed cases from the Online Epidemiological Surveillance System (SINOLAVE) from March 29th, 2020, to August 27th, 2022. We constructed weekly epidemic curves describing temporal patterns of confirmed cases and hospitalizations by age, gender, and wave. We also estimated hospitalization, intubation, and hospital case fatality rates. The mean days of in-hospital stay and hospital admission delay were calculated across five pandemic waves. Logistic regression models were employed to assess the association between demographic factors, comorbidities, wave, and vaccination and the risk of severe disease and in-hospital death.

    Results

    A total of 3,396,375 laboratory-confirmed COVID-19 cases were recorded across the five waves. The introduction of rapid antigen testing at the end of 2020 increased detection and modified epidemiological estimates. Overall, 11% (95% CI 10.9, 11.1) of confirmed cases were hospitalized, 20.6% (95% CI 20.5, 20.7) of the hospitalized cases were intubated, and the hospital case fatality rate was 45.1% (95% CI 44.9, 45.3). The mean in-hospital stay was 9.11 days, and patients were admitted on average 5.07 days after symptom onset. The most recent waves dominated by the Omicron variant had the highest incidence. Hospitalization, intubation, and mean hospitalization days decreased during subsequent waves. The in-hospital case fatality rate fluctuated across waves, reaching its highest value during the second wave in winter 2020. A notable decrease in hospitalization was observed primarily among individuals ≥ 60 years. The risk of severe disease and death was positively associated with comorbidities, age, and male gender, and declined with later waves and vaccination status.

    Conclusion

    During the five pandemic waves, we observed an increase in the number of cases and a reduction in severity metrics. During the first three waves, the high in-hospital fatality rate was associated with hospitalization practices for critical patients with comorbidities.

     