Title: Modeling and tracking Covid-19 cases using Big Data analytics on HPCC system platform
Abstract: This project is funded by the US National Science Foundation (NSF) through their NSF RAPID program under the title “Modeling Corona Spread Using Big Data Analytics.” The project is a joint effort between the Department of Computer & Electrical Engineering and Computer Science at FAU and a research group from LexisNexis Risk Solutions. The novel coronavirus Covid-19 originated in China in early December 2019 and has rapidly spread to many countries around the globe, with the number of confirmed cases increasing every day. Covid-19 is officially a pandemic. It is a novel infection with serious clinical manifestations, including death, and it has reached at least 124 countries and territories. Although the ultimate course and impact of Covid-19 are uncertain, it is not merely possible but likely that the disease will produce enough severe illness to overwhelm the worldwide health care infrastructure. Emerging viral pandemics can place extraordinary and sustained demands on public health and health systems and on providers of essential community services. Modeling the Covid-19 pandemic spread is challenging, but there are data that can be used to project resource demands. Estimates of the reproductive number (R) of SARS-CoV-2 show that at the beginning of the epidemic, each infected person spreads the virus to at least two others, on average (Emanuel et al. in N Engl J Med. 2020, Livingston and Bucher in JAMA 323(14):1335, 2020). A conservatively low estimate is that 5 % of the population could become infected within 3 months. Preliminary data from China and Italy regarding the distribution of case severity and fatality vary widely (Wu and McGoogan in JAMA 323(13):1239–42, 2020). A recent large-scale analysis from China suggests that 80 % of those infected either are asymptomatic or have mild symptoms, a finding that implies that demand for advanced medical services might apply to only 20 % of the total infected.
Of patients infected with Covid-19, about 15 % have severe illness and 5 % have critical illness (Emanuel et al. in N Engl J Med. 2020). Overall, mortality ranges from 0.25 % to as high as 3.0 % (Emanuel et al. in N Engl J Med. 2020, Wilson et al. in Emerg Infect Dis 26(6):1339, 2020). Case fatality rates are much higher for vulnerable populations, such as persons over the age of 80 years (> 14 %) and those with coexisting conditions (10 % for those with cardiovascular disease and 7 % for those with diabetes) (Emanuel et al. in N Engl J Med. 2020). Overall, Covid-19 is substantially deadlier than seasonal influenza, which has a mortality of roughly 0.1 %. Public health efforts depend heavily on predicting how diseases such as Covid-19 spread across the globe. During the early days of a new outbreak, when reliable data are still scarce, researchers turn to mathematical models that can predict where people who could be infected are going and how likely they are to bring the disease with them. These computational methods use known statistical equations that calculate the probability of individuals transmitting the illness. Modern computational power allows these models to quickly incorporate multiple inputs, such as a given disease’s ability to pass from person to person and the movement patterns of potentially infected people traveling by air and land. This process sometimes involves making assumptions about unknown factors, such as an individual’s exact travel pattern. By plugging in different possible versions of each input, however, researchers can update the models as new information becomes available and compare their results to observed patterns for the illness. In this paper we describe the development of a model of Corona spread using innovative big data analytics techniques and tools. We leveraged our experience from research in modeling Ebola spread (Shaw et al. Modeling Ebola Spread and Using HPCC/KEL System. In: Big Data Technologies and Applications 2016 (pp. 347-385). Springer, Cham) to model Corona spread, with the aim of obtaining new results and helping to reduce the number of Corona patients.
We closely collaborated with LexisNexis, which is a leading US data analytics company and a member of our NSF I/UCRC for Advanced Knowledge Enablement. The lack of a comprehensive view and informative analysis of the status of the pandemic can also cause panic and instability within society. Our work proposes the HPCC Systems Covid-19 tracker, which provides a multi-level view of the pandemic with informative virus spreading indicators in a timely manner. The system embeds a classical epidemiological model known as SIR and spreading indicators based on a causal model. The data solution of the tracker is built on top of the Big Data processing platform HPCC Systems, from ingestion and tracking of various data sources to fast delivery of the data to the public. The HPCC Systems Covid-19 tracker presents the Covid-19 data on a daily, weekly, and cumulative basis, from the global level down to the county level. It also provides statistical analysis for each level, such as new cases per 100,000 population. The primary analysis, such as Contagion Risk and Infection State, is based on a causal model with a seven-day sliding window. Our work has been released as a publicly available website and has attracted a large volume of traffic. The project is open-sourced and available on GitHub. The system was developed on the LexisNexis HPCC Systems platform, which is briefly described in the paper.
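As a rough illustration of the quantities the abstract names (the SIR model, new cases per 100,000 population, and a seven-day sliding window), the following minimal Python sketch may help. It is not the tracker's implementation, which runs on HPCC Systems; the function names, the one-day forward-Euler step, and the example rates `beta = 0.4` and `gamma = 0.2` (giving a reproduction number of 2, in line with the "at least two others" estimate above) are all assumptions made for illustration only.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model (forward Euler, one-day step).

    beta  : assumed transmission rate per day (illustrative value only)
    gamma : assumed recovery rate per day (illustrative value only)
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population  # S -> I flow
        new_recoveries = gamma * i                  # I -> R flow
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def new_cases_per_100k(new_cases, population):
    """Normalize a case count to a rate per 100,000 population."""
    return new_cases * 100_000 / population


def seven_day_sums(daily_new_cases):
    """Sliding seven-day totals; the first value covers days 0 to 6."""
    return [
        sum(daily_new_cases[k - 6:k + 1])
        for k in range(6, len(daily_new_cases))
    ]


if __name__ == "__main__":
    # Hypothetical population of 1,000 with a single initial case.
    history = simulate_sir(1_000, 1, beta=0.4, gamma=0.2, days=60)
    peak_day = max(range(len(history)), key=lambda d: history[d][1])
    print("basic reproduction number:", 0.4 / 0.2)
    print("day of peak infections:", peak_day)
    print("rate per 100k:", new_cases_per_100k(50, 1_000_000))
```

The ratio `beta / gamma` plays the role of the reproductive number R discussed in the abstract. The Contagion Risk and Infection State indicators are not reproduced here because the abstract does not give their formulas.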
Award ID(s):
2027890
PAR ID:
10353138
Journal Name:
Journal of Big Data
Volume:
8
Issue:
1
ISSN:
2196-1115
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Lischka, A. E.; Dyer, E. B.; Jones, R. S.; Lovett, J. N.; Strayer, J.; & Drown, S. (Ed.)
    Many higher education institutions in the United States provide mathematics tutoring services for undergraduate students. These informal learning experiences generally result in increased final course grades (Byerly & Rickard, 2018; Rickard & Mills, 2018; Xu et al., 2014) and improved student attitudes toward mathematics (Bressoud et al., 2015). In recent years, research has explored the beliefs and practices of undergraduate and, sometimes, graduate peer tutors, both prior to (Bjorkman, 2018; Johns, 2019; Pilgrim et al., 2020) and during the COVID-19 pandemic (Gyampoh et al., 2020; Mullen et al., 2021; Van Maaren et al., 2021). Additionally, Burks and James (2019) proposed a framework for Mathematical Knowledge for Tutoring Undergraduate Mathematics, adapted from Ball et al.’s (2008) Mathematical Knowledge for Teaching, highlighting the distinction between tutor and teacher. The current study builds on this body of work on tutors’ beliefs by focusing on mathematical sciences graduate teaching assistants (GTAs) who tutored in an online setting during the 2020-2021 academic year due to the COVID-19 pandemic. Specifically, this study addresses the following research question: What were the mathematical teaching beliefs and practices of graduate student tutors participating in online tutoring sessions through the mathematics learning center (MLC) during the COVID-19 pandemic?
  2. Alam, Mumtaz (Ed.)
    When COVID-19 was first introduced to the United States, state and local governments enacted a variety of policies intended to mitigate the virulence of the epidemic. At the time, the most effective measures to prevent the spread of COVID-19 included stay-at-home orders, closing of nonessential businesses, and mask mandates. Although it was well known that regions with high population density and cold climates were at the highest risk for disease spread, rural counties that are economically reliant on tourism were incentivized to enact fewer precautions against COVID-19. The uncertainty of the COVID-19 pandemic, the multiple policies to reduce transmission, and the changes in outdoor recreation behavior had a significant impact on rural tourism destinations and management of protected spaces. We utilize fine-scale incidence and demographic data to study the relationship between local economic and political concerns, COVID-19 mitigation measures, and the subsequent severity of outbreaks throughout the continental United States. We also present results from an online survey that measured travel behavior, health risk perceptions, knowledge and experience with COVID-19, and evaluation of destination attributes by 407 out-of-state visitors who traveled to Maine from 2020 to 2021. We synthesize this research to present a narrative on how perceptions of COVID-19 risk and public perceptions of rural tourism put certain communities at greater risk of illness throughout 2020. This research could inform future rural destination management and public health policies to help reduce negative socioeconomic, health and environmental impacts of pandemic-derived changes in travel and outdoor recreation behavior. 
  3.
    Situational awareness provides the decision making capability to identify, process, and comprehend big data. In our approach, situational awareness is achieved by integrating and analyzing multiple aspects of data using stacked bar graphs and geographic representations of the data. We provide a data visualization tool to represent COVID pandemic data on top of the geographical information. The combination of geospatial and temporal data provides the information needed to conduct situational analysis for the COVID-19 pandemic. By providing interactivity, geographical maps can be viewed from different perspectives and offer insight into the dynamical aspects of the COVID-19 pandemic for the fifty states in the USA. We have overlaid dynamic information on top of a geographical representation in an intuitive way for decision making. We describe how modeling and simulation of data increase situational awareness, especially when coupled with immersive virtual reality interaction. This paper presents an immersive virtual reality (VR) environment and mobile environment for data visualization using Oculus Rift head-mounted display and smartphones. This work combines neural network predictions with human-centric situational awareness and data analytics to provide accurate, timely, and scientific strategies in combatting and mitigating the spread of the coronavirus pandemic. Testing and evaluation of the data visualization tool have been done with real-time feed of COVID pandemic data set for immersive environment, non-immersive environment, and mobile environment. 
  4. An outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), epicentred in Hubei Province of the People’s Republic of China, quickly spread worldwide and caused the COVID-19 pandemic. It infected hundreds of millions of people and caused millions of deaths. In this paper, we develop a compartmental ODE model of COVID-19 transmission. We consider the possibility of breakthrough infections after vaccination and account for both symptomatic and asymptomatic infections and transmissions. We also incorporate game theory to study the optimal vaccination decisions from the individuals’ perspective. We show that vaccination alone is unlikely to eliminate COVID-19. To achieve herd immunity, individuals would have to receive a dose of a vaccine more frequently than once every 3 months. It is therefore crucial to adhere to various guidelines, such as quarantining, isolating, and wearing a mask after testing positive for COVID-19.
  5.
    The sudden outbreak of the COVID-19 pandemic has brought drastic changes to people’s daily lives, work, and the surrounding environment. Investigations into these changes are very important for decision makers to implement policies on economic loss assessments and stimulation packages, city reopening, resilience of the environment, and arrangement of medical resources. In order to analyze the impact of COVID-19 on people’s lives, activities, and the natural environment, this paper investigates the spatial and temporal characteristics of Nighttime Light (NTL) radiance and Air Quality Index (AQI) before and during the pandemic in mainland China. The monthly mean NTL radiance, and daily and monthly mean AQI are calculated over mainland China and compared before and during the pandemic. Our results show that the monthly average NTL brightness is much lower during the quarantine period than before. This study categorizes NTL into three classes: residential areas; transportation and public facilities; and commercial centers, with NTL radiance ranges of 5–20, 20–40 and greater than 40 nW·cm⁻²·sr⁻¹, respectively. We found that the Number of Pixels (NOP) with NTL detection increased in the residential areas and decreased in the commercial centers for most of the provinces after the shutdown, while transportation and public facilities generally stayed the same. More specifically, we examined these factors in Wuhan, where the first confirmed cases were reported, and where the earliest quarantine measures were taken. Pixels associated with commercial centers were observed to have lower NTL radiance values, indicating dimming, while residential area pixels recorded increased levels of brightness after the beginning of the lockdown. The study also discovered a significant decreasing trend in the daily average AQI for mainland China from January to March 2020, with cleaner air in most provinces during February and March, compared to January 2020. In conclusion, the outbreak and spread of COVID-19 has had a major impact on people’s daily lives and activity ranges through the increased implementation of lockdown and quarantine policies. On the other hand, the air quality of mainland China has improved with the reduction in non-essential industries and motor vehicle usage. This evidence demonstrates that the Chinese government has executed very stringent quarantine policies to deal with the pandemic. The decisive response to control the spread of COVID-19 provides a reference for other parts of the world.