Title: From 5Vs to 6Cs: Operationalizing Epidemic Data Management with COVID-19 Surveillance
Abstract: The COVID-19 pandemic brought to the forefront an unprecedented need for experts, as well as citizens, to visualize spatio-temporal disease surveillance data. Web application dashboards were quickly developed to fill this gap, but all of these dashboards supported a particular niche view of the pandemic (i.e., current status or specific regions). In this paper, we describe our work developing our COVID-19 Surveillance Dashboard, which offers a unique view of the pandemic while also allowing users to focus on the details that interest them. From the beginning, our goal was to provide a simple visual tool for comparing, organizing, and tracking near-real-time surveillance data as the pandemic progresses. In developing this dashboard, we also identified 6 key metrics which we propose as a standard for the design and evaluation of real-time epidemic science dashboards. Our dashboard was one of the first released to the public, and continues to be actively visited. Our own group uses it to support federal, state and local public health authorities, and it is used by individuals worldwide to track the evolution of the COVID-19 pandemic, build their own dashboards, and support their organizations as they plan their responses to the pandemic.
Award ID(s):
1633028 1443054 1916805 1918656 2028004 2027541
NSF-PAR ID:
10403995
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
2020 IEEE International Conference on Big Data (Big Data)
Page Range / eLocation ID:
1380 to 1387
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The COVID-19 pandemic brought to the forefront an unprecedented need for experts, as well as citizens, to visualize spatio-temporal disease surveillance data. Web application dashboards were quickly developed to fill this gap, including those built by JHU, WHO, and CDC, but all of these dashboards supported a particular niche view of the pandemic (i.e., current status or specific regions). In this paper, we describe our work developing our own COVID-19 Surveillance Dashboard, available at https://nssac.bii.virginia.edu/covid19/dashboard/, which offers a universal view of the pandemic while also allowing users to focus on the details that interest them. From the beginning, our goal was to provide a simple visual way to compare, organize, and track near-real-time surveillance data as the pandemic progresses. Our dashboard includes a number of advanced features for zooming, filtering, categorizing and visualizing multiple time series on a single canvas. In developing this dashboard, we have also identified 6 key metrics, which we call the 6Cs and propose as a standard for the design and evaluation of real-time epidemic science dashboards. Our dashboard was one of the first released to the public, and remains one of the most visited and highly used. Our group uses it to support federal, state and local public health authorities, and it is used by people worldwide to track the pandemic evolution, build their own dashboards, and support their organizations as they plan their responses to the pandemic. We illustrate the utility of our dashboard by describing how it can be used to support data story-telling, an important emerging area in data science.
  2. During COVID-19, countless dashboards have served as central media where people learn critical information about the pandemic. Varied actors, including news organizations, government agencies, universities, and NGOs created and maintained these dashboards, conducting the onerous labor of collecting, categorizing, and taking care of COVID data. This study uncovers different forms of data practices and labor behind the building of these dashboards, based on in-depth interviews with volunteers and practitioners across India and the United States who have participated in COVID dashboard projects. Specifically, we are interested in projects that have focused on underrepresented or missing COVID data such as COVID cases in prisons and long-term care facilities, racial/ethnic breakdown of cases, as well as deaths due to COVID enforcement. These data builders employed sometimes creative, sometimes mundane and laborious data practices to not simply collect, but to produce these data that are often invisible in the official COVID dataset. In this process of data production, dashboard builders grappled with the questions of how certain data are collected, who/what is missing from the dataset, and how these data voids shape and manipulate our understanding of the pandemic. Interviewing 74 data builders who participated in COVID dashboard projects, this paper demonstrates the range of underrepresented and messy COVID data that these data builders have identified, fixed, and maintained to render them useful: disappearing data, lumped data, and absent data. Such critical engagement with messy COVID data reveals different data injustices that have tremendous potential to affect future pandemic preparation and management.
  3. Abstract

    A year after the declaration of the global coronavirus disease 2019 (COVID-19) pandemic, there were over 110 million cases and 2.5 million deaths. Learning from methods to track community spread of other viruses such as poliovirus, environmental virologists and those in the wastewater-based epidemiology (WBE) field quickly adapted their existing methods to detect SARS-CoV-2 RNA in wastewater. Unlike COVID-19 case and mortality data, there was not a global dashboard to track wastewater monitoring of SARS-CoV-2 RNA worldwide. This study provides a 1-year review of the “COVIDPoops19” global dashboard of universities, sites, and countries monitoring SARS-CoV-2 RNA in wastewater. Methods to assemble the dashboard combined standard literature review, Google Form submissions, and daily social media keyword searches. Over 200 universities, 1400 sites, and 55 countries with 59 dashboards monitored wastewater for SARS-CoV-2 RNA. However, monitoring was primarily in high-income countries (65%) with less access to this valuable tool in low- and middle-income countries (35%). Data were not widely shared publicly or accessible to researchers to further inform public health actions, perform meta-analysis, better coordinate, and determine equitable distribution of monitoring sites. For WBE to be used to its full potential during COVID-19 and beyond, show us the data.
  4. Abstract This project is funded by the US National Science Foundation (NSF) through their NSF RAPID program under the title “Modeling Corona Spread Using Big Data Analytics.” The project is a joint effort between the Department of Computer & Electrical Engineering and Computer Science at FAU and a research group from LexisNexis Risk Solutions. The novel coronavirus Covid-19 originated in China in early December 2019 and has rapidly spread to many countries around the globe, with the number of confirmed cases increasing every day. Covid-19 is officially a pandemic. It is a novel infection with serious clinical manifestations, including death, and it has reached at least 124 countries and territories. Although the ultimate course and impact of Covid-19 are uncertain, it is not merely possible but likely that the disease will produce enough severe illness to overwhelm the worldwide health care infrastructure. Emerging viral pandemics can place extraordinary and sustained demands on public health and health systems and on providers of essential community services. Modeling the Covid-19 pandemic spread is challenging. But there are data that can be used to project resource demands. Estimates of the reproductive number (R) of SARS-CoV-2 show that at the beginning of the epidemic, each infected person spreads the virus to at least two others, on average (Emanuel et al. in N Engl J Med. 2020, Livingston and Bucher in JAMA 323(14):1335, 2020). A conservatively low estimate is that 5 % of the population could become infected within 3 months. Preliminary data from China and Italy regarding the distribution of case severity and fatality vary widely (Wu and McGoogan in JAMA 323(13):1239–42, 2020). A recent large-scale analysis from China suggests that 80 % of those infected either are asymptomatic or have mild symptoms; a finding that implies that demand for advanced medical services might apply to only 20 % of the total infected. 
Of patients infected with Covid-19, about 15 % have severe illness and 5 % have critical illness (Emanuel et al. in N Engl J Med. 2020). Overall, mortality ranges from 0.25 % to as high as 3.0 % (Emanuel et al. in N Engl J Med. 2020, Wilson et al. in Emerg Infect Dis 26(6):1339, 2020). Case fatality rates are much higher for vulnerable populations, such as persons over the age of 80 years (> 14 %) and those with coexisting conditions (10 % for those with cardiovascular disease and 7 % for those with diabetes) (Emanuel et al. in N Engl J Med. 2020). Overall, Covid-19 is substantially deadlier than seasonal influenza, which has a mortality of roughly 0.1 %. Public health efforts depend heavily on predicting how diseases such as those caused by Covid-19 spread across the globe. During the early days of a new outbreak, when reliable data are still scarce, researchers turn to mathematical models that can predict where people who could be infected are going and how likely they are to bring the disease with them. These computational methods use known statistical equations that calculate the probability of individuals transmitting the illness. Modern computational power allows these models to quickly incorporate multiple inputs, such as a given disease’s ability to pass from person to person and the movement patterns of potentially infected people traveling by air and land. This process sometimes involves making assumptions about unknown factors, such as an individual’s exact travel pattern. By plugging in different possible versions of each input, however, researchers can update the models as new information becomes available and compare their results to observed patterns for the illness. In this paper we describe the development of a model of Corona spread by using innovative big data analytics techniques and tools. We leveraged our experience from research in modeling Ebola spread (Shaw et al. Modeling Ebola Spread and Using HPCC/KEL System. In: Big Data Technologies and Applications 2016 (pp. 347-385). Springer, Cham) to model Corona spread, with the aim of obtaining new results and helping to reduce the number of Corona patients. We closely collaborated with LexisNexis, which is a leading US data analytics company and a member of our NSF I/UCRC for Advanced Knowledge Enablement. The lack of a comprehensive view and informative analysis of the status of the pandemic can also cause panic and instability within society. Our work proposes the HPCC Systems Covid-19 tracker, which provides a multi-level view of the pandemic with informative virus spreading indicators in a timely manner. The system embeds a classical epidemiological model known as SIR and spreading indicators based on a causal model. The data solution of the tracker is built on top of the Big Data processing platform HPCC Systems, from ingesting and tracking of various data sources to fast delivery of the data to the public. The HPCC Systems Covid-19 tracker presents the Covid-19 data on a daily, weekly, and cumulative basis, from the global level down to the county level. It also provides statistical analysis for each level, such as new cases per 100,000 population. The primary analysis, such as Contagion Risk and Infection State, is based on a causal model with a seven-day sliding window. Our work has been released as a publicly available website to the world and attracted a great volume of traffic. The project is open-sourced and available on GitHub. The system was developed on the LexisNexis HPCC Systems, which is briefly described in the paper.
  5. Due to the growing concerns surrounding the COVID-19 pandemic, colleges and universities either canceled or remotely hosted their 2020 National Science Foundation Research Experience for Undergraduates (REU) programs. This analysis is part of a larger study examining the impact of these fully remote experiences on professional and psychosocial factors such as mentees' sense of belonging, identity, and self-efficacy and their retention in STEM degree programs. We present a single-student case study and describe the dramaturgical analysis which centers on identifying five fundamental constructs within the data: objectives, conflicts, tactics, attitudes, and emotions. These items investigate what the participant in the remote REU program experienced and how this experience changed the ways in which he thinks about his future career decision-making. Our analysis explored four different sub-narratives: lack of community in virtual REU, mentor support, perception of the "real" nature of the experience in a virtual format, and future career decision-making. The mentee reported that this experience was highly beneficial and that he developed a sense of belonging and identity, despite working remotely -- often from his own bedroom. 
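Item 4 in the list above mentions two quantities that are easy to misread from prose alone: new cases per 100,000 population over a seven-day sliding window, and the classical SIR model. The Python sketch below is only an illustration of those two textbook ideas and is not taken from the HPCC Systems tracker. The function names, the forward-Euler time step, and the sample parameter values (beta, gamma) are all assumptions made for this example.

```python
from typing import List, Tuple

State = Tuple[float, float, float]  # (susceptible, infected, recovered)


def new_cases_per_100k(cumulative: List[int], population: int,
                       window: int = 7) -> List[float]:
    """New cases per 100,000 people over a trailing window of `window` days.

    `cumulative` holds cumulative confirmed case counts, one entry per day.
    The result has one value per day, starting at day index `window`.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    rates = []
    for day in range(window, len(cumulative)):
        new_cases = cumulative[day] - cumulative[day - window]
        rates.append(new_cases * 100_000 / population)
    return rates


def sir_step(state: State, beta: float, gamma: float, dt: float = 1.0) -> State:
    """One forward-Euler step of the classical SIR equations.

    dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I
    """
    s, i, r = state
    n = s + i + r
    new_infections = beta * s * i / n * dt
    new_recoveries = gamma * i * dt
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)


def simulate_sir(initial: State, beta: float, gamma: float,
                 days: int) -> List[State]:
    """Iterate sir_step for `days` steps and return every daily state."""
    trajectory = [initial]
    for _ in range(days):
        trajectory.append(sir_step(trajectory[-1], beta, gamma))
    return trajectory


if __name__ == "__main__":
    # Hypothetical county of 100,000 people with made-up cumulative counts.
    counts = [0, 10, 20, 30, 40, 50, 60, 70, 90]
    print(new_cases_per_100k(counts, population=100_000))
    # Hypothetical parameters: beta=0.3 and gamma=0.1 per day.
    print(simulate_sir((990.0, 10.0, 0.0), beta=0.3, gamma=0.1, days=3)[-1])
```

In this sketch the ratio beta/gamma plays the role of the basic reproductive number in the textbook SIR model, so values above 1 make the infected compartment grow at first. A production system would more likely use a proper ODE solver and real surveillance feeds.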