Title: From 5Vs to 6Cs: Operationalizing Epidemic Data Management with COVID-19 Surveillance
The COVID-19 pandemic brought to the forefront an unprecedented need for experts, as well as citizens, to visualize spatio-temporal disease surveillance data. Web application dashboards were quickly developed to fill this gap, including those built by JHU, WHO, and CDC, but all of these dashboards supported a particular niche view of the pandemic (i.e., current status or specific regions). In this paper, we describe our work developing our own COVID-19 Surveillance Dashboard, available at, which offers a universal view of the pandemic while also allowing users to focus on the details that interest them. From the beginning, our goal was to provide a simple visual way to compare, organize, and track near-real-time surveillance data as the pandemic progresses. Our dashboard includes a number of advanced features for zooming, filtering, categorizing and visualizing multiple time series on a single canvas. In developing this dashboard, we have also identified six key metrics, which we call the 6Cs, and we propose them as a standard for the design and evaluation of real-time epidemic science dashboards. Our dashboard was one of the first released to the public, and remains one of the most visited and highly used. Our group uses it to support federal, state and local public health authorities, and it is used by people worldwide to track the pandemic evolution, build their own dashboards, and support their organizations as they plan their responses to the pandemic. We illustrate the utility of our dashboard by describing how it can be used to support data story-telling – an important emerging area in data science.
Award ID(s):
1633028 1443054 1916805
Sponsoring Org:
National Science Foundation
More Like this
  1. The outbreak of the novel coronavirus, COVID-19, has become one of the most severe pandemics in human history. In this paper, we propose to leverage social media users as social sensors to simultaneously predict the pandemic trends and suggest potential risk factors for public health experts to understand spread situations and recommend proper interventions. More precisely, we develop novel deep learning models to recognize important entities and their relations over time, thereby establishing dynamic heterogeneous graphs to describe the observations of social media users. A dynamic graph neural network model can then forecast the trends (e.g. newly diagnosed cases and death rates) and identify high-risk events from social media. Based on the proposed computational method, we also develop a web-based system that domain experts without any computer science background can easily interact with. We conduct extensive experiments on large-scale datasets of COVID-19 related tweets provided by Twitter, which show that our method can precisely predict the new cases and death rates. We also demonstrate the robustness of our web-based pandemic surveillance system and its ability to retrieve essential knowledge and derive accurate predictions across a variety of circumstances. Our system is also available at . This article is part of the theme issue ‘Data science approaches to infectious disease surveillance’.
  2. As countries look toward re-opening of economic activities amidst the ongoing COVID-19 pandemic, ensuring public health has been challenging. While contact tracing only aims to track past activities of infected users, one path to safe reopening is to develop reliable spatiotemporal risk scores to indicate the propensity of the disease. Existing works which aim at developing risk scores either rely on compartmental model-based reproduction numbers (which assume uniform population mixing) or develop coarse-grain spatial scores based on reproduction number (R0) and macro-level density-based mobility statistics. Instead, in this article, we develop a Hawkes process-based technique to assign relatively fine-grain spatial and temporal risk scores by leveraging high-resolution mobility data based on cell-phone originated location signals. While COVID-19 risk scores also depend on a number of factors specific to an individual, including demography and existing medical conditions, the primary mode of disease transmission is via physical proximity and contact. Therefore, we focus on developing risk scores based on location density and mobility behaviour. We demonstrate the efficacy of the developed risk scores via simulation based on real-world mobility data. Our results show that fine-grain spatiotemporal risk scores based on high-resolution mobility data can provide useful insights and facilitate safe re-opening.
  3. This research-track work-in-progress paper contributes to engineering education by documenting progress in developing a new standard Engineering Computational Thinking Diagnostic to measure engineering student success in five factors of computational thinking. Over the past year, results from an initial validation attempt were used to refine diagnostic questions. A second statistical validation attempt was then completed in Spring 2021 with 191 student participants at three universities. Statistics show that all diagnostic questions had statistically significant factor loadings onto one general computational thinking factor that incorporates the five original factors of (a) Abstraction, (b) Algorithmic Thinking, (c) Decomposition, (d) Data Representation and Organization, and (e) Impact of Computing. This result was unexpected as our goal was a diagnostic that could discriminate among the five factors. A small population size caused by the virtual delivery of courses during the COVID-19 pandemic may be the explanation, and a third round of validation in Fall 2021 is expected to result in a larger population given the return to face-to-face instruction. When statistical validation is completed, the diagnostic will help institutions identify students with strong entry level skills in computational thinking as well as students that require academic support. The diagnostic will inform curriculum design by demonstrating which factors are more accessible to engineering students and which factors need more time and focus in the classroom. The long-term impact of a successfully validated computational thinking diagnostic will be introductory engineering courses that better serve engineering students coming from many backgrounds. This can increase student self-efficacy, improve student retention, and improve student enculturation into the engineering profession. Currently, the diagnostic identifies general computational thinking skill
  4. The deployment of vaccines across the US provides significant defense against serious illness and death from COVID-19. Over 70% of vaccine-eligible Americans are at least partially vaccinated, but there are pockets of the population that are under-vaccinated, such as in rural areas and some demographic groups (e.g. age, race, ethnicity). These unvaccinated pockets are extremely susceptible to the Delta variant, exacerbating the healthcare crisis and increasing the risk of new variants. In this paper, we describe a data-driven model that provides real-time support to Virginia public health officials by recommending mobile vaccination site placement in order to target under-vaccinated populations. Our strategy uses fine-grained mobility data, along with US Census and vaccination uptake data, to identify locations that are most likely to be visited by unvaccinated individuals. We further extend our model to choose locations that maximize vaccine uptake among hesitant groups. We show that the top recommended sites vary substantially across some demographics, demonstrating the value of developing customized recommendation models that integrate fine-grained, heterogeneous data sources. In addition, we used a statistically equivalent Synthetic Population to study the effect of combined demographics (e.g., people of a particular race and age), which is not possible using US Census data alone. We validate our recommendations by analyzing the success rates of deployed vaccine sites, and show that sites placed closer to our recommended areas administered higher numbers of doses. Our model is the first of its kind to consider evolving mobility patterns in real-time for suggesting placement strategies customized for different targeted demographic groups. Our results will be presented at IAAI-22, but given the critical nature of the pandemic, we offer this extended version of that paper for more timely consideration of our approach and to cover additional findings.
  5. We study allocation of COVID-19 vaccines to individuals based on the structural properties of their underlying social contact network. Even optimistic estimates suggest that most countries will likely take 6 to 24 months to vaccinate their citizens. These time estimates and the emergence of new viral strains urge us to find quick and effective ways to allocate the vaccines and contain the pandemic. While current approaches use combinations of age-based and occupation-based prioritizations, our strategy marks a departure from such largely aggregate vaccine allocation strategies. We propose a novel agent-based modeling approach motivated by recent advances in (i) science of real-world networks that point to efficacy of certain vaccination strategies and (ii) digital technologies that improve our ability to estimate some of these structural properties. Using a realistic representation of a social contact network for the Commonwealth of Virginia, combined with accurate surveillance data on spatio-temporal cases and currently accepted models of within- and between-host disease dynamics, we study how a limited number of vaccine doses can be strategically distributed to individuals to reduce the overall burden of the pandemic. We show that allocation of vaccines based on individuals' degree (number of social contacts) and total social proximity time is significantly more effective than the currently used age-based allocation strategy in terms of number of infections, hospitalizations and deaths. Our results suggest that in just two months, by March 31, 2021, compared to age-based allocation, the proposed degree-based strategy can result in reducing an additional 56–110k infections, 3.2–5.4k hospitalizations, and 700–900 deaths just in the Commonwealth of Virginia. Extrapolating these results for the entire US, this strategy can lead to 3–6 million fewer infections, 181–306k fewer hospitalizations, and 51–62k fewer deaths compared to age-based allocation. The overall strategy is robust even: (i) if the social contacts are not estimated correctly; (ii) if the vaccine efficacy is lower than expected or only a single dose is given; (iii) if there is a delay in vaccine production and deployment; and (iv) whether or not non-pharmaceutical interventions continue as vaccines are deployed. For reasons of implementability, we have used degree, which is a simple structural measure and can be easily estimated using several methods, including the digital technology available today. These results are significant, especially for resource-poor countries, where vaccines are less available, have lower efficacy, and are more slowly distributed.
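As a rough illustration of the degree-based prioritization described in item 5, the sketch below ranks individuals in a toy contact network by their number of distinct contacts and assigns a limited number of doses to the highest-ranked ones. It is not the authors' agent-based model: the function name `degree_based_allocation`, the edge-list input format, and the toy network are all assumptions made for this example, and the sketch ignores total social proximity time and disease dynamics.

```python
from collections import defaultdict


def degree_based_allocation(edges, doses):
    """Return up to `doses` individuals with the most distinct contacts.

    `edges` is an iterable of (person_a, person_b) contact pairs.
    Hypothetical helper for illustration; not taken from the paper.
    """
    contacts = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-contacts
            contacts[a].add(b)
            contacts[b].add(a)
    # Highest degree first; ties are broken by identifier so the
    # result is deterministic.
    ranked = sorted(contacts, key=lambda p: (-len(contacts[p]), str(p)))
    return ranked[:doses]


# Toy contact network (made-up data).
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
print(degree_based_allocation(toy_edges, 2))
```

In this toy network, "a" has three contacts and "b", "c", and "d" have two each, so with two doses the sketch selects "a" first and then breaks the tie alphabetically.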