Title: Pruning digital contact networks for meso-scale epidemic surveillance using Foursquare data
With recent advances in human sensing, the push to integrate human mobility tracking with epidemic modeling highlights the lack of groundwork at the mesoscale (e.g., city level) for both contact tracing and transmission dynamics. Although GPS data have been used to study city-level outbreaks in the past, existing approaches fail to capture the path of infection at the individual level. Consequently, in this paper, we extend epidemic prediction from estimating the size of an outbreak at the population level to identifying the individuals who are likely to become infected within a finite period of time. To this end, we propose a network-science-based method to first build and then prune dynamic contact networks for recurring interactions; these networks can serve as the backbone topology for mechanistic epidemic modeling. We test our method using Foursquare’s Points of Interest (POI) smartphone geolocation data from over 1.3 million devices to better approximate the COVID-19 infection curves for two major (yet very different) US cities, Austin and New York City, while maintaining the granularity of individual transmissions and reducing model uncertainty. Our method provides a foundation for building a disease prediction framework at the mesoscale that can help both policy makers and individuals better understand their estimated state of health and support pandemic mitigation efforts.
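The build-then-prune idea in the abstract can be illustrated with a small sketch. This is not the authors' method: the visit schema `(device_id, poi_id, day, hour)`, the co-location rule (same POI in the same hour), and the `min_days` recurrence threshold are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_contact_days(visits):
    """Map each device pair to the set of days on which they were co-located.

    `visits` is an iterable of (device_id, poi_id, day, hour) tuples.
    This schema is hypothetical, not the Foursquare data format.
    """
    # Group devices by the place and time slot they share.
    present = defaultdict(set)
    for device, poi, day, hour in visits:
        present[(poi, day, hour)].add(device)

    # Every pair of devices in the same slot counts as one contact on that day.
    contact_days = defaultdict(set)
    for (_poi, day, _hour), devices in present.items():
        for a, b in combinations(sorted(devices), 2):
            contact_days[(a, b)].add(day)
    return contact_days


def prune_recurring(contact_days, min_days=2):
    """Keep only pairs seen together on at least `min_days` distinct days.

    The threshold is an assumed stand-in for a recurrence criterion.
    """
    return {
        pair: len(days)
        for pair, days in contact_days.items()
        if len(days) >= min_days
    }


# Toy data: d1 and d2 meet at the same cafe on two different days.
visits = [
    ("d1", "cafe", 1, 9), ("d2", "cafe", 1, 9), ("d3", "cafe", 1, 9),
    ("d1", "cafe", 2, 9), ("d2", "cafe", 2, 9),
    ("d3", "gym", 2, 18), ("d4", "gym", 2, 18),
]
edges = prune_recurring(build_contact_days(visits), min_days=2)
print(edges)  # {('d1', 'd2'): 2}
```

In this toy example only the d1–d2 edge survives, because every other pair is co-located on a single day. A pruned edge set of this kind is what a mechanistic epidemic model could then use as its contact topology.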
Award ID(s):
2107085
NSF-PAR ID:
10356238
Journal Name:
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '21)
Page Range / eLocation ID:
423 to 430
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Moreno, Yamir (Ed.)
    Testing, contact tracing, and isolation (TTI) is an epidemic management and control approach that is difficult to implement at scale because it relies on manual tracing of contacts. Exposure notification apps have been developed to digitally scale up TTI by harnessing contact data obtained from mobile devices; however, exposure notification apps provide users only with limited binary information when they have been directly exposed to a known infection source. Here we demonstrate a scalable improvement to TTI and exposure notification apps that uses data assimilation (DA) on a contact network. Network DA exploits diverse sources of health data together with the proximity data from mobile devices that exposure notification apps rely upon. It provides users with continuously assessed individual risks of exposure and infection, which can form the basis for targeting individual contact interventions. Simulations of the early COVID-19 epidemic in New York City are used to establish proof-of-concept. In the simulations, network DA identifies up to a factor of 2 more infections than contact tracing when both harness the same contact data and diagnostic test data. This remains true even when only a relatively small fraction of the population uses network DA. When a sufficiently large fraction of the population (≳ 75%) uses network DA and complies with individual contact interventions, targeting contact interventions with network DA reduces deaths by up to a factor of 4 relative to TTI. Network DA can be implemented by expanding the computational backend of existing exposure notification apps, thus greatly enhancing their capabilities. Implemented at scale, it has the potential to precisely and effectively control future epidemics while minimizing economic disruption.
  2. Abstract
    The prevalence of mobile phones and wearable devices enables the passive capturing and modeling of human behavior at an unprecedented resolution and scale. Past research has demonstrated the capability of mobile sensing to model aspects of physical health, mental health, education, and work performance, etc. However, most of the algorithms and models proposed in previous work follow a one-size-fits-all (i.e., population modeling) approach that looks for common behaviors amongst all users, disregarding the fact that individuals can behave very differently, resulting in reduced model performance. Further, black-box models are often used that do not allow for interpretability and human behavior understanding. We present a new method to address the problems of personalized behavior classification and interpretability, and apply it to depression detection among college students. Inspired by the idea of collaborative-filtering, our method is a type of memory-based learning algorithm. It leverages the relevance of mobile-sensed behavior features among individuals to calculate personalized relevance weights, which are used to impute missing data and select features according to a specific modeling goal (e.g., whether the student has depressive symptoms) in different time epochs, i.e., times of the day and days of the week. It then compiles features from epochs using majority voting to obtain the final prediction. We apply our algorithm on a depression detection dataset collected from first-year college students with low data-missing rates and show that our method outperforms the state-of-the-art machine learning model by 5.1% in accuracy and 5.5% in F1 score. We further verify the pipeline-level generalizability of our approach by achieving similar results on a second dataset, with an average improvement of 3.4% across performance metrics. 
    Beyond achieving better classification performance, our novel approach is further able to generate personalized interpretations of the models for each individual. These interpretations are supported by existing depression-related literature and can potentially inspire automated and personalized depression intervention design in the future.
  3. Abstract

    Network analysis of infectious disease in wildlife can reveal traits or individuals critical to pathogen transmission and help inform disease management strategies. However, estimates of contact between animals are notoriously difficult to acquire. Researchers commonly use telemetry technologies to identify animal associations, but such data may have different sampling intervals and often capture only a small subset of the population. The objectives of this study were to outline best practices for telemetry sampling in network studies of infectious disease by determining (a) the consequences of telemetry sampling on our ability to estimate network structure, (b) whether contact networks can be approximated using purely spatial contact definitions and (c) how wildlife spatial configurations may influence telemetry sampling requirements.

    We simulated individual movement trajectories for wildlife populations using a home range‐like movement model, creating full location datasets and corresponding ‘complete’ networks. To mimic telemetry data, we created ‘sample’ networks by subsampling the population (10%–100% of individuals) with a range of sampling intervals (every minute to every 3 days). We varied the definition of contact for sample networks, using either spatiotemporal or spatial overlap, and varied the spatial configuration of populations (random, lattice or clustered). To compare complete and sample networks, we calculated seven network metrics important for disease transmission and assessed mean ranked correlation coefficients and percent error between complete and sample network metrics.

    Telemetry sampling severely reduced our ability to calculate global node‐level network metrics, but had less impact on local and network‐level metrics. Even so, in populations with infrequent associations, high intensity telemetry sampling may still be necessary. Defining contact in terms of spatial overlap generally resulted in overly connected networks, but in some instances, could compensate for otherwise coarse telemetry data.

    By synthesizing movement and disease ecology with computational approaches, we characterized trade‐offs important for using wildlife telemetry data beyond ecological studies of individual movement, and found that careful use of telemetry data has the potential to inform network models. Thus, with informed application of telemetry data, we can make significant advances in leveraging its use for a better understanding and management of wildlife infectious disease.

     
    more » « less
  4. Abstract

    Mass testing is essential for identifying infected individuals during an epidemic and allowing healthy individuals to return to normal social activities. However, testing capacity is often insufficient to meet global health needs, especially during newly emerging epidemics. Dorfman’s method, a classic group testing technique, helps reduce the number of tests required by pooling the samples of multiple individuals into a single sample for analysis. Dorfman’s method does not consider the time dynamics or limits on testing capacity involved in infection detection, and it assumes that individuals are infected independently, ignoring community correlations. To address these limitations, we present an adaptive group testing (AGT) strategy based on graph partitioning, which divides a physical contact network into subgraphs (groups of individuals) and assigns testing priorities based on the social contact characteristics of each subgraph. Our AGT aims to maximize the number of infected individuals detected and minimize the number of tests required. After each testing round (perhaps on a daily basis), the testing priority is increased for each neighboring group of known infected individuals. We also present an enhanced infectious disease transmission model that simulates the dynamic spread of a pathogen and evaluate our AGT strategy using the simulation results. When applied to 13 social contact networks, AGT demonstrates significant performance improvements compared to Dorfman’s method and its variations. Our AGT strategy requires fewer tests overall, reduces disease spread, and retains robustness under changes in group size, testing capacity, and other parameters. Testing plays a crucial role in containing and mitigating pandemics by identifying infected individuals and helping to prevent further transmission in families and communities. 
    By identifying infected individuals and helping to prevent further transmission in families and communities, our AGT strategy can have significant implications for public health, providing guidance for policymakers trying to balance economic activity with the need to manage the spread of infection.

     
    more » « less
  5. Abstract Background

    Targeted surveillance allows public health authorities to implement testing and isolation strategies when diagnostic resources are limited, and can be implemented via the consideration of social network topologies. However, it remains unclear how to implement such surveillance and control when network data are unavailable.

    Methods

    We evaluated the ability of sociodemographic proxies of degree centrality to guide prioritized testing of infected individuals compared to known degree centrality. Proxies were estimated via readily available sociodemographic variables (age, gender, marital status, educational attainment, household size). We simulated severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemics via a susceptible-exposed-infected-recovered individual-based model on 2 contact networks from rural Madagascar to test applicability of these findings to low-resource contexts.

    Results

    Targeted testing using sociodemographic proxies performed similarly to targeted testing using known degree centralities. At low testing capacity, using proxies reduced infection burden by 22%–33% while using 20% fewer tests, compared to random testing. By comparison, using known degree centrality reduced the infection burden by 31%–44% while using 26%–29% fewer tests.

    Conclusions

    We demonstrate that incorporating social network information into epidemic control strategies is an effective countermeasure to low testing capacity and can be implemented via sociodemographic proxies when social network data are unavailable.

     
    more » « less
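As a worked illustration of the Dorfman baseline mentioned in item 4 above: under the classic assumptions that the item criticizes (each individual infected independently with the same prevalence `p`, and a perfectly accurate test), pooling `k` samples costs one pooled test plus `k` individual retests whenever the pool is positive. The expected number of tests per person is therefore `1/k + 1 - (1 - p)**k`. The function names and the `max_k` search bound below are hypothetical, and this sketch covers only the baseline, not the adaptive group testing strategy itself.

```python
def dorfman_tests_per_person(p, k):
    """Expected tests per person for Dorfman pooling with pool size k.

    Assumes independent infections with prevalence p and an error-free test.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("prevalence p must lie in [0, 1]")
    if k < 2:
        raise ValueError("pool size k must be at least 2")
    # One pooled test shared by k people, plus k retests if the pool is positive.
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_pool_size(p, max_k=100):
    """Pool size in [2, max_k] that minimises expected tests per person."""
    return min(range(2, max_k + 1), key=lambda k: dorfman_tests_per_person(p, k))


k_star = best_pool_size(0.01)
print(k_star, round(dorfman_tests_per_person(0.01, k_star), 4))
```

At 1% prevalence the minimum falls at a pool size of 11, with roughly 0.2 tests per person. The savings shrink quickly as prevalence rises or as infections cluster within contact groups, which is the gap that network-aware strategies aim to address.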