

Title: Spatially weighted structural similarity index: a multiscale comparison tool for diverse sources of mobility data
Data collected about routine human activity and mobility is used in diverse applications to improve our society. Robust models are needed to address the challenges of our increasingly interconnected world. Methods capable of portraying the dynamic properties of complex human systems, such as simulation modeling, must comply with rigorous data requirements. Modern data sources, like SafeGraph, provide aggregate data collected from location-aware technologies. Incorporating the new data into existing analysis and modeling methods presents both opportunities and challenges. Our research employs a multiscale spatial similarity index to compare diverse origin-destination mobility datasets. Established distance ranges accommodate spatial variability in the model’s datasets. This paper explores how similarity scores change with different aggregations to address discrepancies in the source data’s temporal granularity. We suggest possible explanations for variations in the similarity scores and extract characteristics of human mobility for the study area. The multiscale spatial similarity index may be integrated into a vast array of analysis and modeling workflows, either during preliminary analysis or later evaluation phases as a method of data validation (e.g., agent-based models). We propose that the demonstrated tool has potential to enhance mobility modeling methods in the context of complex human systems.
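The abstract does not give the index's formula, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes the standard structural similarity (SSIM) formula with its conventional constants (0.01 and 0.03), computed globally over the origin-destination pairs that fall inside each distance range. The function names `ssim` and `band_ssim`, the toy flow matrices, the distance matrix, and the two distance bands are all hypothetical.

```python
from statistics import fmean


def ssim(x, y, k1=0.01, k2=0.03):
    """Global SSIM between two equal-length lists of flow counts."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    # Stabilising constants scale with the largest observed flow (assumption).
    dynamic_range = max(max(x), max(y)) or 1.0
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )


def band_ssim(flows_a, flows_b, dist, bands):
    """Score two OD matrices separately within each [lo, hi) distance band."""
    scores = {}
    for lo, hi in bands:
        xs, ys = [], []
        for i, row in enumerate(dist):
            for j, d in enumerate(row):
                if lo <= d < hi:
                    xs.append(float(flows_a[i][j]))
                    ys.append(float(flows_b[i][j]))
        # A band with no OD pairs has no defined score.
        scores[(lo, hi)] = ssim(xs, ys) if xs else None
    return scores


# Hypothetical toy data: two 3-zone OD matrices and zone-to-zone distances.
flows_a = [[0, 12, 3], [10, 0, 4], [2, 5, 0]]
flows_b = [[0, 9, 6], [11, 0, 1], [4, 2, 0]]
dist = [[0, 2, 8], [2, 0, 5], [8, 5, 0]]
bands = [(0, 3), (3, 10)]

for band, score in band_ssim(flows_a, flows_b, dist, bands).items():
    print(band, score)
```

Scoring each band separately is one simple way to see whether two datasets agree on short trips but diverge on long ones. A spatially weighted variant would instead weight neighbouring zones inside a local window.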
Award ID(s):
2031407
NSF-PAR ID:
10381900
Author(s) / Creator(s):
; ;
Editor(s):
Ossi, Federico; Hachem, Fatima; Robira, Benjamin; Ellis Soto, Diego; Rutz, Christian; Dodge, Somayeh; Cagnacci, Francesca; Damiani, Maria Luisa
Date Published:
Journal Name:
Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Animal Movement Ecology and Human Mobility
Page Range / eLocation ID:
19 to 22
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    The use of systems science methodologies to understand complex environmental and human health relationships is increasing. Requirements for advanced datasets, models, and expertise limit current application of these approaches by many environmental and public health practitioners.

    Methods

    A conceptual system-of-systems model was applied for children in North Carolina counties that includes example indicators of children’s physical environment (home age, Brownfield sites, Superfund sites), social environment (caregiver’s income, education, insurance), and health (low birthweight, asthma, blood lead levels). The web-based Toxicological Prioritization Index (ToxPi) tool was used to normalize the data, rank the resulting vulnerability index, and visualize impacts from each indicator in a county. Hierarchical clustering was used to sort the 100 North Carolina counties into groups based on similar ToxPi model results. The ToxPi charts for each county were also superimposed over a map of percentage county population under age 5 to visualize spatial distribution of vulnerability clusters across the state.

    Results

    Data driven clustering for this systems model suggests 5 groups of counties. One group includes 6 counties with the highest vulnerability scores showing strong influences from all three categories of indicators (social environment, physical environment, and health). A second group contains 15 counties with high vulnerability scores driven by strong influences from home age in the physical environment and poverty in the social environment. A third group is driven by data on Superfund sites in the physical environment.

    Conclusions

    This analysis demonstrated how systems science principles can be used to synthesize holistic insights for decision making using publicly available data and computational tools, focusing on a children’s environmental health example. Where more traditional reductionist approaches can elucidate individual relationships between environmental variables and health, the study of collective, system-wide interactions can enable insights into the factors that contribute to regional vulnerabilities and interventions that better address complex real-world conditions.

     
  2. Abstract

    Human mobility analysis plays a crucial role in urban analysis, city planning, epidemic modeling, and even understanding neighborhood effects on individuals’ health. Often, these studies model human mobility in the form of co-location networks. We have recently seen the tremendous success of network representation learning models on several machine learning tasks on graphs. To the best of our knowledge, limited attention has been paid to identifying communities using network representation learning methods specifically for co-location networks. We attempt to address this problem and study user mobility behavior through the communities identified with latent node representations. Specifically, we select several diverse network representation learning models to identify communities from a real-world co-location network. We include both general-purpose representation models that make no assumptions on network modality and approaches designed specifically for human mobility analysis. We evaluate these different methods on data collected in the Adolescent Health and Development in Context study. Our experimental analysis reveals that a recently proposed method (LocationTrails) offers a competitive advantage over other methods with respect to its ability to represent and reflect community assignment that is consistent with extant findings regarding neighborhood racial and socio-economic differences in mobility patterns. We also compare the learned activity profiles of individuals by factoring in their residential neighborhoods. Our analysis reveals a significant contrast in the activity profiles of individuals residing in white-dominated versus black-dominated neighborhoods and advantaged versus disadvantaged neighborhoods in a major metropolitan city of the United States. We provide a clear rationale for this contrastive pattern through insights from the sociological literature.

     
  3. Background

    Human movement is one of the forces that drive the spatial spread of infectious diseases. To date, reducing and tracking human movement during the COVID-19 pandemic has proven effective in limiting the spread of the virus. Existing methods for monitoring and modeling the spatial spread of infectious diseases rely on various data sources as proxies of human movement, such as airline travel data, mobile phone data, and banknote tracking. However, intrinsic limitations of these data sources prevent us from systematic monitoring and analyses of human movement on different spatial scales (from local to global).

    Objective

    Big data from social media such as geotagged tweets have been widely used in human mobility studies, yet more research is needed to validate the capabilities and limitations of using such data for studying human movement at different geographic scales (eg, from local to global) in the context of global infectious disease transmission. This study aims to develop a novel data-driven public health approach using big data from Twitter coupled with other human mobility data sources and artificial intelligence to monitor and analyze human movement at different spatial scales (from global to regional to local).

    Methods

    We will first develop a database with optimized spatiotemporal indexing to store and manage the multisource data sets collected in this project. This database will be connected to our in-house Hadoop computing cluster for efficient big data computing and analytics. We will then develop innovative data models, predictive models, and computing algorithms to effectively extract and analyze human movement patterns using geotagged big data from Twitter and other human mobility data sources, with the goal of enhancing situational awareness and risk prediction in public health emergency response and disease surveillance systems.

    Results

    This project was funded as of May 2020. We have started the data collection, processing, and analysis for the project.

    Conclusions

    Research findings can help government officials, public health managers, emergency responders, and researchers answer critical questions during the pandemic regarding the current and future infectious risk of a state, county, or community and the effectiveness of social/physical distancing practices in curtailing the spread of the virus.

    International Registered Report Identifier (IRRID)

    DERR1-10.2196/24432
  4. Abstract

    Policy interest in socio‐ecological systems has driven attempts to define and map socio‐ecological zones (SEZs), that is, spatial regions distinguishable by their conjoined social and bio‐geo‐physical characteristics. The state of Idaho, USA, has a strong need for SEZ designations because of potential conflicts between rapidly increasing and impactful human populations and proximal natural ecosystems. Our Idaho SEZs address analytical shortcomings in previously published SEZs by: (1) considering potential biases of clustering methods, (2) cross‐validating SEZ classifications, (3) measuring the relative importance of bio‐geo‐physical and social system predictors, and (4) considering spatial autocorrelation. We obtained authoritative bio‐geo‐physical and social system datasets for Idaho, aggregated into 5‐km grids (25 km²), and decomposed these using principal components analyses (PCAs). PCA scores were classified using two clustering techniques commonly used in SEZ mapping: hierarchical clustering with Ward's linkage, and k‐means analysis. Classification evaluators indicated that eight‐ and five‐cluster solutions were optimal for the bio‐geo‐physical and social datasets for Ward's linkage, resulting in 31 SEZ composite types, and six‐ and five‐cluster solutions were optimal for k‐means analysis, resulting in 24 SEZ composite types. Ward's and k‐means solutions were similar for bio‐geo‐physical and social classifications with similar numbers of clusters. Further, both classifiers identified the same dominant SEZ composites. For rarer SEZs, however, classification methods strongly affected SEZ classifications, potentially altering land management perspectives. Our SEZs identify several critical regions of social–ecological overlap. These include suburban interface types and a high desert transition zone. Based on multinomial generalized linear models, bio‐geo‐physical information explained more variation in SEZs than social system data, after controlling for spatial autocorrelation, under both Ward's and k‐means approaches. Agreement (cross‐validation) levels were high for multinomial models with bio‐geo‐physical and social predictors for both Ward's and k‐means SEZs. A consideration of historical drivers, including indigenous social systems, and current trajectories of land and resource management in Idaho, indicates a strong need for rigorous SEZ designations to guide development and conservation in the region. Our analytical framework can be applied broadly in SES research and in other regions, when categorical responses, including cluster designations, have a spatial component.

     
  5. Abstract

    During the 21st century, human–environment interactions will increasingly expose both systems to risks, but also yield opportunities for improvement as we gain insight into these complex, coupled systems. Human–environment interactions operate over multiple spatial and temporal scales, requiring large data volumes of multi‐resolution information for analysis. Climate change, land‐use change, urbanization, and wildfires, for example, can affect regions differently depending on ecological and socioeconomic structures. The relative scarcity of data on both humans and natural systems at the relevant extent can be prohibitive when pursuing inquiries into these complex relationships. We explore the value of multitemporal, high‐density, and high‐resolution LiDAR, imaging spectroscopy, and digital camera data from the National Ecological Observatory Network’s Airborne Observation Platform (NEON AOP) for Socio‐Environmental Systems (SES) research. In addition to providing an overview of NEON AOP datasets and outlining specific applications for addressing SES questions, we highlight current challenges and provide recommendations for the SES research community to improve and expand its use of this platform for SES research. The coordinated, nationwide AOP remote sensing data, collected annually over the next 30 yr, offer exciting opportunities for cross‐site analyses and comparison, upscaling metrics derived from LiDAR and hyperspectral datasets across larger spatial extents, and addressing questions across diverse scales. Integrating AOP data with other SES datasets will allow researchers to investigate complex systems and provide urgently needed policy recommendations for socio‐environmental challenges. We urge the SES research community to further explore questions and theories in social and economic disciplines that might leverage NEON AOP data.

     