NSF PAR Search | NSF Public Access Repository

Synthetic population generation with public health characteristics for spatial agent-based models

Von Hoene, E; Roess, A; Kavak, H; Anderson, T (March 2025, PLOS Computational Biology)

Free, publicly-accessible full text available March 21, 2026
Commuting flow prediction using OpenStreetMap data

Singh Atwal, Kuldip; Anderson, T; Pfoser, D; Zufle, A (January 2025, Computational Urban Science)

Full Text Available
An Infectious Disease Spread Simulation to Control Data Bias

https://doi.org/10.1145/3678717.3691293

Kong, Ruochen; Anderson, Taylor; Heslop, David; Zufle, Andreas (October 2024, ACM)

The increased availability of datasets during the COVID-19 pandemic enabled machine-learning approaches for modeling and forecasting infectious diseases. However, such approaches are known to amplify the bias in the data they are trained on. Bias in input data such as clinical case data for COVID-19 is difficult to measure due to disparities in testing availability, reporting standards, and healthcare access among different populations and regions. Furthermore, the way such biases may propagate through the modeling pipeline to decision-making is relatively unknown. Therefore, we present a system that leverages a highly detailed agent-based model (ABM) of infectious disease spread in a city to simulate the collection of biased clinical case data where the bias is known. Our system allows users to load a pre-selected region or select their own (using OpenStreetMap data for the environment and census data for the population), to specify population and infectious disease parameters, and to set the degree(s) to which different populations will be overrepresented or underrepresented in the case data. In addition to the system, we provide a large number of benchmark datasets that produce case data at different levels of bias for different regions. We hope that infectious disease modelers will use these datasets to investigate how robust their models are to data bias and whether their models are overfit to biased data.
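The idea of case data with a known bias can be illustrated with a minimal sketch. This is not the authors' system: the function name `sample_reported_cases`, the group labels, and the reporting probabilities are all hypothetical, and the "ground truth" here is a made-up list rather than the output of an agent-based model. Each true infection is kept in the case data with a probability that depends on the population group, so the degree of underrepresentation is known by construction.

```python
import random
from collections import Counter

def sample_reported_cases(infections, report_prob, seed=0):
    """Return the subset of true infections that end up in the case data.

    infections  : list of (agent_id, group) tuples (simulated ground truth)
    report_prob : dict mapping group -> probability an infection is reported
    Both structures are illustrative assumptions, not the paper's data format.
    """
    rng = random.Random(seed)
    return [(agent, group) for agent, group in infections
            if rng.random() < report_prob[group]]

# Hypothetical ground truth: 1000 infections in each of two groups.
infections = [(i, "A") for i in range(1000)] + [(i, "B") for i in range(1000, 2000)]

# Group B is underrepresented to a known, user-chosen degree.
reported = sample_reported_cases(infections, {"A": 0.9, "B": 0.3})
counts = Counter(group for _, group in reported)
print(counts)
```

Because the reporting probabilities are chosen by the user, a model trained on `reported` can be compared against the unbiased `infections` list to see how much of the bias it reproduces.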
Full Text Available
Urban Anomalies: A Simulated Human Mobility Dataset with Injected Anomalies

https://doi.org/10.1145/3681765.3698459

Amiri, Hossein; Kong, Ruochen; Züfle, Andreas (October 2024, ACM)

Human mobility anomaly detection based on location is essential in areas such as public health, safety, welfare, and urban planning. Developing models and approaches for location-based anomaly detection requires a comprehensive dataset. However, privacy concerns and the absence of ground truth hinder the availability of public datasets. With this paper, we provide extensive simulated human mobility datasets featuring various anomaly types created using an existing Urban Patterns of Life Simulation. To create these datasets, we inject changes in the logic of individual agents to change their behavior. Specifically, we create four types of anomalous agent behavior by (1) changing the agents’ appetite (causing agents to have meals more frequently), (2) changing their group of interest (causing agents to interact with different agents from another group), (3) changing their social place selection (causing agents to visit different recreational places), and (4) changing their work schedule (causing agents to skip work). For each type of anomaly, we use three degrees of behavioral change to tune the difficulty of detecting the anomalous agents. To select agents to inject anomalous behavior into, we employ three methods: (1) random selection using a centralized manipulation mechanism, (2) spread-based selection using an infectious disease model, and (3) selection through exposure of agents to a specific location. All datasets are split into normal and anomalous phases. The normal phase, which can be used for training models of normalcy, exhibits no anomalous behavior. The anomalous phase, which can be used for testing anomaly detection algorithms, includes ground truth labels that indicate, for each five-minute simulation step, which agents are anomalous at that time. Datasets are generated using the maps (roads and buildings) for Atlanta and Berlin, with 1k agents in each simulation. All datasets are openly available at https://osf.io/dg6t3/. Additionally, we provide instructions to regenerate the data for other locations and numbers of agents.
Full Text Available
Large Language Models for Spatial Trajectory Patterns Mining

https://doi.org/10.1145/3681765.3698467

Zhang, Zheng; Amiri, Hossein; Liu, Zhenke; Zhao, Liang; Zuefle, Andreas (October 2024, ACM)

Identifying anomalous human spatial trajectory patterns can indicate dynamic changes in mobility behavior with applications in domains like infectious disease monitoring and elderly care. Recent advancements in large language models (LLMs) have demonstrated their ability to reason in a manner akin to humans. This presents significant potential for analyzing temporal patterns in human mobility. In this paper, we conduct empirical studies to assess the capabilities of leading LLMs like GPT-4 and Claude-2 in detecting anomalous behaviors from mobility data by comparing them to specialized methods. Our key findings demonstrate that LLMs can attain reasonable anomaly detection performance even without any specific cues. In addition, providing contextual clues about potential irregularities can further enhance their prediction efficacy. Moreover, LLMs can provide reasonable explanations for their judgments, thereby improving transparency. Our work provides insights on the strengths and limitations of LLMs for human spatial trajectory analysis.
Full Text Available
Transferable Unsupervised Outlier Detection Framework for Human Semantic Trajectories

https://doi.org/10.1145/3678717.3691324

Zhang, Zheng; Amiri, Hossein; Yu, Dazhou; Hu, Yuntong; Zhao, Liang; Züfle, Andreas (October 2024, ACM)

Full Text Available
GeoLife+: Large-Scale Simulated Trajectory Datasets Calibrated to the GeoLife Dataset

https://doi.org/10.1145/3681770.3698573

Amiri, Hossein; Yang, Richard; Züfle, Andreas (October 2024, ACM)

Analyzing individual human trajectory data helps our understanding of human mobility and finds many commercial and academic applications. There are two main approaches to accessing trajectory data for research: one involves using real-world datasets like GeoLife, while the other employs simulations to synthesize data. Real-world data provides insights from real human activities, but such data is generally sparse due to voluntary participation. Conversely, simulated data can be more comprehensive but may capture unrealistic human behavior. In this Data and Resource paper, we combine the benefits of both by leveraging the statistical features of real-world data and the comprehensiveness of simulated data. Specifically, we extract features from the real-world GeoLife dataset such as the average number of individual daily trips, average radius of gyration, and maximum and minimum trip distances. We calibrate the Pattern of Life Simulation, a realistic simulation of human mobility, to reproduce these features. To do so, we use a genetic algorithm to calibrate the parameters of the simulation to mimic the GeoLife features. For this calibration, we simulated numerous random simulation settings, measured the similarity of generated trajectories to GeoLife, and iteratively (over many generations) combined parameter settings of trajectory datasets most similar to GeoLife. Using the calibrated simulation, we simulate large trajectory datasets that we call GeoLife+, where + denotes the Kleene Plus, indicating unlimited replication with at least one occurrence. We provide simulated GeoLife+ data with 182, 1k, and 5k users over 5 years, 10k and 50k users over one year, and 100k users over 6 months of simulation lifetime.
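The calibration loop described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_simulation` is a hypothetical stand-in that maps two parameters in [0, 1] directly to two features (trips per day, radius of gyration), whereas the real pipeline would run the simulation and measure the features from its trajectories. The population size, mutation scale, and target values are likewise made up. The radius of gyration is computed on planar coordinates as the root-mean-square distance from the centroid.

```python
import math
import random

def radius_of_gyration(points):
    """Root-mean-square distance of visited (x, y) points from their centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points))

def toy_simulation(params):
    """Hypothetical stand-in for a mobility simulation: parameters -> features."""
    a, b = params
    return (1.0 + 4.0 * a, 10.0 * b * b)  # (trips per day, radius of gyration)

def feature_error(params, target):
    """Euclidean distance between simulated features and the target features."""
    return math.dist(toy_simulation(params), target)

def calibrate(target, pop_size=30, generations=40, seed=0):
    """Minimal genetic algorithm: keep the best settings, recombine, mutate."""
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda p: feature_error(p, target))
        history.append(feature_error(population[0], target))
        parents = population[: pop_size // 3]  # settings most similar to the target
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover plus Gaussian mutation, clipped to [0, 1].
            child = tuple(
                min(1.0, max(0.0, rng.choice(pair) + rng.gauss(0.0, 0.05)))
                for pair in zip(mother, father)
            )
            children.append(child)
        population = parents + children  # parents survive unchanged (elitism)
    best = min(population, key=lambda p: feature_error(p, target))
    return best, history

# Made-up target features standing in for statistics extracted from real data.
target = (3.0, 2.5)
best, history = calibrate(target)
print(best, feature_error(best, target))
```

Keeping the parents unchanged in each generation means the best error never gets worse from one generation to the next, which makes the convergence of the sketch easy to inspect through `history`.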
Full Text Available
Human Mobility Challenge: Are Transformers Effective for Human Mobility Prediction?

https://doi.org/10.1145/3681771.3700130

Kong, Ruochen; Amiri, Hossein; Liu, Yueyang; Kennedy, Lance; Gupta, Misha; Kim, Joon-Seok; Züfle, Andreas (October 2024, ACM)

Transformer-based models are popular for time series forecasting and spatiotemporal prediction due to their ability to infer semantic correlations in long sequences. However, for human mobility prediction, temporal correlations, such as location patterns at the same time on previous days or weeks, are essential. While positional encodings help retain order, the self-attention mechanism causes a loss of temporal detail. To validate this claim, we used a simple approach in the 2nd ACM SIGSPATIAL Human Mobility Prediction Challenge, predicting locations based on past patterns weighted by reliability scores for missing data. Our simple approach was among the top 10 competitors and significantly outperformed the Transformer-based model that won the 2023 challenge.
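As a rough illustration of predicting a location from past patterns at the same time of day, the sketch below uses a weighted vote over earlier observations in the same time slot. It is an assumption-laden toy, not the authors' challenge entry: the function name `predict_location`, the `(day, slot) -> location` data layout, and the rule that same-weekday observations count double are all hypothetical, and the abstract's reliability scores for missing data are reduced here to simply skipping slots with no observation.

```python
from collections import defaultdict

def predict_location(history, day, slot, same_weekday_weight=2.0):
    """Predict the location for (day, slot) by a weighted vote over the past.

    history : dict mapping (day, slot) -> location; missing observations are
              simply absent from the dict (illustrative layout).
    Observations from the same time slot on earlier days vote for their
    location; days exactly k weeks earlier get a higher (assumed) weight.
    """
    votes = defaultdict(float)
    for (past_day, past_slot), location in history.items():
        if past_slot != slot or past_day >= day:
            continue  # only earlier days, same time slot
        weight = same_weekday_weight if (day - past_day) % 7 == 0 else 1.0
        votes[location] += weight
    if not votes:
        return None  # no usable history for this slot
    return max(votes, key=votes.get)

# Hypothetical observations for slot 18 (e.g. 18:00) on four earlier days.
history = {(0, 18): "home", (1, 18): "gym", (2, 18): "gym", (7, 18): "home"}
prediction = predict_location(history, day=14, slot=18)
print(prediction)  # "home": two same-weekday votes (weight 4.0) beat two others (2.0)
```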
Full Text Available
In Silico Human Mobility Data Science: Leveraging Massive Simulated Mobility Data (Vision Paper)

https://doi.org/10.1145/3672557

Züfle, Andreas; Pfoser, Dieter; Wenk, Carola; Crooks, Andrew; Kavak, Hamdi; Anderson, Taylor; Kim, Joon-Seok; Holt, Nathan; Diantonio, Andrew (June 2024, ACM Transactions on Spatial Algorithms and Systems)

Human mobility data science using trajectories or check-ins of individuals has many applications. Recently, we have seen a plethora of research efforts that tackle these applications. However, research progress in this field is limited by a lack of large and representative datasets. The largest and most commonly used dataset of individual human trajectories captures fewer than 200 individuals, while datasets of individual human check-ins capture fewer than 100 check-ins per city per day. Thus, it is not clear if findings from the human mobility data science community would generalize to large populations. Since massive, representative, individual-level human mobility data is hard to come by due to privacy considerations, the vision of this work is to embrace the use of data generated by large-scale socially realistic microsimulations. Informed by real data and by social and behavioral theories, massive spatially explicit microsimulations may allow us to simulate entire megacities at the person level. The simulated worlds, which do not capture any identifiable personal information, allow us to perform “in silico” experiments using the simulated world as a sandbox in which we have perfect information and perfect control without jeopardizing the privacy of any actual individual. In silico experiments have become commonplace in other scientific domains such as chemistry and biology, permitting experiments that foster the understanding of concepts without any harm to individuals. This work describes challenges and opportunities for leveraging massive and realistic simulated alternate worlds for in silico human mobility data science.
Full Text Available
Vaccine Attitudes and Uptake Among Latino Residents of a Former COVID-19 Hotspot

https://doi.org/10.1353/hpu.2024.a919821

Cleaveland, Carol; Anderson, Taylor; McNally, Kimberly; Roess, Amira A. (February 2024, Journal of Health Care for the Poor and Underserved)

Full Text Available
