Title: Simulating Urban Patterns of Life: A Geo-Social Data Generation Framework
Data generators have been heavily used to create massive trajectory datasets that address common challenges of real-world datasets, including privacy, cost of data collection, and data quality. However, such generators often overlook the social and physiological characteristics of individuals, so their results are often limited to simple movement patterns. To address these shortcomings, we propose an agent-based simulation framework that facilitates the development of behavioral models in which agents correspond to individuals who act based on personal preferences, goals, and needs within a realistic geographical environment. Researchers can use a drag-and-drop interface to design and control their own world, including its geospatial and social (i.e., geo-social) properties. The framework can generate and stream very large volumes of data that capture the basic patterns of life in urban areas. Streaming data from the simulation can be accessed in real time through a dedicated API.
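To make the idea of agents acting on needs concrete, the following is a minimal Python sketch of a needs-driven agent loop. It is not the framework's implementation or API: the class `Agent`, the function `simulate`, the need names, the need-to-place mapping, and the decay rates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a need to the kind of place that satisfies it.
# These names are illustrative assumptions, not part of the framework.
DESTINATION_FOR_NEED = {
    "sleep": "home",
    "food": "restaurant",
    "money": "workplace",
    "social": "recreation_site",
}


@dataclass
class Agent:
    name: str
    # Need levels in [0, 1]; 1.0 means fully satisfied.
    needs: dict = field(
        default_factory=lambda: {n: 1.0 for n in DESTINATION_FOR_NEED}
    )
    location: str = "home"

    def decay(self, rates):
        """Lower each need by its per-step rate, clamped at 0."""
        for need, rate in rates.items():
            self.needs[need] = max(0.0, self.needs[need] - rate)

    def most_urgent_need(self):
        """Return the need with the lowest satisfaction level."""
        return min(self.needs, key=self.needs.get)

    def step(self, rates):
        """Decay needs, move to the place serving the most urgent one,
        and reset that need to fully satisfied."""
        self.decay(rates)
        need = self.most_urgent_need()
        self.location = DESTINATION_FOR_NEED[need]
        self.needs[need] = 1.0
        return need, self.location


def simulate(agent, rates, steps):
    """Return the agent's trajectory as (step, need, location) records."""
    return [(t, *agent.step(rates)) for t in range(steps)]


if __name__ == "__main__":
    # Assumed per-step decay rates, for illustration only.
    rates = {"sleep": 0.1, "food": 0.3, "money": 0.2, "social": 0.05}
    for record in simulate(Agent("agent-0"), rates, steps=5):
        print(record)
```

In this sketch each step yields one record of where the agent went and why, which is the kind of trajectory stream a data generator would emit. A realistic model would add travel time, opening hours, and interactions between agents.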
Award ID(s):
1637541 1637576
PAR ID:
10187146
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Page Range / eLocation ID:
576 to 579
Sponsoring Org:
National Science Foundation
More Like this
  1. Location-based social networks (LBSNs) have been studied extensively in recent years. However, real-world LBSN data sets have several weaknesses: they are sparse and small, raise privacy concerns, and lack authoritative ground truth. To overcome these weaknesses, we leverage a large-scale LBSN simulation to create a framework that simulates human behavior and creates synthetic but realistic LBSN data based on human patterns of life. Such data captures not only the location of users over time but also their interactions via social networks. Patterns of life are simulated by giving agents (i.e., people) an array of “needs” that they aim to satisfy, e.g., agents go home when they are tired, to restaurants when they are hungry, to work to cover their financial needs, and to recreational sites to meet friends and satisfy their social needs. While existing real-world LBSN data sets are trivially small, the proposed framework provides a source of massive LBSN benchmark data that closely mimics the real world. As such, it allows us to capture 100% of the (simulated) population without any data uncertainty, privacy-related concerns, or incompleteness. It allows researchers to see the (simulated) world through the lens of an omniscient entity having perfect data. In addition, we provide a series of simulated benchmark LBSN data sets using different synthetic towns and real-world urban environments obtained from OpenStreetMap. The simulation software and data sets, which comprise gigabytes of spatio-temporal and temporal social network data, are made available to the research community.
  2. Generating realistic geospatial vector data is important for evaluating algorithms, index structures, and systems under diverse conditions. Existing synthetic data generators typically rely on simple statistical or procedural models that fail to capture the complexity of real-world spatial patterns. This paper introduces GeoGen I, a generative framework that produces geospatial point distributions from natural language prompts. The system combines contrastive learning, region context, and a diffusion-based generator to create plausible datasets. In the experiments, we test variations of the model and provide both qualitative and quantitative evaluations. Our experiments show that it can generate spatial patterns aligned with different prompts. While the results are promising, many challenges still remain, including in dataset curation and quality, and the model’s ability to capture subtle geospatial constraints.
  3. Location-based social networks (LBSNs) have been studied extensively in recent years. However, utilizing real-world LBSN datasets in such studies has severe weaknesses: sparse and small datasets, privacy concerns, and a lack of authoritative ground-truth. Our vision is to create a large scale geo-simulation framework to simulate human behavior and to create synthetic but realistic LBSN data that captures the location of users over time as well as social interactions of users in a social network. While existing LBSN datasets are trivially small, such a framework would provide the first source of massive LBSN benchmark data which would closely mimic the real world, containing high-fidelity information of location, and social connections of millions of simulated agents over several years of simulated time. Therefore, it would serve the research community by revitalizing and reshaping research on LBSNs by allowing researchers to see the (simulated) world through the lens of an omniscient entity having perfect data. These evaluations will guide future research enabling us to develop solutions to improve LBSN applications such as user-location recommendation, friend recommendation, location prediction, and location privacy. 
  4. Large-scale driving datasets such as Waymo Open Dataset and nuScenes substantially accelerate autonomous driving research, especially for perception tasks such as 3D detection and trajectory forecasting. Since the driving logs in these datasets contain HD maps and detailed object annotations that accurately reflect the real-world complexity of traffic behaviors, we can harvest a massive number of complex traffic scenarios and recreate their digital twins in simulation. Compared to the hand-crafted scenarios often used in existing simulators, data-driven scenarios collected from the real world can facilitate many research opportunities in machine learning and autonomous driving. In this work, we present ScenarioNet, an open-source platform for large-scale traffic scenario modeling and simulation. ScenarioNet defines a unified scenario description format and collects a large-scale repository of real-world traffic scenarios from the heterogeneous data in various driving datasets including Waymo, nuScenes, Lyft L5, Argoverse, and nuPlan datasets. These scenarios can be further replayed and interacted with in multiple views from Bird-Eye-View layout to realistic 3D rendering in MetaDrive simulator. This provides a benchmark for evaluating the safety of autonomous driving stacks in simulation before their real-world deployment. We further demonstrate the strengths of ScenarioNet on large-scale scenario generation, imitation learning, and reinforcement learning in both single-agent and multi-agent settings. Code, demo videos, and website are available at https://metadriverse.github.io/scenarionet.
  5. Stress affects physical and mental health, and wearable devices have been widely used to detect daily stress through physiological signals. However, these signals vary due to factors such as individual differences and health conditions, making it difficult to generalize machine learning models. To address these challenges, we present Human Heterogeneity Invariant Stress Sensing (HHISS), a domain generalization approach designed to find consistent patterns in stress signals by removing person-specific differences. This helps the model perform more accurately across new people, environments, and stress types not seen during training. Its novelty lies in a technique called person-wise sub-network pruning intersection, which focuses on shared features across individuals, alongside preventing overfitting by leveraging continuous labels while training. The present study focuses on people with opioid use disorder (OUD), a group where stress responses can change dramatically depending on the presence of opioids in their system, including daily timed medication for OUD (MOUD). Since stress often triggers cravings, a model that can adapt well to these changes could support better OUD rehabilitation and recovery. We tested HHISS on seven different stress datasets: four of which we collected ourselves and three public datasets. Four are from lab setups, one from a controlled real-world driving setting, and two are from real-world in-the-wild field datasets with no constraints. The present study is the first known to evaluate how well a stress detection model works across such a wide range of data. Results show HHISS consistently outperformed state-of-the-art baseline methods, proving both effective and practical for real-world use. Ablation studies, empirical justifications, and runtime evaluations confirm HHISS's feasibility and scalability for mobile stress sensing in sensitive real-world applications.