This content will become publicly available on February 13, 2026

Title: End-to-end Trajectory Generation - Contrasting Deep Generative Models and Language Models
Because real large-scale trajectory datasets are scarce, realistic synthetic trajectory data play a crucial role in various research domains, including spatiotemporal data mining and data management, as well as domain-driven research related to transportation planning and urban analytics. Existing generation methods rely on predefined heuristics and cannot learn the unknown underlying generative mechanisms. This work introduces two end-to-end approaches for trajectory generation. The first approach comprises deep generative VAE-like models that factorize global and local semantics (habits vs. random routing changes). We further enhance this approach by developing novel inference strategies based on variational inference and constrained optimization to ensure the spatiotemporal validity of the generated trajectories. This novel deep neural network architecture implements generative and inference models with dynamic latent priors. The second approach introduces language model (LM) inspired generation as another benchmarking and foundational approach. The LM-inspired approach treats trajectories as sentences and aims to predict the likelihood of subsequent locations on a trajectory, given the preceding locations as context. As a result, the LM-inspired approach implicitly learns the inherent spatiotemporal structure and other embedded semantics within the trajectories. Extensive experimental evaluations show that the proposed methods achieve substantial quantitative and qualitative improvements over existing approaches.
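The "trajectories as sentences" idea in the abstract above can be illustrated with a deliberately simple stand-in. The sketch below is not the authors' model: it swaps the neural language model for a bigram count model with additive smoothing, and the location names, function names, and the `alpha` parameter are all hypothetical choices made for illustration. It only shows what "predicting the likelihood of the next location given context" means when the context is a single preceding location.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary tokens


def fit_bigram(trajectories):
    """Count location-to-location transitions, treating each trajectory as a sentence."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        tokens = [START, *traj, END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_location_probs(counts, prev, vocab, alpha=1.0):
    """Return additively smoothed P(next | prev) over every token in vocab."""
    total = sum(counts[prev].values()) + alpha * len(vocab)
    return {loc: (counts[prev][loc] + alpha) / total for loc in vocab}


# Toy data: each trajectory is an ordered list of visited (made-up) locations.
trajectories = [
    ["home", "cafe", "office", "home"],
    ["home", "office", "gym", "home"],
    ["home", "cafe", "office", "gym", "home"],
]

# Vocabulary of possible next tokens: all locations plus the end marker.
vocab = sorted({loc for traj in trajectories for loc in traj} | {END})

counts = fit_bigram(trajectories)
probs = next_location_probs(counts, "cafe", vocab)
most_likely = max(probs, key=probs.get)
print(most_likely, round(probs[most_likely], 3))
```

A neural LM would condition on the whole preceding sequence rather than one location, but the interface is the same: a context goes in and a probability distribution over candidate next locations comes out.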
Award ID(s):
2403312 2318831 2127901 2113350
PAR ID:
10588501
Publisher / Repository:
ACM
Journal Name:
ACM Transactions on Spatial Algorithms and Systems
ISSN:
2374-0353
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Mobile devices have become an integral part of our everyday lives. Users' increasing interaction with mobile devices raises significant concerns about various types of potential privacy leakage, among which location privacy draws the most attention. Specifically, mobile users' trajectories constructed from location data may be captured by adversaries to infer sensitive information. In previous studies, differential privacy has been used to protect published trajectory data with a rigorous privacy guarantee. The strong protection provided by differential privacy distorts the original locations or trajectories with stochastic noise to avoid privacy leakage. In this paper, we propose a novel location inference attack framework, iTracker, which simultaneously recovers multiple trajectories from differentially private trajectory data using the structured sparsity model. Compared with traditional recovery methods based on single-trajectory prediction, iTracker exploits the correlation among trajectories discovered by the structured sparsity model and is more effective in recovering multiple private trajectories simultaneously. iTracker successfully attacks existing privacy protection mechanisms based on differential privacy. We theoretically demonstrate the near-linear runtime of iTracker, and experimental results on two real-world datasets show that iTracker outperforms existing recovery algorithms in recovering multiple trajectories.
  2. Human mobility modeling from GPS trajectories and synthetic trajectory generation are crucial for various applications, such as urban planning, disaster management and epidemiology. Both of these tasks often require filling gaps in a partially specified sequence of visits, a new problem that we call “controlled” synthetic trajectory generation. Existing methods for next-location prediction or synthetic trajectory generation cannot solve this problem because they lack the mechanisms needed to constrain the generated sequences of visits. Moreover, existing approaches (1) frequently treat space and time as independent factors, an assumption that fails to hold in real-world scenarios, and (2) suffer from poor temporal prediction accuracy because they fail to deal with mixed distributions and the inter-relationships of different modes with latent variables (e.g., day-of-the-week). These limitations become even more pronounced when the task involves filling gaps within sequences instead of solely predicting the next visit. We introduce TrajGPT, a transformer-based, multi-task, joint spatiotemporal generative model to address these issues. Taking inspiration from large language models, TrajGPT poses the problem of controlled trajectory generation as that of text infilling in natural language. TrajGPT integrates the spatial and temporal models in a transformer architecture through a Bayesian probability model that ensures that the gaps in a visit sequence are filled in a spatiotemporally consistent manner. Our experiments on public and private datasets demonstrate that TrajGPT not only excels in controlled synthetic visit generation but also outperforms competing models in next-location prediction tasks. In relative terms, TrajGPT achieves a 26-fold improvement in temporal accuracy while retaining more than 98% of spatial accuracy on average.
  3. Rapid growth of high-dimensional datasets in fields such as single-cell RNA sequencing and spatial genomics has led to unprecedented opportunities for scientific discovery, but it also presents unique computational and statistical challenges. Traditional methods struggle with geometry-aware data generation, interpolation along meaningful trajectories, and transporting populations via feasible paths. To address these issues, we introduce Geometry-Aware Generative Autoencoder (GAGA), a novel framework that combines extensible manifold learning with generative modeling. GAGA constructs a neural network embedding space that respects the intrinsic geometries discovered by manifold learning and learns a novel warped Riemannian metric on the data space. This warped metric is derived from both the points on the data manifold and negative samples off the manifold, allowing it to characterize a meaningful geometry across the entire latent space. Using this metric, GAGA can uniformly sample points on the manifold, generate points along geodesics, and interpolate between populations across the learned manifold. GAGA shows competitive performance in simulated and real-world datasets, including a 30% improvement over SOTA in single-cell population-level trajectory inference.
  4. Learning explicit and implicit patterns in human trajectories plays an important role in many Location-Based Social Network (LBSN) applications, such as trajectory classification (e.g., walking, driving, etc.), trajectory-user linking, friend recommendation, etc. A particular problem that has attracted much attention recently, and is the focus of our work, is Trajectory-based Social Circle Inference (TSCI), which aims at inferring user social circles (mainly social friendship) based on motion trajectories and without any explicit social network information. Existing approaches to TSCI lack satisfactory results due to challenges related to data sparsity, accessibility and model efficiency. Motivated by the recent success of machine learning in trajectory mining, in this paper we formulate TSCI as a novel multi-label classification problem and develop a Recurrent Neural Network (RNN)-based framework called DeepTSCI that uses human mobility patterns to infer the corresponding social circles. We propose three methods to learn the latent representations of trajectories, based on: (1) bidirectional Long Short-Term Memory (LSTM); (2) Autoencoder; and (3) Variational autoencoder. Experiments conducted on real-world datasets demonstrate that our proposed methods perform well and achieve significant improvement in terms of macro-R, macro-F1 and accuracy when compared to baselines.
  5. We present a novel generative Session-Based Recommendation (SBR) framework, called VAriational SEssion-based Recommendation (VASER), a non-linear probabilistic methodology allowing Bayesian inference for flexible parameter estimation of sequential recommendations. Instead of directly applying extended Variational AutoEncoders (VAE) to SBR, the proposed method introduces normalizing flows to estimate the probabilistic posterior, which is more effective than the agnostic presumed prior approximation used in existing deep generative recommendation approaches. VASER explores a soft attention mechanism to upweight the important clicks in a session. We empirically demonstrate that the proposed model significantly outperforms several state-of-the-art baselines, including the recently proposed RNN/VAE-based approaches, on real-world datasets.