Title: TrajGAIL: Trajectory Generative Adversarial Imitation Learning for Long-Term Decision Analysis
Mobile sensing and information technology have enabled us to collect large amounts of mobility data from human decision-makers, for example, GPS trajectories from taxis and Uber cars, and passenger trip data from buses and trains. Understanding and learning human decision-making strategies from such data can potentially promote individuals' well-being and improve transportation service quality. Existing works on human strategy learning, such as inverse reinforcement learning, all model the decision-making process as a Markov decision process and thus assume the Markov property. In this work, we show that the Markov property does not hold in real-world human decision-making processes. To tackle this challenge, we develop a Trajectory Generative Adversarial Imitation Learning (TrajGAIL) framework. It captures long-term decision dependency by modeling human decision processes as variable-length Markov decision processes (VLMDPs), and it uses a deep-neural-network-based design to inversely learn the decision-making strategy from the human agent's historical dataset. We validate our framework using two real-world human-generated spatial-temporal datasets: taxi driver passenger-seeking decision data and public transit trip data. Results demonstrate significant accuracy improvement in learning human decision-making strategies compared with baselines that assume the Markov property.
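The claim that the Markov property may not hold can be illustrated with a small diagnostic. The sketch below is not the TrajGAIL method and is not taken from the paper; it is a minimal illustration, assuming trajectories are lists of (state, action) pairs, that compares the empirical action distribution given only the current state with the one given the previous and current states. The function names `action_distributions` and `max_history_gap` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def action_distributions(trajectories, order):
    """Empirical P(action | last `order` states) from (state, action) trajectories."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        states = [s for s, _ in traj]
        for t, (_, action) in enumerate(traj):
            if t + 1 < order:
                continue  # not enough history at this step
            context = tuple(states[t + 1 - order : t + 1])
            counts[context][action] += 1
    return {
        ctx: {a: n / sum(c.values()) for a, n in c.items()}
        for ctx, c in counts.items()
    }


def max_history_gap(trajectories):
    """Largest total-variation distance between P(a | s) and P(a | s_prev, s).

    A gap near 0 is consistent with the Markov property on this data;
    a large gap suggests the previous state carries extra information.
    """
    first = action_distributions(trajectories, order=1)
    second = action_distributions(trajectories, order=2)
    gap = 0.0
    for ctx, dist2 in second.items():
        dist1 = first[(ctx[-1],)]
        actions = set(dist1) | set(dist2)
        tv = 0.5 * sum(abs(dist1.get(a, 0.0) - dist2.get(a, 0.0)) for a in actions)
        gap = max(gap, tv)
    return gap


# Toy data (hypothetical): the action taken in state "C" depends on
# whether the agent arrived from "A" or from "B".
history_dependent = [
    [("A", "go"), ("C", "left")],
    [("B", "go"), ("C", "right")],
] * 10

# Toy data (hypothetical): the action in "C" is the same regardless of history.
markovian = [
    [("A", "go"), ("C", "left")],
    [("B", "go"), ("C", "left")],
] * 10

print(max_history_gap(history_dependent))  # 0.5
print(max_history_gap(markovian))          # 0.0
```

On real data such a gap would need a statistical test and enough samples per context. This sketch only shows the kind of dependence on history that a variable-length model is meant to capture.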
Award ID(s):
1942680 1952085 1831140
PAR ID:
10225176
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
2020 IEEE International Conference on Data Mining (ICDM)
Page Range / eLocation ID:
801 to 810
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. To make daily decisions, human agents devise their own "strategies" governing their mobility dynamics (e.g., taxi drivers have preferred working regions and times, and urban commuters have preferred routes and transit modes). Recent research such as generative adversarial imitation learning (GAIL) has demonstrated success in learning human decision-making strategies from behavior data using deep neural networks (DNNs), which can accurately mimic how humans behave in various scenarios, e.g., playing video games. However, such DNN-based models are "black box" models in nature, making it hard to explain what knowledge the models have learned from humans and how the models make such decisions, a problem that has not been addressed in the imitation learning literature. This paper addresses this research gap by proposing xGAIL, the first explainable generative adversarial imitation learning framework. The proposed xGAIL framework consists of two novel components, Spatial Activation Maximization (SpatialAM) and Spatial Randomized Input Sampling Explanation (SpatialRISE), which extract both global and local knowledge from a well-trained GAIL model that explains how a human agent makes decisions. In particular, we take taxi drivers' passenger-seeking strategy as an example to validate the effectiveness of the proposed xGAIL framework. Our analysis of large-scale real-world taxi trajectory data shows promising results from two aspects: i) global explainable knowledge of what nearby traffic conditions impel a taxi driver to choose a particular direction to find the next passenger, and ii) local explainable knowledge of what key (sometimes hidden) factors a taxi driver considers when making a particular decision.
  2. Partially Observable Markov Decision Processes (POMDPs) can model complex sequential decision-making problems under stochastic and uncertain environments. A main reason hindering their broad adoption in real-world applications is the unavailability of a suitable POMDP model or a simulator thereof. Available solution algorithms, such as Reinforcement Learning (RL), typically benefit from the knowledge of the transition dynamics and the observation generating process, which are often unknown and non-trivial to infer. In this work, we propose a combined framework for inference and robust solution of POMDPs via deep RL. First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model, which is conditioned on actions, in order to recover full posterior distributions from the available data. The POMDP with uncertain parameters is then solved via deep RL techniques with the parameter distributions incorporated into the solution via domain randomization, in order to develop solutions that are robust to model uncertainty. As a further contribution, we compare the use of Transformers and long short-term memory networks, which constitute model-free RL solutions and work directly on the observation space, with an approach termed the belief-input method, which works on the belief space by exploiting the learned POMDP model for belief inference. We apply these methods to the real-world problem of optimal maintenance planning for railway assets and compare the results with the current real-life policy. We show that the RL policy learned by the belief-input method is able to outperform the real-life policy by yielding significantly reduced life-cycle costs.
  3. Public transit services, such as buses and subway lines, offer affordable ride-sharing and reduce road network traffic, and thus have a significant impact in mitigating urban traffic congestion. However, it is non-trivial to evaluate the future ridership of a new transit plan, such as a new bus route or a new subway line, prior to actual deployment, since the travel preferences of passengers along the planned routes may vary. In this paper, we make the first attempt to model passengers' preferences in making various transit choices using a Markov Decision Process (MDP). Moreover, we develop a novel inverse preference learning algorithm to infer the passengers' preferences and predict the future human behavior changes, e.g., ridership, of a new urban transit plan before its deployment. We validate our proposed framework using a unique real-world dataset (from Shenzhen, China) with three subway lines opened during the data time span. With the data collected both before and after the transit plan deployments, our evaluation results demonstrate that the proposed framework can predict the ridership with only 19.8% relative error, which is 23%-51% lower than other baseline approaches.
  4. AI-assisted decision-making systems hold immense potential to enhance human judgment, but their effectiveness is often hindered by a lack of understanding of the diverse ways in which humans take AI recommendations. Current research frequently relies on simplified, "one-size-fits-all" models to characterize an average human decision-maker, thus failing to capture the heterogeneity of people's decision-making behavior when incorporating AI assistance. To address this, we propose Mix and Match (M&M), a novel computational framework that explicitly models the diversity of human decision-makers and their unique patterns of relying on AI assistance. M&M represents the population of decision-makers as a mixture of distinct decision-making processes, with each process corresponding to a specific type of decision-maker. This approach enables us to infer latent behavioral patterns from limited data of human decisions under AI assistance, offering valuable insights into the cognitive processes underlying human-AI collaboration. Using real-world behavioral data, our empirical evaluation demonstrates that M&M consistently outperforms baseline methods in predicting human decision behavior. Furthermore, through a detailed analysis of the decision-maker types identified in our framework, we provide quantitative insights into nuanced patterns of how different individuals adopt AI recommendations. These findings offer implications for designing personalized and effective AI systems based on the diverse landscape of human behavior patterns in AI-assisted decision-making across various domains.
  5. Smart passenger-seeking strategies employed by taxi drivers contribute not only to drivers' incomes but also to a higher quality of service for passengers. Therefore, understanding taxi drivers' behaviors and learning good passenger-seeking strategies are crucial to boosting taxi drivers' well-being and the quality of public transportation service. However, we observe that drivers' preferences in choosing which area to search for the next passenger are diverse and dynamic across locations and drivers. It is hard to learn the location-dependent preferences given partial data (i.e., an individual driver's trajectory may not cover all locations). In this paper, we make the first attempt to develop a conditional generative adversarial imitation learning (cGAIL) model as a unifying collective inverse reinforcement learning framework that learns drivers' decision-making preferences and policies by transferring knowledge across taxi driver agents and across locations. Our evaluation results on three months of taxi GPS trajectory data in Shenzhen, China, demonstrate that the drivers' preferences and policies learned from cGAIL are on average 34.7% more accurate than those learned from other state-of-the-art baseline approaches.