Interaction-Aware and Hierarchically-Explainable Heterogeneous Graph-based Imitation Learning for Autonomous Driving Simulation
Understanding and learning the actor-to-X interactions (AXIs), such as those between the focal vehicle (actor) and other traffic participants (e.g., other vehicles, pedestrians) as well as traffic environments (e.g., city/road map), is essential for the development of a decision-making model and simulation of autonomous driving (AD). Existing practices in imitation learning (IL) for AD simulation, despite advances in model learnability, have not accounted for fusing and differentiating the heterogeneous AXIs in complex road environments. Furthermore, how to explain the hierarchical structures within the complex AXIs remains largely under-explored. To overcome these challenges, we propose HGIL, an interaction-aware and hierarchically-explainable Heterogeneous Graph-based Imitation Learning approach for AD simulation. We have designed a novel heterogeneous interaction graph (HIG) to provide local and global representation as well as awareness of the AXIs. Integrating the HIG as the state embeddings, we have designed a hierarchically-explainable generative adversarial imitation learning approach, with local sub-graph and global cross-graph attention, to capture the interaction behaviors and driving decision-making processes. Our data-driven simulation and explanation studies based on the Argoverse v2 dataset (with a total of 40,000 driving scenes) have corroborated the accuracy (e.g., lower displacement errors than state-of-the-art (SOTA) approaches) and explainability of HGIL in learning and capturing the complex AXIs.
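To make the architecture described above more concrete, below is a minimal, illustrative sketch (in PyTorch; not the authors' implementation, and all module names, dimensions, and node types are assumptions) of how a heterogeneous interaction graph could be encoded with local per-type sub-graph attention followed by a global cross-graph attention, with the resulting state embedding feeding a GAIL-style policy head.

```python
# Hedged sketch of the HIG idea: per-type local attention, then global fusion.
# All names and shapes are hypothetical, chosen only to illustrate the design.
import torch
import torch.nn as nn


class LocalSubGraphAttention(nn.Module):
    """Attend from the focal actor to one neighbor type (e.g., vehicles)."""

    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=2, batch_first=True)

    def forward(self, actor, neighbors):
        # actor: (B, 1, dim); neighbors: (B, N, dim) -> (B, 1, dim) summary
        out, _ = self.attn(query=actor, key=neighbors, value=neighbors)
        return out


class HIGEncoder(nn.Module):
    """Fuse per-type local summaries via a global cross-graph attention."""

    def __init__(self, dim, node_types=("vehicle", "pedestrian", "map")):
        super().__init__()
        self.local = nn.ModuleDict({t: LocalSubGraphAttention(dim) for t in node_types})
        self.global_attn = nn.MultiheadAttention(dim, num_heads=2, batch_first=True)

    def forward(self, actor, neighbors_by_type):
        # One local summary per AXI type, stacked into (B, num_types, dim).
        summaries = torch.cat(
            [self.local[t](actor, x) for t, x in neighbors_by_type.items()], dim=1
        )
        # Global cross-graph attention over the per-type summaries.
        state, _ = self.global_attn(query=actor, key=summaries, value=summaries)
        return state.squeeze(1)  # (B, dim) state embedding for the policy


# GAIL-style policy head consuming the HIG state embedding (toy dimensions).
dim = 32
encoder = HIGEncoder(dim)
policy = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 2))
actor = torch.randn(4, 1, dim)  # focal-vehicle features, batch of 4 scenes
neighbors = {t: torch.randn(4, 5, dim) for t in ("vehicle", "pedestrian", "map")}
action = policy(encoder(actor, neighbors))  # e.g., (acceleration, yaw rate)
print(action.shape)  # torch.Size([4, 2])
```

The design point mirrored here is that each AXI type gets its own local attention module before the global fusion, so the model can differentiate, rather than average over, heterogeneous interactions.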
- Award ID(s): 2239897
- PAR ID: 10500772
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- ISBN: 978-1-6654-9190-7
- Page Range / eLocation ID: 3576 to 3581
- Format(s): Medium: X
- Location: Detroit, MI, USA
- Sponsoring Org: National Science Foundation
More Like this
- Objective: This study investigated drivers' subjective feelings and decision making in mixed traffic by quantifying driving style and type of interaction. Background: Human-driven vehicles (HVs) will share the road with automated vehicles (AVs) in mixed traffic. Previous studies focused on simulating the impacts of AVs on traffic flow, investigating car-following situations, and using simulation analysis lacking experimental tests with human drivers. Method: Thirty-six drivers were classified into three driver groups (aggressive, moderate, and defensive) and experienced HV-AV interaction and HV-HV interaction in a supervised web-based experiment. Drivers' subjective feelings and decision making were collected via questionnaires. Results: Aggressive and moderate drivers felt significantly more anxious, less comfortable, and were more likely to behave aggressively in HV-AV interaction than in HV-HV interaction. Aggressive drivers were also more likely to take advantage of AVs on the road. In contrast, no such differences were found for defensive drivers, indicating they were not significantly influenced by the type of vehicle with which they were interacting. Conclusion: Driving style and type of interaction significantly influenced drivers' subjective feelings and decision making in mixed traffic. This study brought insights into how human drivers perceive and interact with AVs and HVs on the road, and how human drivers take advantage of AVs. Application: This study provides a foundation for developing guidelines for mixed transportation systems to improve driver safety and user experience.
- Learning the human-mobility interaction (HMI) on interactive scenes (e.g., how a vehicle turns at an intersection in response to traffic lights and other oncoming vehicles) can enhance the safety, efficiency, and resilience of smart mobility systems (e.g., autonomous vehicles) and many other ubiquitous computing applications. Towards ubiquitous and understandable HMI learning, this paper considers both spoken language (e.g., human textual annotations) and unspoken language (e.g., visual and sensor-based behavioral mobility information related to the HMI scenes) as the information modalities from real-world HMI scenarios. We aim to extract the important but possibly implicit HMI concepts (as named entities) from the textual annotations (provided by human annotators) through a novel human-language and sensor-data co-learning design. To this end, we propose CG-HMI, a novel Cross-modality Graph fusion approach for extracting important human-mobility interaction concepts from co-learning of textual annotations as well as visual and behavioral sensor data. To fuse both unspoken and spoken languages, we have designed a unified representation called the human-mobility interaction graph (HMIG) for each modality related to the HMI scenes, i.e., textual annotations, visual video frames, and behavioral sensor time-series (e.g., from on-board or smartphone inertial measurement units). The nodes of the HMIG in these modalities correspond to the textual words (tokenized for ease of processing) related to HMI concepts, the detected traffic participant/environment categories, and the vehicle maneuver behavior types determined from the behavioral sensor time-series. To extract the inter- and intra-modality semantic correspondences and interactions in the HMIG, we have designed a novel graph interaction fusion approach with differentiable pooling-based graph attention (a minimal sketch of this cross-modality fusion idea appears after this list). The resulting graph embeddings are then processed to identify and retrieve the HMI concepts within the annotations, which can benefit downstream human-computer interaction and ubiquitous computing applications. We have developed and implemented CG-HMI in a system prototype, and performed extensive studies on three real-world HMI datasets (two on car driving and the third on e-scooter riding). We have corroborated the excellent performance (on average 13.11% higher accuracy than the other baselines in terms of precision, recall, and F1 measure) and effectiveness of CG-HMI in recognizing and extracting the important HMI concepts through cross-modality learning. Our CG-HMI studies also provide real-world implications (e.g., road safety and driving behaviors) regarding the interactions between drivers and other traffic participants.
- xGAIL: Explainable Generative Adversarial Imitation Learning for Explainable Human Decision Analysis. To make daily decisions, human agents devise their own "strategies" governing their mobility dynamics (e.g., taxi drivers have preferred working regions and times, and urban commuters have preferred routes and transit modes). Recent research such as generative adversarial imitation learning (GAIL) demonstrates success in learning human decision-making strategies from behavior data using deep neural networks (DNNs), which can accurately mimic how humans behave in various scenarios, e.g., playing video games. However, such DNN-based models are "black box" models in nature, making it hard to explain what knowledge a model has learned from humans and how the model makes its decisions, which had not been addressed in the imitation learning literature. This paper addresses this research gap by proposing xGAIL, the first explainable generative adversarial imitation learning framework. The proposed xGAIL framework consists of two novel components, Spatial Activation Maximization (SpatialAM) and Spatial Randomized Input Sampling Explanation (SpatialRISE), to extract both global and local knowledge from a well-trained GAIL model that explains how a human agent makes decisions (a minimal sketch of the randomized-masking idea behind SpatialRISE appears after this list). In particular, we take taxi drivers' passenger-seeking strategy as an example to validate the effectiveness of the proposed xGAIL framework. Our analysis of large-scale real-world taxi trajectory data shows promising results from two aspects: i) global explainable knowledge of what nearby traffic conditions impel a taxi driver to choose a particular direction to find the next passenger, and ii) local explainable knowledge of what key (sometimes hidden) factors a taxi driver considers when making a particular decision.
- Candes, Emmanuel; Ma, Yi (Eds.): The past few years have witnessed rapid growth in the deployment of automated vehicles (AVs). Clearly, AVs and human-driven vehicles (HVs) will co-exist for many years, and AVs will have to operate around HVs, pedestrians, cyclists, and more, calling for fundamental breakthroughs in AI designed for mixed traffic to achieve mixed autonomy. Thus motivated, we study heterogeneous decision making by AVs and HVs in a mixed traffic environment, aiming to capture the interactions between human and machine decision-making and develop an AI foundation that enables vehicles to operate safely and efficiently. There are a number of challenges to achieving mixed autonomy, including 1) human drivers make driving decisions with bounded rationality, and it remains open to develop accurate models of HVs' decision making; and 2) uncertainty-aware planning plays a critical role in enabling AVs to take safety maneuvers in response to human behavior. In this paper, we introduce a formulation of AV-HV interaction where the HV makes decisions with bounded rationality and the AV employs uncertainty-aware planning based on a prediction of the HV's future actions (a minimal sketch of a standard bounded-rationality choice model appears after this list). We conduct a comprehensive analysis of the AV's and HV's learning regret to answer two questions: 1) How does the learning performance depend on the HV's bounded rationality and the AV's planning? 2) How do different decision-making strategies impact the overall learning performance? Our findings reveal some intriguing phenomena, such as Goodhart's Law in the AV's learning performance and compounding effects in the HV's decision-making process. By examining the dynamics of the regrets, we gain insights into the interplay between human and machine decision making.
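For the CG-HMI entry above, here is a hedged, minimal sketch of the cross-modality graph fusion idea: node embeddings from the text-annotation graph attend to node embeddings from a sensor-derived behavior graph, and the fused text nodes are scored for HMI-concept extraction. All names and dimensions are assumptions, not the paper's code, and the differentiable pooling component is omitted for brevity.

```python
# Cross-modality attention between two modality graphs' node embeddings.
import torch
import torch.nn as nn

dim = 16
cross_attn = nn.MultiheadAttention(dim, num_heads=2, batch_first=True)
concept_scorer = nn.Linear(dim, 1)  # per-token "is an HMI concept" logit

text_nodes = torch.randn(1, 12, dim)   # 12 tokenized annotation words
sensor_nodes = torch.randn(1, 4, dim)  # 4 detected behavior/participant nodes

# Text nodes query the sensor graph: inter-modality interaction in the HMIG.
fused, _ = cross_attn(query=text_nodes, key=sensor_nodes, value=sensor_nodes)
logits = concept_scorer(fused + text_nodes).squeeze(-1)  # residual fusion
print(logits.shape)  # torch.Size([1, 12]); top-scoring tokens -> HMI concepts
```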
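For the xGAIL entry, the sketch below illustrates the generic RISE-style (randomized input sampling) principle that SpatialRISE builds on: randomly mask regions of a spatial input, query the trained model, and credit the regions that appear in high-scoring masks. The function name and the stand-in model are hypothetical; this is not the xGAIL implementation.

```python
# RISE-style local explanation via randomized masking (illustrative only).
import torch

def rise_saliency(model, x, n_masks=500, keep_prob=0.5):
    """x: (C, H, W) spatial state; model: maps (1, C, H, W) -> scalar score."""
    _, h, w = x.shape
    saliency = torch.zeros(h, w)
    for _ in range(n_masks):
        mask = (torch.rand(h, w) < keep_prob).float()
        score = model((x * mask).unsqueeze(0)).item()  # score under the mask
        saliency += score * mask  # regions kept in high-scoring masks matter
    return saliency / (n_masks * keep_prob)

# Toy usage: a stand-in "policy score" over an 8x8 traffic-condition grid.
model = lambda inp: inp[:, 0, :4, :4].sum()  # pretends top-left matters
x = torch.rand(1, 8, 8)
sal = rise_saliency(model, x)
print(sal[:4, :4].mean() > sal[4:, 4:].mean())  # top-left should score higher
```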
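Finally, for the mixed-autonomy entry, one common way to instantiate "bounded rationality" (a standard choice in the literature, not necessarily the paper's exact model) is a quantal-response rule: the HV picks action a with probability proportional to exp(lambda * Q(a)), so lambda approaching infinity recovers a perfectly rational maximizer and lambda approaching 0 yields random behavior. An uncertainty-aware AV planner can then plan against this full action distribution rather than a single predicted action.

```python
# Quantal-response (softmax) bounded-rationality model: an assumed example.
import math

def quantal_response(q_values, lam=2.0):
    """P(a) proportional to exp(lam * Q(a)); lam is the rationality level."""
    exps = [math.exp(lam * q) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

# HV's utilities for {yield, proceed, accelerate}: a boundedly rational human
# usually yields here, but proceeds with non-trivial probability.
print(quantal_response([1.0, 0.6, -0.5]))  # ~[0.67, 0.30, 0.03]
```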