Search for: All records

Award ID contains: 2239897

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites. Their policies may differ from this site's.

  1. Understanding and learning the actor-to-X interactions (AXIs), such as those between the focal vehicle (actor) and other traffic participants (e.g., other vehicles and pedestrians) as well as traffic environments (e.g., the city or road map), is essential for developing decision-making models and simulations of autonomous driving. Existing practices in imitation learning (IL) for autonomous driving simulation, despite advances in model learnability, have not accounted for fusing and differentiating the heterogeneous AXIs in complex road environments. Furthermore, how to explain the hierarchical structures within the complex AXIs remains largely under-explored. To meet these challenges, we propose HGIL, an interaction-aware and hierarchically-explainable Heterogeneous Graph-based Imitation Learning approach for autonomous driving simulation. We have designed a novel heterogeneous interaction graph (HIG) to provide local and global representation as well as awareness of the AXIs. Integrating the HIG as the state embeddings, we have designed a hierarchically-explainable generative adversarial imitation learning approach, with local sub-graph and global cross-graph attention, to capture the interaction behaviors and driving decision-making processes (see the first sketch after this list). Our data-driven simulation and explanation studies based on the Argoverse v2 dataset (with a total of 40,000 driving scenes) have corroborated the accuracy (e.g., lower displacement errors compared to the state-of-the-art (SOTA) approaches) and explainability of HGIL in learning and capturing the complex AXIs.
    Free, publicly-accessible full text available December 12, 2025
  2. Free, publicly-accessible full text available December 2, 2025
  3. Electric (e-)scooters have emerged as a popular, ubiquitous, first/last-mile micromobility transportation option within and across many cities worldwide. With increasing situation-awareness and on-board computational capability, such intelligent micromobility has become a critical means of understanding the rider's interactions with other traffic constituents (called Rider-to-X Interactions, RXIs), such as pedestrians, cars, and other micromobility vehicles, as well as road environments, including curbs, road infrastructures, and traffic signs. How to interpret these complex, dynamic, and context-dependent RXIs, particularly for rider-centric understanding across different data modalities (visual, behavioral, and textual), is essential for enabling a safer and more comfortable micromobility riding experience and for the greater good of urban transportation networks. Under a naturalistic riding setting (i.e., without any unnatural constraint on the rider's decision-making and maneuvering), we have designed, implemented, and evaluated a pilot Cross-modality E-scooter Naturalistic Riding Understanding System, namely CENRUS, from a human-centered AI perspective. We have conducted an extensive study with CENRUS in sensing, analyzing, and understanding the behavioral, visual, and textual annotation data of RXIs during naturalistic riding. We have also designed a novel, efficient, and usable disentanglement mechanism to conceptualize and understand the e-scooter naturalistic riding processes, and we have conducted extensive human-centered AI model studies. We have performed multiple downstream tasks enabled by the core model within CENRUS to derive human-centered AI understandings and insights into complex RXIs, showcasing downstream tasks such as efficient information retrieval and scene understanding (see the second sketch after this list). CENRUS can serve as a foundational system for safe and easy-to-use micromobility rider assistance as well as accountable use of micromobility vehicles.
  4. Driver maneuver interaction learning (DMIL) refers to the classification task of identifying different driver-vehicle maneuver interactions (e.g., left/right turns). Existing studies have largely focused on centralized collection of sensor data from drivers' smartphones (e.g., inertial measurement units, or IMUs, such as the accelerometer and gyroscope). Such a centralized mechanism may be precluded by data regulatory constraints. Furthermore, enabling an adaptive and accurate DMIL framework remains challenging due to (i) the complexity of heterogeneous driver maneuver patterns, and (ii) the impact of anomalous driver maneuvers caused by, for instance, aggressive driving styles and behaviors. To overcome the above challenges, we propose AF-DMIL, an Anomaly-aware Federated Driver Maneuver Interaction Learning system. We focus on real-world IMU sensor datasets (e.g., collected by smartphones) for our pilot case study. In particular, we have designed three heterogeneous representations for AF-DMIL covering the spectral, time-series, and statistical features derived from the IMU sensor readings. We have designed a novel heterogeneous representation attention network (HetRANet) based on spectral channel attention, temporal sequence attention, and statistical feature learning mechanisms, jointly capturing and identifying the complex patterns within driver maneuver behaviors. Furthermore, we have designed a densely-connected convolutional neural network in HetRANet to enable complex feature extraction and enhance computational efficiency. In addition, we have designed within AF-DMIL a novel anomaly-aware federated learning approach for decentralized DMIL in response to anomalous maneuver data (see the third sketch after this list). To ease extraction of the maneuver patterns and evaluation of their mutual differences, we have designed an embedding projection network that projects the high-dimensional driver maneuver features into a low-dimensional space and further derives exemplars that represent the driver maneuver patterns for mutual comparison. AF-DMIL then leverages the mutual differences of the exemplars to identify those that exhibit anomalous patterns and deviate from the others, and mitigates their impact upon the federated DMIL. We have conducted extensive driver data analytics and experimental studies on three real-world datasets (one harvested on our own) to evaluate the AF-DMIL prototype, demonstrating its accuracy and effectiveness compared to state-of-the-art DMIL baselines (on average more than 13% improvement in DMIL accuracy), as well as fewer communication rounds (on average 29.20% fewer than existing distributed learning mechanisms).
  5. Understanding and learning the actor-to-X interactions (AXIs), such as those between the focal vehicles (actor) and other traffic participants (e.g., other vehicles, pedestrians) as well as traffic environments (e.g., city/road map), is essential for the development of a decision-making model and simulation of autonomous driving (AD). Existing practices on imitation learning (IL) for AD simulation, despite the advances in the model learnability, have not accounted for fusing and differentiating the heterogeneous AXIs in complex road environments. Furthermore, how to further explain the hierarchical structures within the complex AXIs remains largely under-explored. To overcome these challenges, we propose HGIL, an interaction-aware and hierarchically-explainable Heterogeneous Graph-based Imitation Learning approach for AD simulation. We have designed a novel heterogeneous interaction graph (HIG) to provide local and global representation as well as awareness of the AXIs. Integrating the HIG as the state embeddings, we have designed a hierarchically-explainable generative adversarial imitation learning approach, with local sub-graph and global cross-graph attention, to capture the interaction behaviors and driving decision-making processes. Our data-driven simulation and explanation studies have corroborated the accuracy and explainability of HGIL in learning and capturing the complex AXIs.
  6. Learning the human-mobility interaction (HMI) in interactive scenes (e.g., how a vehicle turns at an intersection in response to traffic lights and other oncoming vehicles) can enhance the safety, efficiency, and resilience of smart mobility systems (e.g., autonomous vehicles) and many other ubiquitous computing applications. Toward ubiquitous and understandable HMI learning, this paper considers both spoken language (e.g., human textual annotations) and unspoken language (e.g., visual and sensor-based behavioral mobility information related to the HMI scenes) as the information modalities from real-world HMI scenarios. We aim to extract the important but possibly implicit HMI concepts (as named entities) from the textual annotations (provided by human annotators) through a novel human-language and sensor-data co-learning design. To this end, we propose CG-HMI, a novel Cross-modality Graph fusion approach for extracting important human-mobility interaction concepts from co-learning of textual annotations as well as visual and behavioral sensor data. To fuse both unspoken and spoken languages, we have designed a unified representation called the human-mobility interaction graph (HMIG) for each modality related to the HMI scenes, i.e., textual annotations, visual video frames, and behavioral sensor time-series (e.g., from on-board or smartphone inertial measurement units). The nodes of the HMIG in these modalities correspond to the textual words (tokenized for ease of processing) related to HMI concepts, the detected traffic participant/environment categories, and the vehicle maneuver behavior types determined from the behavioral sensor time-series. To extract the inter- and intra-modality semantic correspondences and interactions in the HMIG, we have designed a novel graph interaction fusion approach with differentiable pooling-based graph attention (see the last sketch after this list). The resulting graph embeddings are then processed to identify and retrieve the HMI concepts within the annotations, which can benefit downstream human-computer interaction and ubiquitous computing applications. We have developed and implemented CG-HMI as a system prototype and performed extensive studies on three real-world HMI datasets (two on car driving and one on e-scooter riding). We have corroborated the excellent performance (on average 13.11% higher than the baselines in terms of precision, recall, and F1 measure) and effectiveness of CG-HMI in recognizing and extracting important HMI concepts through cross-modality learning. Our CG-HMI studies also provide real-world implications (e.g., for road safety and driving behaviors) regarding the interactions between drivers and other traffic participants.
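The sketches below are illustrative reading aids, not the authors' implementations. First, for the HGIL records (items 1 and 5): a minimal numpy sketch of a heterogeneous interaction graph (HIG) with local sub-graph attention per AXI type and global cross-graph attention over the per-type summaries. The node features, embedding width, and use of plain scaled dot-product attention are assumptions; the surrounding generative adversarial imitation learning loop is omitted.

```python
# Minimal sketch of a heterogeneous interaction graph (HIG) with
# local sub-graph and global cross-graph attention (hypothetical
# shapes and feature choices; not the authors' implementation).
import numpy as np

rng = np.random.default_rng(0)
D = 16  # embedding width (assumed)

# Typed sub-graphs of the HIG: each AXI type holds its own node set.
hig = {
    "actor_vehicle":  rng.normal(size=(1, D)),   # focal actor
    "other_vehicles": rng.normal(size=(4, D)),
    "pedestrians":    rng.normal(size=(3, D)),
    "map_elements":   rng.normal(size=(6, D)),   # lanes, crosswalks, ...
}

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys):
    """Scaled dot-product attention of one query over a node set."""
    weights = softmax(keys @ query / np.sqrt(D))
    return weights @ keys, weights

actor = hig["actor_vehicle"][0]

# Local sub-graph attention: summarize each AXI type w.r.t. the actor.
local_summaries = {}
for etype, nodes in hig.items():
    local_summaries[etype], _ = attend(actor, nodes)

# Global cross-graph attention: fuse the per-type summaries into one
# state embedding; the weights give a type-level explanation.
summary_mat = np.stack(list(local_summaries.values()))
state, global_weights = attend(actor, summary_mat)

print("state embedding shape:", state.shape)
for etype, w in zip(hig, global_weights):
    print(f"global attention on {etype}: {w:.3f}")
```

In HGIL the resulting state embedding would feed the imitation-learning policy; here it is simply printed.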
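Second, for CENRUS (item 3): a minimal sketch of the kind of cross-modal information retrieval the abstract names as a downstream task, with riding scenes embedded in a shared space and ranked by cosine similarity to a query embedding. The encoders, scene labels, and all embeddings here are stand-ins; CENRUS's disentanglement mechanism is not reproduced.

```python
# Minimal sketch of cross-modality retrieval over naturalistic-riding
# scene embeddings (illustrative only; CENRUS's actual encoders and
# disentanglement mechanism are not reproduced here).
import numpy as np

rng = np.random.default_rng(1)
D, N = 32, 5  # shared embedding width and number of riding scenes (assumed)

# Pretend these came from visual/behavioral encoders projected into a
# shared space; in CENRUS they would be learned jointly with text.
scene_embeddings = rng.normal(size=(N, D))
scene_labels = ["yield to pedestrian", "curb hop", "pass parked car",
                "stop at sign", "swerve around rider"]

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def retrieve(query_embedding, k=2):
    """Return the top-k scenes by cosine similarity to the query."""
    sims = normalize(scene_embeddings) @ normalize(query_embedding)
    top = np.argsort(-sims)[:k]
    return [(scene_labels[i], float(sims[i])) for i in top]

# A textual-annotation query, here stubbed with a random embedding.
query = rng.normal(size=D)
print(retrieve(query))
```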
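Third, for AF-DMIL (item 4): a minimal sketch of anomaly-aware federated aggregation, in which each client's maneuver exemplar is scored by its mean distance to the other clients' exemplars, and clients beyond a robust cutoff are down-weighted before the updates are averaged. The cutoff (median plus three median absolute deviations) and the 0.1 down-weight are illustrative assumptions; AF-DMIL's exemplar derivation via the embedding projection network is stubbed with random vectors.

```python
# Minimal sketch of anomaly-aware federated aggregation: clients whose
# maneuver exemplars deviate from the cohort are down-weighted before
# averaging (thresholds and weights are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(2)
n_clients, P, E = 6, 10, 4  # clients, model params, exemplar dim (assumed)

client_updates = rng.normal(size=(n_clients, P))  # local model deltas
exemplars = rng.normal(size=(n_clients, E))       # projected maneuver exemplars
exemplars[5] += 6.0                               # one anomalous (e.g., aggressive) client

# Each client's anomaly score: mean distance of its exemplar to the others.
dists = np.linalg.norm(exemplars[:, None, :] - exemplars[None, :, :], axis=-1)
scores = dists.sum(axis=1) / (n_clients - 1)

# Down-weight clients whose score exceeds a robust cutoff (median + 3 MAD).
med = np.median(scores)
mad = np.median(np.abs(scores - med))
weights = np.where(scores <= med + 3.0 * mad, 1.0, 0.1)
weights /= weights.sum()

global_update = weights @ client_updates  # anomaly-aware weighted average
print("aggregation weights:", np.round(weights, 3))
```

The robust cutoff keeps a single aggressive driver's update from dominating the federated average while still letting it contribute a little.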
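Last, for CG-HMI (item 6): a minimal sketch of cross-modality graph fusion, where text-token nodes of the HMIG attend over visual and behavioral nodes, and tokens with strong cross-modal grounding are flagged as candidate HMI concepts. Differentiable pooling and the trained encoders are omitted; the node features, the additive fusion, and the mean-based threshold are assumptions for illustration.

```python
# Minimal sketch of cross-modality graph fusion for HMI-concept
# extraction: text tokens attend over visual and behavioral nodes,
# and strongly grounded tokens are flagged (features and threshold
# are illustrative assumptions, not CG-HMI's trained model).
import numpy as np

rng = np.random.default_rng(3)
D = 16  # shared node-embedding width (assumed)

tokens = ["the", "car", "yields", "to", "pedestrian"]
text_nodes = rng.normal(size=(len(tokens), D))  # HMIG: textual modality
visual_nodes = rng.normal(size=(4, D))          # detected participant categories
behavior_nodes = rng.normal(size=(3, D))        # maneuver types from IMU series

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys):
    """Each query node attends over another modality's nodes."""
    w = softmax(queries @ keys.T / np.sqrt(D), axis=-1)
    return w @ keys

# Fuse the unspoken-language modalities into each text-token node.
fused = (text_nodes
         + cross_attention(text_nodes, visual_nodes)
         + cross_attention(text_nodes, behavior_nodes))

# Score tokens by cross-modal grounding; high scores suggest HMI concepts.
grounding = np.linalg.norm(fused - text_nodes, axis=-1)
threshold = grounding.mean()
for tok, g in zip(tokens, grounding):
    print(f"{tok:>10s}  grounding={g:.2f}  concept={g > threshold}")
```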