NSF PAR Search | NSF Public Access Repository

CUQDS: Conformal Uncertainty Quantification under Distribution Shift for Trajectory Prediction

Huang, Huiqun; He, Sihong; Miao, Fei (February 2025, Association for the Advancement of Artificial Intelligence, AAAI 2025)

Free, publicly-accessible full text available February 4, 2026
Policy Optimization for Robust Average Reward MDPs

Sun, Zhongchang; He, Sihong; Miao, Fei; Zou, Shaofeng (September 2024, NeurIPS 2024)

Full Text Available
Constrained Reinforcement Learning Under Model Mismatch

Sun, Zhongchang; He, Sihong; Miao, Fei; Zou, Shaofeng (May 2024, Proceedings of the 41st International Conference on Machine Learning, ICML 2024)

Full Text Available
Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments

Wang, Han; He, Sihong; Zhang, Zhili; Anderson, James (May 2024, Proceedings of the 41st International Conference on Machine Learning, ICML 2024)

Full Text Available
What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?

Han, Songyang; Su, Sanbao; He, Sihong; Han, Shuo; Yang, Haizhao; Zou, Shaofeng; Miao, Fei (February 2024, Transactions on Machine Learning Research)

Various methods for Multi-Agent Reinforcement Learning (MARL) have been developed with the assumption that agents' policies are based on accurate state information. However, policies learned through Deep Reinforcement Learning (DRL) are susceptible to adversarial state perturbation attacks. In this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to investigate different solution concepts of MARL under state uncertainties. Our analysis shows that the commonly used solution concepts of optimal agent policy and robust Nash equilibrium do not always exist in SAMGs. To circumvent this difficulty, we consider a new solution concept called robust agent policy, where agents aim to maximize the worst-case expected state value. We prove the existence of a robust agent policy for finite-state and finite-action SAMGs. Additionally, we propose a Robust Multi-Agent Adversarial Actor-Critic (RMA3C) algorithm to learn robust policies for MARL agents under state uncertainties. Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. Our code is publicly available at https://songyanghan.github.io/what_is_solution/.
Full Text Available
A Multi-Agent Reinforcement Learning Approach for Safe and Efficient Behavior Planning of Connected Autonomous Vehicles

https://doi.org/10.1109/TITS.2023.3336670

Han, Songyang; Zhou, Shanglin; Wang, Jiangwei; Pepin, Lynn; Ding, Caiwen; Fu, Jie; Miao, Fei (December 2023, IEEE Transactions on Intelligent Transportation Systems)

Full Text Available
Surrogate Lagrangian Relaxation: A Path to Retrain-Free Deep Neural Network Pruning

https://doi.org/10.1145/3624476

Zhou, Shanglin; Bragin, Mikhail A.; Gurevin, Deniz; Pepin, Lynn; Miao, Fei; Ding, Caiwen (November 2023, ACM Transactions on Design Automation of Electronic Systems)

Network pruning is a widely used technique to reduce computation cost and model size for deep neural networks. However, the typical three-stage pipeline (i.e., training, pruning, and retraining (fine-tuning)) significantly increases the overall training time. In this article, we develop a systematic weight-pruning optimization approach based on surrogate Lagrangian relaxation (SLR), which is tailored to overcome difficulties caused by the discrete nature of the weight-pruning problem. We further prove that our method ensures fast convergence of the model compression problem, and the convergence of the SLR is accelerated by using quadratic penalties. Model parameters obtained by SLR during the training phase are much closer to their optimal values as compared to those obtained by other state-of-the-art methods. We evaluate our method on image classification tasks using CIFAR-10 and ImageNet with state-of-the-art multi-layer perceptron based networks such as MLP-Mixer; attention-based networks such as Swin Transformer; and convolutional neural network based models such as VGG-16, ResNet-18, ResNet-50, ResNet-110, and MobileNetV2. We also evaluate object detection and segmentation tasks on COCO, the KITTI benchmark, and the TuSimple lane detection dataset using a variety of models. Experimental results demonstrate that our SLR-based weight-pruning optimization approach achieves a higher compression rate than state-of-the-art methods under the same accuracy requirement and also can achieve higher accuracy under the same compression rate requirement. Under classification tasks, our SLR approach converges to the desired accuracy × faster on both of the datasets. Under object detection and segmentation tasks, SLR also converges 2× faster to the desired accuracy. Further, our SLR achieves high model accuracy even at the hard-pruning stage without retraining, which reduces the traditional three-stage pruning into a two-stage process. Given a limited budget of retraining epochs, our approach quickly recovers the model’s accuracy.
Full Text Available
Towards Safe Autonomy in Hybrid Traffic: Detecting Unpredictable Abnormal Behaviors of Human Drivers via Information Sharing

https://doi.org/10.1145/3616398

Wang, Jiangwei; Su, Lili; Han, Songyang; Song, Dongjin; Miao, Fei (September 2023, ACM Transactions on Cyber-Physical Systems)

Hybrid traffic, which involves both autonomous and human-driven vehicles, will be the norm in autonomous-vehicle practice for some time. On the one hand, unlike autonomous vehicles, human-driven vehicles can exhibit sudden abnormal behaviors, such as unpredictably switching to dangerous driving modes, putting neighboring vehicles at risk; such undesired mode switching can arise from a number of human-driver factors, including fatigue, drunkenness, distraction, and aggressiveness. On the other hand, modern vehicle-to-vehicle (V2V) communication technologies enable autonomous vehicles to efficiently and reliably share scarce run-time information with each other [1]. In this paper, we propose, to the best of our knowledge, the first efficient algorithm that can (1) significantly improve trajectory prediction by effectively fusing the run-time information shared by surrounding autonomous vehicles, and (2) accurately and quickly detect abnormal human driving mode switches or abnormal driving behavior with formal assurance, without compromising human drivers’ privacy. To validate our proposed algorithm, we first evaluate our trajectory predictor on the NGSIM and Argoverse datasets and show that it outperforms the baseline methods. Then, through extensive experiments on the SUMO simulator, we show that our algorithm achieves strong detection performance in both highway and urban traffic. The best performance achieves a detection rate of 97.3%, an average detection delay of 1.2 s, and 0 false alarms.
Full Text Available
Robust Multi-Agent Reinforcement Learning with Adversarial State Uncertainties

He, Sihong; Han, Songyang; Su, Sanbao; Han, Shuo; Zou, Shaofeng; Miao, Fei (June 2023, Transactions on Machine Learning Research)

Full Text Available
Spatial-Temporal-Aware Safe Multi-Agent Reinforcement Learning of Connected Autonomous Vehicles in Challenging Scenarios

https://doi.org/10.1109/ICRA48891.2023.10161216

Zhang, Zhili; Han, Songyang; Wang, Jiangwei; Miao, Fei (May 2023, IEEE)

Full Text Available
