NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

AdaWM: Adaptive World Model based Planning for Autonomous Driving

Wang, Hang; Ye, Xin; Tao, Feng; Pan, Chenbin; Mallik, Abhirup; Yaman, Burhaneddin; Ren, Liu Ren; Zhang, Junshan (April 2025, https://iclr.cc/)

World model based reinforcement learning (RL) has emerged as a promising approach for autonomous driving, which learns a latent dynamics model and uses it to train a planning policy. To speed up the learning process, the pretrain-finetune paradigm is often used, where online RL is initialized by a pretrained model and a policy learned offline. However, naively performing such initialization in RL may result in dramatic performance degradation during the online interactions in the new task. To tackle this challenge, we first analyze the performance degradation and identify two primary root causes therein: the mismatch of the planning policy and the mismatch of the dynamics model, due to distribution shift. We further analyze the effects of these factors on performance degradation during finetuning, and our findings reveal that the choice of finetuning strategies plays a pivotal role in mitigating these effects. We then introduce AdaWM, an Adaptive World Model based planning method, featuring two key steps: (a) mismatch identification, which quantifies the mismatches and informs the finetuning strategy, and (b) alignment-driven finetuning, which selectively updates either the policy or the model as needed using efficient low-rank updates. Extensive experiments on the challenging CARLA driving tasks demonstrate that AdaWM significantly improves the finetuning process, resulting in more robust and efficient .
more » « less
Free, publicly-accessible full text available April 24, 2026
CarDreamer: Open-Source Learning Platform for World-Model-Based Autonomous Driving

https://doi.org/10.1109/JIOT.2024.3479088

Gao, Dechen; Cai, Shuangyu; Zhou, Hanchu; Wang, Hang; Soltani, Iman; Zhang, Junshan (February 2025, IEEE Internet of Things Journal)

Free, publicly-accessible full text available February 1, 2026
Communication-Efficient Training Workload Balancing for Decentralized Multi-Agent Learning

Mohammadabadi, S; Yang, L; Yan, F; Zhang, J (July 2024, IEEE)

Full Text Available
Towards Resource-Efficient Edge AI: From Federated Learning to Semi-supervised Model Personalization

Zhang, Zhaofeng; Yue, Sheng; Zhang, Junshan (May 2024, IEEE transactions on mobile computing)

Full Text Available
Towards Resource-Efficient Edge AI: From Federated Learning to Semi-Supervised Model Personalization

https://doi.org/10.1109/TMC.2023.3316189

Zhang, Zhaofeng; Yue, Sheng; Zhang, Junshan (May 2024, IEEE Transactions on Mobile Computing)

Full Text Available
Collaboration in Participant-Centric Federated Learning: A Game-Theoretical Perspective

https://doi.org/10.1109/TMC.2022.3194198

Huang, Guangjing; Chen, Xu; Ouyang, Tao; Ma, Qian; Chen, Lin; Zhang, Junshan (November 2023, IEEE Transactions on Mobile Computing)

Full Text Available
HiFlash: Communication-Efficient Hierarchical Federated Learning With Adaptive Staleness Control and Heterogeneity-Aware Client-Edge Association

https://doi.org/10.1109/TPDS.2023.3238049

Wu, Qiong; Chen, Xu; Ouyang, Tao; Zhou, Zhi; Zhang, Xiaoxi; Yang, Shusen; Zhang, Junshan (May 2023, IEEE Transactions on Parallel and Distributed Systems)

Full Text Available
Communication-Efficient Distributed Learning: An Overview

https://doi.org/10.1109/JSAC.2023.3242710

Cao, Xuanyu; Başar, Tamer; Diggavi, Suhas; Eldar, Yonina C.; Letaief, Khaled B.; Poor, H. Vincent; Zhang, Junshan (April 2023, IEEE Journal on Selected Areas in Communications)

Full Text Available
TRGP: Trust Region Gradient Projection for Continual Learning

Lin, Sen; Yang, Li; Fan, Deliang; Zhang, Junshan (April 2022, The Tenth International Conference on Learning Representations)

Catastrophic forgetting is one of the major challenges in continual learning. To address this issue, some existing methods put restrictive constraints on the optimization space of the new task for minimizing the interference to old tasks. However, this may lead to unsatisfactory performance for the new task, especially when the new task is strongly correlated with old tasks. To tackle this challenge, we propose Trust Region Gradient Projection (TRGP) for continual learning to facilitate the forward knowledge transfer based on an efficient characterization of task correlation. Particularly, we introduce a notion of 'trust region' to select the most related old tasks for the new task in a layer-wise and single-shot manner, using the norm of gradient projection onto the subspace spanned by task inputs. Then, a scaled weight projection is proposed to cleverly reuse the frozen weights of the selected old tasks in the trust region through a layer-wise scaling matrix. By jointly optimizing the scaling matrices and the model, where the model is updated along the directions orthogonal to the subspaces of old tasks, TRGP can effectively prompt knowledge transfer without forgetting. Extensive experiments show that our approach achieves significant improvement over related state-of-the-art methods.
more » « less
Full Text Available
Model-Based Offline Meta-Reinforcement Learning with Regularization

Lin, Sen; Wan, Jialin; Xu, Tengyu; Liang, Yingbin; Zhang. Junshan (April 2022, The Tenth International Conference on Learning Representations)

Existing offline reinforcement learning (RL) methods face a few major challenges, particularly the distributional shift between the learned policy and the behavior policy. Offline Meta-RL is emerging as a promising approach to address these challenges, aiming to learn an informative meta-policy from a collection of tasks. Nevertheless, as shown in our empirical studies, offline Meta-RL could be outperformed by offline single-task RL methods on tasks with good quality of datasets, indicating that a right balance has to be delicately calibrated between "exploring" the out-of-distribution state-actions by following the meta-policy and "exploiting" the offline dataset by staying close to the behavior policy. Motivated by such empirical analysis, we propose model-based offline ta-RL with regularized policy optimization (MerPO), which learns a meta-model for efficient task structure inference and an informative meta-policy for safe exploration of out-of-distribution state-actions. In particular, we devise a new meta-Regularized model-based Actor-Critic (RAC) method for within-task policy optimization, as a key building block of MerPO, using both conservative policy evaluation and regularized policy improvement; and the intrinsic tradeoff therein is achieved via striking the right balance between two regularizers, one based on the behavior policy and the other on the meta-policy. We theoretically show that the learnt policy offers guaranteed improvement over both the behavior policy and the meta-policy, thus ensuring the performance improvement on new tasks via offline Meta-RL. Our experiments corroborate the superior performance of MerPO over existing offline Meta-RL methods.
more » « less
Full Text Available

« Prev Next »

Search for: All records