NSF PAR Search | NSF Public Access Repository

Regret analysis of multi-task representation learning for linear-quadratic adaptive control

Lee, Bruce D; Toso, Leonardo F; Zhang, Thomas T; Anderson, James; Matni, Nikolai (February 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Free, publicly-accessible full text available February 1, 2026
Asynchronous Heterogeneous Linear Quadratic Regulator Design

https://doi.org/10.1109/CDC56724.2024.10885806

Toso, Leonardo F; Wang, Han; Anderson, James (December 2024, IEEE)

Free, publicly-accessible full text available December 16, 2025
Clustering Then Estimation of Spatio-Temporal Self-Exciting Processes

https://doi.org/10.1287/ijoc.2022.0351

Zhang, Haoting; Zhan, Donglin; Anderson, James; Righter, Rhonda; Zheng, Zeyu (September 2024, INFORMS Journal on Computing)

We propose a new estimation procedure for general spatio-temporal point processes that include a self-exciting feature. Estimating spatio-temporal self-exciting point processes with observed data is challenging, partly because of the difficulty in computing and optimizing the likelihood function. To circumvent this challenge, we employ a Poisson cluster representation for spatio-temporal self-exciting point processes to simplify the likelihood function and develop a new estimation procedure called “clustering-then-estimation” (CTE), which integrates clustering algorithms with likelihood-based estimation methods. Compared with the widely used expectation-maximization (EM) method, our approach separates the cluster structure inference of the data from the model selection. This has the benefit of reducing the risk of model misspecification. Our approach is computationally more efficient because it does not need to recursively solve optimization problems, which would be needed for EM. We also present asymptotic statistical results for our approach as theoretical support. Experimental results on several synthetic and real data sets illustrate the effectiveness of the proposed CTE procedure.
Full Text Available
Oracle Complexity Reduction for Model-Free LQR: A Stochastic Variance-Reduced Policy Gradient Approach

https://doi.org/10.23919/ACC60939.2024.10644167

Toso, Leonardo F; Wang, Han; Anderson, James (July 2024, IEEE)

Full Text Available
Meta-learning linear quadratic regulators: a policy gradient MAML approach for model-free LQR

Toso, Leonardo; Zhan, Donglin; Anderson, James; Wang, Han (July 2024, Proceedings of Machine Learning Research (PMLR))
Abate, A; Cannon, M; Margellos, K; Papachristodoulou, A (Ed.)
We investigate the problem of learning linear quadratic regulators (LQR) in a multi-task, heterogeneous, and model-free setting. We characterize the stability and personalization guarantees of a policy gradient-based (PG) model-agnostic meta-learning (MAML) (Finn et al., 2017) approach for the LQR problem under different task-heterogeneity settings. We show that our MAML-LQR algorithm produces a stabilizing controller close to each task-specific optimal controller up to a task-heterogeneity bias in both model-based and model-free learning scenarios. Moreover, in the model-based setting, we show that such a controller is achieved with a linear convergence rate, which improves upon sub-linear rates from existing work. Our theoretical guarantees demonstrate that the learned controller can efficiently adapt to unseen LQR tasks.
Full Text Available
Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments

Wang, Han; He, Shone; Zhang, Zhili; Miao, Fei; Anderson, James (May 2024, Proceedings of Machine Learning Research)

We explore a Federated Reinforcement Learning (FRL) problem where agents collaboratively learn a common policy without sharing their trajectory data. To date, existing FRL work has primarily focused on agents operating in the same or "similar" environments. In contrast, our problem setup allows for arbitrarily large levels of environment heterogeneity. To obtain the optimal policy which maximizes the average performance across all potentially completely different environments, we propose two algorithms: FedSVRPG-M and FedHAPG-M. In contrast to existing results, we demonstrate that both FedSVRPG-M and FedHAPG-M, both of which leverage momentum mechanisms, can exactly converge to a stationary point of the average performance function, regardless of the magnitude of environment heterogeneity. Furthermore, by incorporating the benefits of variance-reduction techniques or Hessian approximation, both algorithms achieve state-of-the-art convergence results, characterized by a sample complexity of $O(\epsilon^{-\frac{3}{2}}/N)$. Notably, our algorithms enjoy linear convergence speedups with respect to the number of agents, highlighting the benefit of collaboration among agents in finding a common policy.
Full Text Available
Data-Efficient and Robust Task Selection for Meta-Learning

Zhan, Donglin; Anderson, James (April 2024, The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024: The 7th Workshop on Efficient Deep Learning for Computer Vision (ECV))

Meta-learning methods typically learn tasks under the assumption that all tasks are equally important. However, this assumption is often not valid. In real-world applications, tasks can vary both in their importance during different training stages and in whether they contain noisy labeled data or not, making a uniform approach suboptimal. To address these issues, we propose the Data-Efficient and Robust Task Selection (DERTS) algorithm, which can be incorporated into both gradient and metric-based meta-learning algorithms. DERTS selects weighted subsets of tasks from task pools by minimizing the approximation error of the full gradient of task pools in the meta-training stage. The selected tasks are efficient for rapid training and robust towards noisy label scenarios. Unlike existing algorithms, DERTS does not require any architecture modification for training and can handle noisy label data in both the support and query sets. Analysis of DERTS shows that the algorithm follows similar training dynamics as learning on the full task pools. Experiments show that DERTS outperforms existing sampling strategies for meta-learning on both gradient-based and metric-based meta-learning algorithms in limited data budget and noisy task settings.
Full Text Available
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning

Zhang, Chenyu; Wang, Han; Mitra, Aritra; Anderson, James (January 2024, The Twelfth International Conference on Learning Representations (ICLR))

Federated reinforcement learning (FRL) has emerged as a promising paradigm for reducing the sample complexity of reinforcement learning tasks by exploiting information from different agents. However, when each agent interacts with a potentially different environment, little to nothing is known theoretically about the non-asymptotic performance of FRL algorithms. The lack of such results can be attributed to various technical challenges and their intricate interplay: Markovian sampling, linear function approximation, multiple local updates to save communication, heterogeneity in the reward functions and transition kernels of the agents’ MDPs, and continuous state-action spaces. Moreover, in the on-policy setting, the behavior policies vary with time, further complicating the analysis. In response, we introduce FedSARSA, a novel federated on-policy reinforcement learning scheme, equipped with linear function approximation, to address these challenges and provide a comprehensive finite-time error analysis. Notably, we establish that FedSARSA converges to a policy that is near-optimal for all agents, with the extent of near-optimality proportional to the level of heterogeneity. Furthermore, we prove that FedSARSA leverages agent collaboration to enable linear speedups as the number of agents increases, which holds for both fixed and adaptive step-size configurations.
Full Text Available
Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates

Lan, Guangchen; Wang, Han; Anderson, James; Brinton, Christopher; Aggarwal, Vaneet (December 2023, Neural Information Processing Systems)
Oh, A; Naumann, T; Globerson, A; Saenko, K; Hardt, M; Levine, S (Ed.)
Federated reinforcement learning (FedRL) enables agents to collaboratively train a global policy without sharing their individual data. However, high communication overhead remains a critical bottleneck, particularly for natural policy gradient (NPG) methods, which are second-order. To address this issue, we propose the FedNPG-ADMM framework, which leverages the alternating direction method of multipliers (ADMM) to approximate global NPG directions efficiently. We theoretically demonstrate that using ADMM-based gradient updates reduces communication complexity from $O(d^2)$ to $O(d)$ at each iteration, where $d$ is the number of model parameters. Furthermore, we show that achieving an $\epsilon$-error stationary convergence requires $O(\frac{1}{(1-\gamma)^2 \epsilon})$ iterations for discount factor $\gamma$, demonstrating that FedNPG-ADMM maintains the same convergence rate as standard FedNPG. Through evaluation of the proposed algorithms in MuJoCo environments, we demonstrate that FedNPG-ADMM maintains the reward performance of standard FedNPG, and that its convergence rate improves when the number of federated agents increases.
Full Text Available
Model-free Learning with Heterogeneous Dynamical Systems: A Federated LQR Approach

Wang, H.; Toso, L.F.; Mitra, A.; Anderson, J. (August 2023, arXiv.org)

We study a model-free federated linear quadratic regulator (LQR) problem where M agents with unknown, distinct yet similar dynamics collaboratively learn an optimal policy to minimize an average quadratic cost while keeping their data private. To exploit the similarity of the agents' dynamics, we propose to use federated learning (FL) to allow the agents to periodically communicate with a central server to train policies by leveraging a larger dataset from all the agents. With this setup, we seek to understand the following questions: (i) Is the learned common policy stabilizing for all agents? (ii) How close is the learned common policy to each agent's own optimal policy? (iii) Can each agent learn its own optimal policy faster by leveraging data from all agents? To answer these questions, we propose a federated and model-free algorithm named FedLQR. Our analysis overcomes numerous technical challenges, such as heterogeneity in the agents' dynamics, multiple local updates, and stability concerns. We show that FedLQR produces a common policy that, at each iteration, is stabilizing for all agents. We provide bounds on the distance between the common policy and each agent's local optimal policy. Furthermore, we prove that when learning each agent's optimal policy, FedLQR achieves a sample complexity reduction proportional to the number of agents M in a low-heterogeneity regime, compared to the single-agent setting.
Full Text Available
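
The federated setup described in the FedLQR abstract above (several agents with similar dynamics, multiple local policy updates, periodic averaging at a server) can be illustrated with a minimal sketch. This is not the FedLQR algorithm itself: the abstract describes a model-free method, whereas the sketch below assumes each agent knows its own dynamics and uses the exact LQR policy gradient, purely to keep the example short and deterministic. The system matrices, the step size `eta`, the number of local steps, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def lyap(F, W):
    """Solve X = W + F X F^T for a stable F (spectral radius < 1)."""
    n = F.shape[0]
    # Row-major vectorization: vec(F X F^T) = (F kron F) vec(X).
    x = np.linalg.solve(np.eye(n * n) - np.kron(F, F), W.reshape(-1))
    return x.reshape(n, n)

def cost_and_grad(K, A, B, Q, R):
    """LQR cost trace(P_K) and its exact gradient for the policy u = -K x.

    Assumes the initial state has identity covariance and that A - B K is stable.
    """
    Acl = A - B @ K
    P = lyap(Acl.T, Q + K.T @ R @ K)          # cost-to-go matrix
    Sigma = lyap(Acl, np.eye(A.shape[0]))     # state correlation matrix
    grad = 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    return float(np.trace(P)), grad

def spectral_radius(F):
    return float(np.max(np.abs(np.linalg.eigvals(F))))

def average_cost(K, systems, B, Q, R):
    return float(np.mean([cost_and_grad(K, A, B, Q, R)[0] for A in systems]))

def federated_round(K, systems, B, Q, R, eta, local_steps):
    """Each agent runs a few local gradient steps; the server averages the results."""
    local_policies = []
    for A in systems:
        K_i = K.copy()
        for _ in range(local_steps):
            _, g = cost_and_grad(K_i, A, B, Q, R)
            K_i = K_i - eta * g
        local_policies.append(K_i)
    return np.mean(local_policies, axis=0)

# Hypothetical agents: a stable nominal system plus small diagonal perturbations.
A0 = np.array([[0.9, 0.1], [0.0, 0.8]])
D = np.diag([1.0, -1.0])
systems = [A0 + d * D for d in (-0.02, 0.0, 0.02)]
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

K = np.zeros((1, 2))  # K = 0 is stabilizing because every A_i is stable
cost_before = average_cost(K, systems, B, Q, R)
for _ in range(50):
    K = federated_round(K, systems, B, Q, R, eta=0.005, local_steps=2)
cost_after = average_cost(K, systems, B, Q, R)
```

Comparing `cost_after` with `cost_before`, and checking `spectral_radius(A - B @ K)` for each agent, gives a rough numerical analogue of questions (i) and (ii) in the abstract for this toy instance only; it says nothing about the paper's guarantees.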
