NSF PAR Search | NSF Public Access Repository


Concentration of contractive stochastic approximation: Additive and multiplicative noise

https://doi.org/10.1214/24-AAP2143

Chen, Zaiwei; Maguluri, Siva Theja; Zubeldia, Martin (April 2025, The Annals of Applied Probability)

Free, publicly-accessible full text available April 1, 2026
Last-Iterate Convergence for Generalized Frank-Wolfe in Monotone Variational Inequalities

Chen, Zaiwei; Mazumdar, Eric (September 2024, The Thirty-eighth Annual Conference on Neural Information Processing Systems)

Full Text Available
Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games

https://doi.org/10.1145/3670865.3673491

Chen, Zaiwei; Zhang, Kaiqing; Mazumdar, Eric; Ozdaglar, Asuman; Wierman, Adam (July 2024, ACM)

Full Text Available
Target Network and Truncation Overcome the Deadly Triad in Q-Learning

https://doi.org/10.1137/22M1499261

Chen, Zaiwei; Clarke, John-Paul; Maguluri, Siva Theja (December 2023, SIAM Journal on Mathematics of Data Science)

Full Text Available
A Lyapunov Theory for Finite-Sample Guarantees of Markovian Stochastic Approximation

https://doi.org/10.1287/opre.2022.0249

Chen, Zaiwei; Maguluri, Siva T; Shakkottai, Sanjay; Shanmugam, Karthikeyan (October 2023, Operations Research)

This paper develops a unified Lyapunov framework for finite-sample analysis of a Markovian stochastic approximation (SA) algorithm under a contraction operator with respect to an arbitrary norm. The main novelty lies in the construction of a valid Lyapunov function called the generalized Moreau envelope. The smoothness and an approximation property of the generalized Moreau envelope enable us to derive a one-step Lyapunov drift inequality, which is the key to establishing the finite-sample bounds. Our SA result has wide applications, especially in the context of reinforcement learning (RL). Specifically, we show that a large class of value-based RL algorithms can be modeled in the exact form of our Markovian SA algorithm. Therefore, our SA results immediately imply finite-sample guarantees for popular RL algorithms such as n-step temporal difference (TD) learning, TD(λ), off-policy V-trace, and Q-learning. As a byproduct, by analyzing the convergence bounds of n-step TD and TD(λ), we provide theoretical insight into the efficiency of bootstrapping. Moreover, our finite-sample bounds for off-policy V-trace explicitly capture the tradeoff between the variance of the stochastic iterates and the bias in the limit.
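For readers unfamiliar with the setting, the following is a minimal sketch of one simple instance of stochastic approximation driven by Markovian samples: tabular TD(0) policy evaluation along a single trajectory. It is not taken from the paper. The two-state chain, the rewards, the discount factor, the step-size schedule, and the function names td0_markovian and exact_values are all assumptions chosen for illustration.

```python
import random


def td0_markovian(P, r, gamma, num_steps, step_size, seed=0):
    """Tabular TD(0) along one trajectory of a Markov chain (hypothetical helper).

    Samples are correlated because each state depends on the previous one,
    which is what "Markovian noise" refers to in this sketch.
    """
    rng = random.Random(seed)
    n = len(P)
    v = [0.0] * n
    s = 0
    for k in range(num_steps):
        # Draw the next state from row s of the transition matrix.
        s_next = rng.choices(range(n), weights=P[s])[0]
        alpha = step_size(k)
        # Only the coordinate of the visited state is updated.
        v[s] += alpha * (r[s] + gamma * v[s_next] - v[s])
        s = s_next
    return v


def exact_values(P, r, gamma, sweeps=2000):
    """Fixed point of the Bellman operator, found by repeated application.

    The operator is a gamma-contraction in the sup-norm, so the iteration
    converges geometrically.
    """
    n = len(P)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [r[i] + gamma * sum(P[i][j] * v[j] for j in range(n))
             for i in range(n)]
    return v


# Illustrative two-state chain (assumed values, not from the paper).
P = [[0.9, 0.1], [0.2, 0.8]]
r = [1.0, 0.0]
gamma = 0.5

v_hat = td0_markovian(P, r, gamma, 100_000, lambda k: 10.0 / (k + 100.0))
v_star = exact_values(P, r, gamma)
err = max(abs(a - b) for a, b in zip(v_hat, v_star))
print("sup-norm error after 100000 steps:", err)
```

The diminishing step size 10 / (k + 100) is one arbitrary choice. Finite-sample bounds of the kind described in the abstract quantify how such an error depends on the number of iterations and the step-size schedule.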
Full Text Available
A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games

Chen, Zaiwei; Zhang, Kaiqing; Mazumdar, Eric; Ozdaglar, Asuman; Wierman, Adam (September 2023, Advances in Neural Information Processing Systems)

Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning

https://doi.org/10.1016/j.automatica.2022.110623

Chen, Zaiwei; Zhang, Sheng; Doan, Thinh T.; Clarke, John-Paul; Maguluri, Siva Theja (December 2022, Automatica)

Full Text Available