NSF PAR Search | NSF Public Access Repository

Hybrid reinforcement learning breaks sample size barriers in linear MDPs

Tan, Kevin; Fan, Wei; Wei, Yuting (December 2024, Neural Information Processing Systems)

Full Text Available

Accelerating convergence of score-based diffusion models, provably

Li, Gen; Huang, Yu; Efimov, Timofey; Wei, Yuting; Chi, Yuejie; Chen, Yuxin (July 2024, International Conference on Machine Learning)

Theoretical insights for diffusion guidance: A case study for Gaussian mixture models

Wu, Yuchen; Chen, Minshuo; Li, Zihao; Wang, Mengdi; Wei, Yuting (July 2024, International Conference on Machine Learning)

Towards Non-Asymptotic Convergence for Diffusion-Based Generative Models

Li, Gen; Wei, Yuting; Chen, Yuxin; Chi, Yuejie (May 2024, The Twelfth International Conference on Learning Representations)

Full Text Available

Settling the sample complexity of model-based offline reinforcement learning

https://doi.org/10.1214/23-AOS2342

Li, Gen; Shi, Laixi; Chen, Yuxin; Chi, Yuejie; Wei, Yuting (February 2024, The Annals of Statistics)

Full Text Available

Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model

https://doi.org/10.1287/opre.2023.2451

Li, Gen; Wei, Yuting; Chi, Yuejie; Chen, Yuxin (January 2024, Operations Research)

This paper studies a central issue in modern reinforcement learning, the sample efficiency, and makes progress toward solving an idealistic scenario that assumes access to a generative model or a simulator. Despite a large number of prior works tackling this problem, a complete picture of the trade-offs between sample complexity and statistical accuracy has yet to be determined. In particular, all prior results suffer from a severe sample size barrier in the sense that their claimed statistical guarantees hold only when the sample size exceeds some enormous threshold. The current paper overcomes this barrier and fully settles this problem; more specifically, we establish the minimax optimality of the model-based approach for any given target accuracy level. To the best of our knowledge, this work delivers the first minimax-optimal guarantees that accommodate the entire range of sample sizes (beyond which finding a meaningful policy is information theoretically infeasible).
Full Text Available

Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis

https://doi.org/10.1287/opre.2023.2450

Li, Gen; Cai, Changxiao; Chen, Yuxin; Wei, Yuting; Chi, Yuejie (January 2024, Operations Research)

This paper investigates a model-free algorithm of broad interest in reinforcement learning, namely, Q-learning. Whereas substantial progress had been made toward understanding the sample efficiency of Q-learning in recent years, it remained largely unclear whether Q-learning is sample-optimal and how to sharpen the sample complexity analysis of Q-learning. In this paper, we settle these questions: (1) When there is only a single action, we show that Q-learning (or, equivalently, TD learning) is provably minimax optimal. (2) When there are at least two actions, our theory unveils the strict suboptimality of Q-learning and rigorizes the negative impact of overestimation in Q-learning. Our theory accommodates both the synchronous case (i.e., the case in which independent samples are drawn) and the asynchronous case (i.e., the case in which one only has access to a single Markovian trajectory).
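For readers unfamiliar with the synchronous setting mentioned in the abstract, the following is a minimal sketch of a synchronous Q-learning update, in which one independent next-state sample is drawn for every state-action pair at each iteration. It is not taken from the paper: the function name `synchronous_q_learning`, the table-based inputs `P` and `R`, and the constant step size are all illustrative assumptions.

```python
import random


def synchronous_q_learning(P, R, gamma, num_iters, step_size, seed=0):
    """Illustrative synchronous Q-learning on a small tabular MDP.

    P[s][a] is a list of next-state probabilities and R[s][a] is a
    deterministic reward. All names and the constant step size are
    assumptions made for this sketch, not the paper's setup.
    """
    rng = random.Random(seed)
    num_states = len(P)
    states = list(range(num_states))
    Q = [[0.0 for _ in P[s]] for s in range(num_states)]
    for _ in range(num_iters):
        # Greedy value estimate computed from the previous iterate.
        V = [max(row) for row in Q]
        new_Q = [row[:] for row in Q]
        for s in range(num_states):
            for a in range(len(P[s])):
                # One independent sample per (state, action) pair.
                s_next = rng.choices(states, weights=P[s][a])[0]
                target = R[s][a] + gamma * V[s_next]
                new_Q[s][a] = (1 - step_size) * Q[s][a] + step_size * target
        Q = new_Q
    return Q


# Single state, single action, reward 1, discount 0.5: the fixed point of
# Q = 1 + 0.5 * Q is Q = 2.
Q = synchronous_q_learning([[[1.0]]], [[1.0]], 0.5, 200, 0.5)
print(Q[0][0])
```

With a single action the maximum over actions is trivial, so the update coincides with TD learning, which is the equivalence the abstract refers to.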
Full Text Available

High-probability sample complexities for policy evaluation with linear function approximation

https://doi.org/10.1109/TIT.2024.3394685

Li, Gen; Wu, Weichen; Chi, Yuejie; Ma, Cong; Rinaldo, Alessandro; Wei, Yuting (January 2024, IEEE Transactions on Information Theory)

Full Text Available

The Lasso with general Gaussian designs with applications to hypothesis testing

https://doi.org/10.1214/23-AOS2327

Celentano, Michael; Montanari, Andrea; Wei, Yuting (October 2023, The Annals of Statistics)

Full Text Available

Approximate message passing from random initialization with applications to Z₂ synchronization

https://doi.org/10.1073/pnas.2302930120

Li, Gen; Fan, Wei; Wei, Yuting (August 2023, Proceedings of the National Academy of Sciences)

This paper is concerned with the problem of reconstructing an unknown rank-one matrix with prior structural information from noisy observations. While computing the Bayes optimal estimator is intractable in general due to the requirement of computing high-dimensional integrations/summations, Approximate Message Passing (AMP) emerges as an efficient first-order method to approximate the Bayes optimal estimator. However, the theoretical underpinnings of AMP remain largely unavailable when it starts from random initialization, a scheme of critical practical utility. Focusing on a prototypical model called Z₂ synchronization, we characterize the finite-sample dynamics of AMP from random initialization, uncovering its rapid global convergence. Our theory, which is nonasymptotic in nature, unveils in this model the non-necessity of a careful initialization for the success of AMP.
Full Text Available