NSF PAR Search | NSF Public Access Repository

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

Wu, Jingfeng; Zou, Difan; Braverman, Vladimir; Gu, Quanquan; Kakade, Sham (January 2022, Proceedings of Machine Learning Research)

Full Text Available
Understanding Contrastive Learning Requires Incorporating Inductive Biases

Saunshi, Nikunj; Ash, Jordan; Goel, Surbhi; Misra, Dipendra; Zhang, Cyril; Arora, Sanjeev; Kakade, Sham; Krishnamurthy, Akshay (January 2022, Proceedings of Machine Learning Research)

Full Text Available
Gone Fishing: Neural Active Learning with Fisher Embeddings

Ash, Jordan; Goel, Surbhi; Krishnamurthy, Akshay; Kakade, Sham (January 2021, Advances in neural information processing systems)

Full Text Available
An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap

Wang, Yuanhao; Wang, Ruosong; Kakade, Sham (January 2021, Advances in neural information processing systems)

Full Text Available
Going Beyond Linear RL: Sample Efficient Neural Function Approximation

Huang, Baihe; Huang, Kaixuan; Kakade, Sham; Lee, Jason D; Lei, Qi; Wang, Runzhe; Yang, Jiaqi (January 2021, Advances in neural information processing systems)

Full Text Available
Robust and differentially private mean estimation

Liu, Xiyang; Kong, Weihao; Kakade, Sham; Oh, Sewoong (January 2021, Advances in neural information processing systems)

Full Text Available
What are the Statistical Limits of Offline RL with Linear Function Approximation?

Wang, Ruosong; Foster, Dean; Kakade, Sham M. (January 2021, International Conference on Learning Representations)

Offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of (causal) sequential decision-making strategies. The hope is that offline reinforcement learning, coupled with function approximation methods (to deal with the curse of dimensionality), can help alleviate the excessive sample complexity burden in modern sequential decision-making problems. However, the extent to which this broader approach can be effective is not well understood, and the literature largely consists of sufficient conditions. This work focuses on the basic question of which representational and distributional conditions are necessary to permit provably sample-efficient offline reinforcement learning. Perhaps surprisingly, our main result shows that even if (i) we have realizability, in that the true value function of every policy is linear in a given set of features, and (ii) our off-policy data has good coverage over all features (under a strong spectral condition), any algorithm still (information-theoretically) requires a number of offline samples that is exponential in the problem horizon to non-trivially estimate the value of any given policy. Our results highlight that sample-efficient offline policy evaluation is not possible unless significantly stronger conditions hold; such conditions include either low distribution shift (where the offline data distribution is close to the distribution of the policy to be evaluated) or significantly stronger representational conditions (beyond realizability).
Full Text Available
The Benefits of Implicit Regularization from SGD in Least Squares Problems

Zou, Difan; Wu, Jingfeng; Braverman, Vladimir; Gu, Quanquan; Foster, Dean P.; Kakade, Sham (January 2021, Advances in neural information processing systems)

Full Text Available
LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

Kusupati, Aditya; Wallingford, Matthew; Ramanujan, Vivek; Somani, Raghav; Park, Jae Sung; Pillutla, Krishna; Jain, Prateek; Kakade, Sham; Farhadi, Ali (January 2021, Advances in neural information processing systems)

Full Text Available
Optimal Gradient-based Algorithms for Non-concave Bandit Optimization

Huang, Baihe; Huang, Kaixuan; Kakade, Sham; Lee, Jason D; Lei, Qi; Wang, Runzhe; Yang, Jiaqi (January 2021, Advances in neural information processing systems)

Full Text Available
