Search Results
Search for: All records
Total Resources: 5
Filter by Author / Creator
- Foster, D. (3)
- Kakade, S. (3)
- Foster, D (2)
- Bai, Y (1)
- Cordingly, R. (1)
- Daskalakis, C (1)
- Golowich, N (1)
- Golowich, N. (1)
- Hatchett, R. (1)
- Hoang, V. (1)
- Jiang, N (1)
- Lloyd, W. (1)
- Perez, D. (1)
- Qian, J (1)
- Rakhlin, A (1)
- Sadeghi, Z. (1)
- Xie, T (1)
- Yu, H. (1)
- Foster, D.; Golowich, N.; Kakade, S. (Proceedings of the International Conference on Machine Learning)
- Foster, D.; Kakade, S.; Qian, J.; Rakhlin, A. (arXiv preprint)
- Daskalakis, C.; Foster, D.; Golowich, N. (34th Annual Conference on Neural Information Processing Systems (NeurIPS 2020))
  Abstract: We obtain global, non-asymptotic convergence guarantees for independent learning algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum stochastic games). We consider an episodic setting where, in each episode, each player independently selects a policy and observes only their own actions and rewards, along with the state. We show that if both players run policy gradient methods in tandem, their policies will converge to a min-max equilibrium of the game, as long as their learning rates follow a two-timescale rule (which is necessary). To the best of our knowledge, this constitutes the first finite-sample convergence result for independent policy gradient methods in competitive RL; prior work has largely focused on centralized, coordinated procedures for equilibrium computation. (An illustrative sketch of the two-timescale scheme appears after the results list.)
- Cordingly, R.; Yu, H.; Hoang, V.; Perez, D.; Foster, D.; Sadeghi, Z.; Hatchett, R.; Lloyd, W. (2020 6th IEEE International Conference on Cloud and Big Data Computing (CBDCOM 2020))
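
The abstract of the Daskalakis, Foster, and Golowich entry above describes independent policy gradient learning under a two-timescale learning-rate rule. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' code or their full setting: it substitutes a 2x2 zero-sum matrix game (matching pennies) for an episodic stochastic game, and all names and constants (theta1, eta1, the 0.05/0.005 rates) are illustrative assumptions. Each player samples from its own softmax policy, observes only its own reward, and takes a REINFORCE-style gradient step, with one learning rate much smaller than the other.

```python
import numpy as np

rng = np.random.default_rng(0)

# Payoff matrix for player 1 in matching pennies; player 2 receives -A
# (zero-sum). This toy game stands in for a zero-sum stochastic game.
A = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

theta1 = np.zeros(2)          # logits of player 1's policy
theta2 = np.zeros(2)          # logits of player 2's policy
eta1, eta2 = 0.05, 0.005      # two-timescale rule: eta2 << eta1 (illustrative values)

for t in range(200000):
    p1, p2 = softmax(theta1), softmax(theta2)
    a1 = rng.choice(2, p=p1)  # each player samples independently
    a2 = rng.choice(2, p=p2)
    r1 = A[a1, a2]            # each player sees only its own reward
    r2 = -r1
    # REINFORCE step: gradient of log softmax at the sampled action
    theta1 += eta1 * r1 * (np.eye(2)[a1] - p1)
    theta2 += eta2 * r2 * (np.eye(2)[a2] - p2)

print("player 1 policy:", softmax(theta1))
print("player 2 policy:", softmax(theta2))
```

The unique min-max equilibrium of matching pennies is uniform play, so after many episodes both printed distributions should drift toward [0.5, 0.5]. The paper's actual guarantees are non-asymptotic and cover general episodic zero-sum stochastic games, which this toy loop does not capture.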