NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Multiplayer Performative Prediction: Learning in Decision-Dependent Games

Narang, Adhyyan; Faulkner, Evan; Drusvyatskiy, Dmitriy; Fazel, Maryam; Ratliff, Lillian J. (July 2023, Journal of machine learning research)

Full Text Available
Instance-dependent Sample Complexity Bounds for Zero-sum Matrix Games

Maiti, Arnab; Jamieson, Kevin; Ratliff, Lillian J. (January 2023, Proceedings of Machine Learning Research)
Francisco Ruiz, Jennifer Dy (Ed.)
We study the sample complexity of identifying an approximate equilibrium for two-player zero-sum n × 2 matrix games. That is, in a sequence of repeated game plays, how many rounds must the two players play before reaching an approximate equilibrium (e.g., Nash)? We derive instance-dependent bounds that define an ordering over game matrices that captures the intuition that the dynamics of some games converge faster than others. Specifically, we consider a stochastic observation model such that when the two players choose actions i and j, respectively, they both observe each other’s played actions and a stochastic observation Xij such that E [Xij ] = Aij . To our knowledge, our work is the first case of instance-dependent lower bounds on the number of rounds the players must play before reaching an approximate equilibrium in the sense that the number of rounds depends on the specific properties of the game matrix A as well as the desired accuracy. We also prove a converse statement: there exist player strategies that achieve this lower bound.
more » « less
Full Text Available
Decision-Dependent Risk Minimization in Geometrically Decaying Dynamic Environments

Mitas Ray; Lillian J. Ratliff; Dmitriy Drusvyatskiy; Maryam Fazel (April 2022, Proceedings of the AAAI Conference on Artificial Intelligence)

Full Text Available
Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

Wagenmaker, Andrew J; Chen, Yifang; Simchowitz, Max; Du, Simon; Jamieson, Kevin (January 2022, Proceedings of Machine Learning Research)

Full Text Available
Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design

Wagenmaker, Andrew; Jamieson, Kevin (January 2022, Advances in neural information processing systems)
Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A. (Ed.)
While much progress has been made in understanding the minimax sample complexity of reinforcement learning (RL)—the complexity of learning on the “worst-case” instance—such measures of complexity often do not capture the true difficulty of learning. In practice, on an “easy” instance, we might hope to achieve a complexity far better than that achievable on the worst-case instance. In this work we seek to understand the “instance-dependent” complexity of learning near-optimal policies (PAC RL) in the setting of RL with linear function approximation. We propose an algorithm, Pedel, which achieves a fine-grained instance-dependent measure of complexity, the first of its kind in the RL with function approximation setting, thereby capturing the difficulty of learning on each particular problem instance. Through an explicit example, we show that Pedel yields provable gains over low-regret, minimax-optimal algorithms and that such algorithms are unable to hit the instance-optimal rate. Our approach relies on a novel online experiment design-based procedure which focuses the exploration budget on the “directions” most relevant to learning a near-optimal policy, and may be of independent interest.
more » « less
Full Text Available
Instance-optimal PAC Algorithms for Contextual Bandits

Li, Zhaoqi; Ratliff, Lillian; Nassif, Houssam; Jamieson, Kevin; Jain, Lalit (January 2022, Advances in neural information processing systems)
Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A. (Ed.)
In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied. In this work, we focus on the stochastic bandit problem in the (ǫ, δ)-PAC setting: given a policy class Π the goal of the learner is to return a policy π ∈ Π whose expected reward is within ǫ of the optimal policy with probability greater than 1 − δ. We characterize the first instance-dependent PAC sample complexity of contextual bandits through a quantity ρΠ, and provide matching upper and lower bounds in terms of ρΠ for the agnostic and linear contextual best-arm identification settings. We show that no algorithm can be simultaneously minimax-optimal for regret minimization and instance-dependent PAC for best-arm identification. Our main result is a new instance-optimal and computationally efficient algorithm that relies on a polynomial number of calls to an argmax oracle.
more » « less
Full Text Available
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

Wagenmaker, Andrew J; Chen, Yifang; Simchowitz, Max; Du, Simon; Jamieson, Kevin (January 2022, Proceedings of Machine Learning Research)

Full Text Available
Learning in Stochastic Monotone Games with Decision-Dependent Data

Adhyyan Narang; Evan Faulkner; Dmitriy Drusvyatskiy; Maryam Fazel; Lillian J. Ratlif (January 2022, International Conference on Artificial Intelligence and Statistics (AISTATS))

Full Text Available
High-Dimensional Experimental Design and Kernel Bandits

Camilleri, Romain; Jamieson, Kevin; Katz-Samuels, Julian (January 2021, Proceedings of Machine Learning Research)

Full Text Available
Improved corruption robust algorithms for episodic reinforcement learning

Chen, Yifang; Du, Simon S; Jamieson, Kevin (January 2021, Proceedings of Machine Learning Research)
null (Ed.)
Full Text Available

« Prev Next »

Search for: All records