Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

Sinclair, Sean R.; Banerjee, Siddhartha; Lee Yu, Christina

doi:10.1145/3410048.3410059

We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions that are frequently visited in historical trajectories and have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal Q-function and the joint space, without sacrificing the worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require an optimal discretization as input, access to a simulation oracle, or both. Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance compared to both heuristics and Q-learning with uniform discretization.
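
The central idea of the abstract, refining the partition where visits concentrate and payoff estimates are high, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a state-action space equal to the unit square, axis-aligned cells in place of general metric balls, a splitting threshold of (1 / diameter)^2 visits, and children that inherit the parent's estimate and visit count. The names `Cell`, `AdaptivePartition`, `select`, and `update` are hypothetical, and the learning rate and target are supplied by the caller rather than derived from any particular bonus schedule.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare cells by identity so list.remove is unambiguous
class Cell:
    s_lo: float
    s_hi: float
    a_lo: float
    a_hi: float
    q: float          # current (optimistic) Q estimate for the whole cell
    visits: int = 0

    @property
    def diameter(self) -> float:
        return max(self.s_hi - self.s_lo, self.a_hi - self.a_lo)

    def contains_state(self, s: float) -> bool:
        return self.s_lo <= s <= self.s_hi

    def split(self) -> list["Cell"]:
        # Assumption: children inherit the parent's estimate and visit count.
        s_mid = (self.s_lo + self.s_hi) / 2
        a_mid = (self.a_lo + self.a_hi) / 2
        return [
            Cell(s0, s1, a0, a1, self.q, self.visits)
            for (s0, s1) in ((self.s_lo, s_mid), (s_mid, self.s_hi))
            for (a0, a1) in ((self.a_lo, a_mid), (a_mid, self.a_hi))
        ]


class AdaptivePartition:
    def __init__(self, q_init: float):
        # Start from a single cell covering the whole unit square.
        self.cells = [Cell(0.0, 1.0, 0.0, 1.0, q_init)]

    def select(self, s: float) -> Cell:
        # Among cells whose state range contains s, pick the highest estimate.
        relevant = [c for c in self.cells if c.contains_state(s)]
        return max(relevant, key=lambda c: c.q)

    def update(self, cell: Cell, target: float, lr: float) -> None:
        # Generic Q-learning step toward a caller-supplied target.
        cell.visits += 1
        cell.q = (1 - lr) * cell.q + lr * target
        # Assumed splitting rule: refine once visits reach (1 / diameter)^2.
        if cell.visits >= (1.0 / cell.diameter) ** 2:
            self.cells.remove(cell)
            self.cells.extend(cell.split())
```

Under these assumptions, cells that are selected often (because their estimates stay high for the states actually visited) are the ones that get subdivided, while rarely selected regions remain coarse.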

Award ID(s): 1847393 1955997 1948256 1839346

PAR ID: 10241021

Author(s) / Creator(s): Sinclair, Sean R.; Banerjee, Siddhartha; Lee Yu, Christina

Date Published: 2020-07-08

Journal Name: ACM SIGMETRICS Performance Evaluation Review

Volume: 48

Issue: 1

ISSN: 0163-5999

Page Range / eLocation ID: 17 to 18

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1145/3410048.3410059
