Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient

Yin, Ming; Wang, Mengdi; Wang, Yu-Xiang


Offline reinforcement learning, which aims to optimize sequential decision-making strategies using historical data, has been extensively applied in real-life applications. State-of-the-art algorithms usually leverage powerful function approximators (e.g., neural networks) to alleviate the sample complexity hurdle and achieve better empirical performance. Despite these successes, a more systematic understanding of the statistical complexity of function approximation remains lacking. Towards bridging this gap, we take a step by considering offline reinforcement learning with differentiable function class approximation (DFA). This function class naturally incorporates a wide range of models with nonlinear/nonconvex structures. We show that offline RL with differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning (PFQL) algorithm, and our results provide the theoretical basis for understanding a variety of practical heuristics that rely on Fitted Q-Iteration style design. In addition, we further improve our guarantee with a tighter instance-dependent characterization. We hope our work could draw interest in studying reinforcement learning with differentiable function approximation beyond the scope of current research.

Award ID(s): 2007117; 2003257

PAR ID: 10466950

Author(s) / Creator(s): Yin, Ming; Wang, Mengdi; Wang, Yu-Xiang

Publisher / Repository: International Conference on Learning Representations

Date Published: 2023-02-01

Journal Name: International Conference on Learning Representations

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper:
The DOI is not currently available.
