Control Regularization for Reduced Variance Reinforcement Learning

Cheng, Richard; Verma, Abhinav; Orosz, Gabor; Chaudhuri, Swarat; Yue, Yisong; Burdick, Joel

Citation Details

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a control prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the prior policy has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a wide range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
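To make the idea of regularizing in function space concrete, the sketch below blends the action of a learned policy with the action of a control prior using a single non-negative weight. It is only an illustration under stated assumptions, not the authors' implementation: the names `mix_actions`, `prior_controller`, `lam`, and `gain`, the proportional controller used as the prior, and the specific weighting formula are all hypothetical choices made for this example.

```python
import numpy as np


def mix_actions(u_learned, u_prior, lam):
    """Blend a learned action with a control-prior action.

    lam = 0 returns the learned action unchanged; larger lam pulls the
    result toward the prior. The weighting is an assumption made for
    illustration only.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    u_learned = np.asarray(u_learned, dtype=float)
    u_prior = np.asarray(u_prior, dtype=float)
    return (u_learned + lam * u_prior) / (1.0 + lam)


def prior_controller(state, gain=1.5):
    # Hypothetical proportional controller standing in for a control prior.
    return -gain * np.asarray(state, dtype=float)


state = np.array([0.4, -0.2])
u_rl = np.array([1.0, 1.0])        # placeholder for a deep policy's output
u_prior = prior_controller(state)  # action suggested by the prior
u_mixed = mix_actions(u_rl, u_prior, lam=1.0)
```

Under this particular weighting, and assuming a deterministic prior, the seed-to-seed variance of the blended action shrinks by a factor of 1/(1 + lam)^2 relative to the learned action alone, while the blended action is biased toward the prior. This is one simple way to see the kind of bias-variance trade-off the abstract refers to.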

Award ID(s): 1704883

PAR ID: 10100393

Author(s) / Creator(s): Cheng, Richard; Verma, Abhinav; Orosz, Gabor; Chaudhuri, Swarat; Yue, Yisong; Burdick, Joel

Date Published: 2019-01-01

Journal Name: Proceedings of Machine Learning Research

Volume: 97

ISSN: 2640-3498

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
The DOI is not currently available.
