NSF PAR Search | NSF Public Access Repository

Neurosymbolic Reinforcement Learning with Formally Verified Exploration

Anderson, Greg; Verma, Abhinav; Dillig, Isil; Chaudhuri, Swarat (January 2021, 34th Conference on Neural Information Processing Systems)

Full Text Available
Neurosymbolic Reinforcement Learning with Formally Verified Exploration

Anderson, Greg; Verma, Abhinav; Dillig, Isil; Chaudhuri, Swarat (October 2020, Neural Information Processing Systems)
We present Revel, a partially neural reinforcement learning (RL) framework for provably safe exploration in continuous state and action spaces. A key challenge for provably safe deep RL is that repeatedly verifying neural networks within a learning loop is computationally infeasible. We address this challenge using two policy classes: a general, neurosymbolic class with approximate gradients and a more restricted class of symbolic policies that allows efficient verification. Our learning algorithm is a mirror descent over policies: in each iteration, it safely lifts a symbolic policy into the neurosymbolic space, performs safe gradient updates to the resulting policy, and projects the updated policy into the safe symbolic subset, all without requiring explicit verification of neural networks. Our empirical results show that Revel enforces safe exploration in many scenarios in which Constrained Policy Optimization does not, and that it can discover policies that outperform those learned through prior approaches to verified exploration.
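The update-then-project pattern mentioned in the abstract can be illustrated with ordinary projected gradient descent, which is mirror descent with a Euclidean distance function. This is only a toy sketch and is not Revel's algorithm: the box-shaped "safe set", the quadratic objective, and the names project_to_box and update_then_project are all hypothetical, and no lifting or verification step is modeled.

```python
import numpy as np

def project_to_box(theta, low, high):
    """Euclidean projection onto the box [low, high] (a hypothetical stand-in for a safe set)."""
    return np.clip(theta, low, high)

def update_then_project(theta, grad_fn, low, high, step=0.1, iters=50):
    """Projected gradient descent: take an unconstrained step, then project back."""
    theta = project_to_box(np.asarray(theta, dtype=float), low, high)
    for _ in range(iters):
        theta = theta - step * grad_fn(theta)      # gradient update in the larger space
        theta = project_to_box(theta, low, high)   # return to the restricted set
    return theta

# Toy objective (assumption): minimize ||theta - target||^2 with target partly outside the box.
target = np.array([2.0, -0.5])
grad = lambda th: 2.0 * (th - target)
result = update_then_project(np.zeros(2), grad, low=-1.0, high=1.0)
```

In this toy run the first coordinate ends on the box boundary because the unconstrained optimum lies outside the box, while the second converges to its unconstrained optimum inside the box.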
Full Text Available
Neurosymbolic Reinforcement Learning with Formally Verified Exploration

Anderson, Greg; Verma, Abhinav; Dillig, Isil; Chaudhuri, Swarat (January 2020, Advances in neural information processing systems)
Full Text Available
Learning Differentiable Programs with Admissible Neural Heuristics

Shah, Ameesh; Zhan, Eric; Sun, Jennifer; Verma, Abhinav; Yue, Yisong; Chaudhuri, Swarat (January 2020, Advances in neural information processing systems)
Full Text Available
Control Regularization for Reduced Variance Reinforcement Learning

Cheng, Richard; Verma, Abhinav; Orosz, Gabor; Chaudhuri, Swarat; Yue, Yisong; Burdick, Joel W (June 2019, Thirty-sixth International Conference on Machine Learning)
Full Text Available
Control Regularization for Reduced Variance Reinforcement Learning

Cheng, Richard; Verma, Abhinav; Orosz, Gabor; Chaudhuri, Swarat; Yue, Yisong; Burdick, Joel (January 2019, Proceedings of Machine Learning Research)

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a control prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the prior policy has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a wide range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
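One simple way to picture regularizing a learned policy toward a control prior is to blend the two actions with a weight lam. The sketch below assumes that blending rule; the abstract does not state it, and the names mixed_action, u_learned, u_prior, and lam are hypothetical. The blend is the minimizer of ||u - u_learned||^2 + lam * ||u - u_prior||^2, so lam = 0 returns the learned action and large lam approaches the prior.

```python
import numpy as np

def mixed_action(u_learned, u_prior, lam):
    """Blend a learned action with a control-prior action.

    Returns the minimizer of ||u - u_learned||^2 + lam * ||u - u_prior||^2.
    lam is a hypothetical regularization weight and must be non-negative.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    u_learned = np.asarray(u_learned, dtype=float)
    u_prior = np.asarray(u_prior, dtype=float)
    return (u_learned + lam * u_prior) / (1.0 + lam)

# Illustrative values only (not from the paper).
u_rl = np.array([1.0])
u_ctrl = np.array([3.0])
no_reg = mixed_action(u_rl, u_ctrl, 0.0)      # learned action only
balanced = mixed_action(u_rl, u_ctrl, 1.0)    # midpoint of the two actions
strong_reg = mixed_action(u_rl, u_ctrl, 1e9)  # essentially the prior action
```

Under this assumed rule, raising lam pulls behavior toward the prior, which would trade lower run-to-run variance for more bias from the prior, in line with the bias-variance trade-off the abstract describes.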
Full Text Available
