Title: Safe and Efficient Reinforcement Learning Using Disturbance-Observer-Based Control Barrier Functions
Safe reinforcement learning (RL) with assured satisfaction of hard state constraints during training has recently received significant attention. Safety filters, e.g., those based on control barrier functions (CBFs), provide a promising way to achieve safe RL by modifying the unsafe actions of an RL agent on the fly. Existing safety-filter-based approaches typically involve learning the uncertain dynamics and quantifying the learned model error, which leads to conservative filters until enough data has been collected to learn a good model, thereby preventing efficient exploration. This paper presents a method for safe and efficient RL using disturbance observers (DOBs) and control barrier functions (CBFs). Unlike most existing safe RL methods that deal with hard state constraints, our method does not involve model learning; instead, it leverages DOBs to accurately estimate the pointwise value of the uncertainty, which is then incorporated into a robust CBF condition to generate safe actions. The DOB-based CBF can be used as a safety filter with model-free RL algorithms, minimally modifying the actions of an RL agent whenever necessary to ensure safety throughout the learning process. Simulation results on a unicycle and a 2D quadrotor demonstrate that the proposed method outperforms a state-of-the-art safe RL algorithm using CBFs and Gaussian-process-based model learning in terms of safety violation rate and sample and computational efficiency.
Award ID(s): 2135925, 2133656
PAR ID: 10427348
Author(s) / Creator(s): ; ;
Date Published:
Journal Name: Proceedings of Machine Learning Research
Volume: 211
ISSN: 2640-3498
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Matni, Nikolai; Morari, Manfred; Pappas, George J. (Ed.)
  2. Control Barrier Functions (CBFs) are an effective methodology for ensuring safety and performance in real-time control applications such as power systems, resource allocation, autonomous vehicles, and robotics. This approach ensures safety independently of the high-level tasks that may have been pre-planned offline. For example, CBFs can be used to guarantee that a vehicle will remain in its lane. In the multi-agent setting, however, computation of CBFs can suffer from the curse of dimensionality as the number of agents grows. In this work, we present Mean-field Control Barrier Functions (MF-CBFs), which extend the CBF framework to the mean-field (or swarm control) setting. The core idea is to model a population of agents as a probability measure in the state space and build corresponding control barrier functions. As with traditional CBFs, we derive safety constraints on the (distributed) controls, now relying on differential calculus in the space of probability measures.
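The mean-field idea, a barrier defined on the population measure rather than on any individual state, can be roughly illustrated by approximating the measure with particles and enforcing a constraint on the empirical mean. The scalar dynamics, the barrier h(mu) = x_max - E[x], and the uniform-shift correction are all illustrative assumptions, not the MF-CBF construction of the paper.

```python
import numpy as np

def mean_field_cbf_filter(us_nom, xs, x_max=1.0, alpha=2.0):
    """Filter a population's nominal controls against a mean-field barrier.

    Particles xs approximate the population measure; each follows
    x_dot = u.  Barrier on the measure: h(mu) = x_max - mean(xs), so
    h_dot = -mean(us), and the CBF condition mean(us) <= alpha * h(mu)
    is a single constraint on the average control, distributed here by
    shifting every agent's control uniformly (an illustrative choice).
    """
    us_nom, xs = np.asarray(us_nom, float), np.asarray(xs, float)
    h = x_max - xs.mean()              # barrier on the empirical measure
    excess = us_nom.mean() - alpha * h
    return us_nom - excess if excess > 0 else us_nom
```

Only the average control is constrained, so how the correction is shared among agents is a design choice; a weighted split would work equally well.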
  3. In this study, we address the problem of safe control in systems subject to state and input constraints by integrating Control Barrier Functions (CBFs) into the Model Predictive Control (MPC) formulation. While CBFs offer a conservative policy and traditional MPC lacks a safety guarantee beyond the finite horizon, the proposed scheme takes advantage of both approaches to provide a guaranteed safe control policy with reduced conservatism and a shortened horizon. The methodology leverages the sum-of-squares (SOS) technique to construct CBFs whose forward-invariant safe sets are then used as a terminal constraint on the last predicted state. These CBF invariant sets cover the state space around system fixed points, and MPC connects these islands of forward-invariant sets to each other. To do this, we propose a technique to handle the MPC optimization problem subject to combinations of intersections and unions of constraints. Our approach, termed Model Predictive Control Barrier Functions (MPCBF), is validated on numerical examples, showing improved performance compared to classical MPC and CBF.
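The MPC-plus-CBF-terminal-constraint structure can be caricatured on a one-dimensional system: a short-horizon cost is minimized over a coarse control grid, and only sequences whose terminal state lands inside a CBF forward-invariant set are admissible. All specifics below (dynamics, grid, cost weights, h(x) = x_max - x) are illustrative assumptions, not the SOS-based construction in the work.

```python
import itertools

def mpc_with_cbf_terminal(x0, horizon=3, x_max=1.0, target=0.8):
    """Tiny MPC sketch with a CBF-set terminal constraint (brute force).

    Discrete dynamics x_{k+1} = x_k + u_k over a coarse control grid;
    sequences whose terminal state leaves the safe set
    {x : h(x) = x_max - x >= 0} are discarded, mimicking the terminal
    constraint that hands the state to a forward-invariant CBF set.
    """
    grid = (-0.2, -0.1, 0.0, 0.1, 0.2)
    best_seq, best_cost = None, float("inf")
    for seq in itertools.product(grid, repeat=horizon):
        x, cost = x0, 0.0
        for u in seq:
            x += u
            cost += (x - target) ** 2 + 0.01 * u ** 2
        if x_max - x >= 0 and cost < best_cost:  # terminal CBF constraint
            best_seq, best_cost = seq, cost
    return best_seq
```

In practice the grid search is replaced by a proper optimization program; the point is only where the CBF set enters the formulation.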
  4. Real-time controllers must satisfy strict safety requirements. Recently, Control Barrier Functions (CBFs) have been proposed that guarantee safety by ensuring that a suitably defined barrier function remains bounded for all time. The CBF method, however, has only been developed for deterministic systems and systems with worst-case disturbances and uncertainties. In this paper, we develop a CBF framework for safety of stochastic systems. We consider complete-information systems, in which the controller has access to the exact system state, as well as incomplete-information systems, where the state must be reconstructed from noisy measurements. In the complete-information case, we formulate a notion of barrier functions that leads to sufficient conditions for safety with probability 1. In the incomplete-information case, we formulate barrier functions that take an estimate from an extended Kalman filter as input, and derive bounds on the probability of safety as a function of the asymptotic error in the filter. We show that, in both cases, the sufficient conditions for safety can be mapped to linear constraints on the control input at each time, enabling the development of tractable optimization-based controllers that guarantee safety, performance, and stability. Our approach is evaluated via a simulation study of an adaptive cruise control system.
  5.
    Modern nonlinear control theory seeks to develop feedback controllers that endow systems with properties such as safety and stability. The guarantees ensured by these controllers often rely on accurate estimates of the system state for determining control actions. In practice, measurement model uncertainty can lead to errors in state estimates that degrade these guarantees. In this paper, we unify techniques from control theory and machine learning to synthesize controllers that achieve safety in the presence of measurement model uncertainty. We define the notion of a Measurement-Robust Control Barrier Function (MR-CBF) as a tool for determining safe control inputs under measurement model uncertainty. Furthermore, MR-CBFs are used to inform sampling methodologies for learning-based perception systems and to quantify the tolerable error in the resulting learned models. We demonstrate the efficacy of MR-CBFs in achieving safety with measurement model uncertainty on a simulated Segway system.
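The measurement-robust tightening can be sketched in the same toy scalar setting used above: the barrier is evaluated at the state estimate and discounted by a Lipschitz-constant-times-error-bound margin before the usual CBF bound on the input is applied. The dynamics, barrier, and constants below are illustrative assumptions, not the MR-CBF construction of the paper.

```python
def mr_cbf_filter(x_hat, u_nom, err_bound, x_max=1.0, alpha=2.0, lip_h=1.0):
    """Measurement-robust CBF sketch for x_dot = u with h(x) = x_max - x.

    If |x - x_hat| <= err_bound and h is Lipschitz with constant lip_h,
    then h(x) >= h(x_hat) - lip_h * err_bound; enforcing the CBF bound
    against this pessimistic value keeps the true (unmeasured) state
    safe.  One affine constraint -> closed-form clip of the input.
    """
    h_lb = (x_max - x_hat) - lip_h * err_bound  # worst-case barrier value
    return min(u_nom, alpha * h_lb)
```

Larger estimation error shrinks the admissible input set, which is exactly the trade-off MR-CBFs use to budget tolerable perception error.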