Title: Vulnerability Analysis for Safe Reinforcement Learning in Cyber-Physical Systems
Safe reinforcement learning (safe RL) has been applied to synthesize control policies that maximize task rewards while adhering to safety constraints in simulated secure cyber-physical systems. However, the vulnerability of safe RL to adversarial attacks remains largely unexplored. We argue that understanding the safety vulnerabilities of learned control policies is crucial for ensuring true safety in real-world scenarios. To address this gap, we first formally define the safe RL problem using a formal specification language, Signal Temporal Logic (STL), and demonstrate that even optimal policies are susceptible to observation perturbations. We then introduce novel safety-violation attacks that exploit adversarial models trained with reversed safety constraints to induce unsafe behaviors. Lastly, through both theoretical analysis and experimental results, we demonstrate that our approach is more effective at violating safety constraints than existing adversarial RL methods, which primarily focus on reducing task rewards rather than compromising safety.
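As a minimal illustration of the STL framing (the formula template, trace, and perturbation below are hypothetical, not the paper's benchmarks): safety is written as a temporal-logic formula over the state signal, and its quantitative robustness is positive exactly when the trace satisfies the formula. A policy whose traces sit close to the constraint boundary can then be flipped into violation by a small observation perturbation.

```python
# A minimal sketch, assuming the safety constraint is the STL formula
# G (|x_t| <= d), i.e., "the state always stays within the bound d".
import numpy as np

def robustness_always_bounded(signal, d):
    """Quantitative robustness of G (|x_t| <= d): positive iff the
    trace is safe; its magnitude is the margin to violation."""
    return float(np.min(d - np.abs(signal)))

# A trace that satisfies the constraint with only a small margin...
trace = np.array([0.2, 0.5, 0.93, 0.4])
print(robustness_always_bounded(trace, d=1.0))      # 0.07 > 0: safe

# ...is flipped into violation by a tiny observation perturbation.
perturbed = trace + np.array([0.0, 0.0, 0.1, 0.0])
print(robustness_always_bounded(perturbed, d=1.0))  # -0.03 < 0: unsafe
```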
Award ID(s): 2442914, 2333980
PAR ID: 10670644
Author(s) / Creator(s): ; ;
Publisher / Repository: ACM
Date Published:
Journal Name: ACM Transactions on Cyber-Physical Systems
ISSN: 2378-962X
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Safe Reinforcement Learning (safe RL) has been widely used in safety-critical cyber-physical systems (CPS) to achieve task goals while satisfying safety constraints. Analyzing vulnerabilities that can be exploited to violate safety (i.e., safety-violated vulnerabilities) is crucial for understanding and improving the robustness of safe RL policies in CPS. However, existing works are inadequate for addressing such vulnerabilities, as they either focus on vulnerabilities that merely degrade task performance (rather than causing safety violations) or rely on strong assumptions about an adversary’s capability (e.g., requiring explicit knowledge of the safety constraints). This paper aims to bridge this gap by studying safety-violated vulnerabilities of safe RL in CPS without requiring prior knowledge of the underlying safety constraints. To this end, we propose a novel adversarial framework based on Signal Temporal Logic (STL) mining. The framework first mines STL formulas to uncover the implicit safety constraints of a safe RL policy, and then synthesizes perturbation attacks that violate these constraints. The generated attacks can effectively and efficiently induce safety violations by adapting perturbations and identifying critical time intervals for applying them. We conduct extensive experiments across multiple CPS environments, and the results demonstrate the effectiveness and efficiency of our method. 
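A rough sketch of the two-stage idea in item 1, under the simplifying assumption that the implicit constraint fits the one-parameter template G (|x_t| <= d): first mine the tightest bound consistent with observed safe rollouts, then locate the critical interval where that mined constraint is closest to violation. The mining rule and window heuristic are illustrative assumptions, not the paper's mining algorithm.

```python
import numpy as np

def mine_bound(safe_traces, margin=0.05):
    """Mine the tightest bound d such that G (|x_t| <= d) holds on all
    observed safe rollouts (a one-parameter STL template)."""
    return max(float(np.abs(t).max()) for t in safe_traces) + margin

def critical_interval(trace, d, width=2):
    """Locate the window where the mined constraint is closest to being
    violated -- the natural place to concentrate perturbations."""
    start = int(np.argmin(d - np.abs(trace)))
    return start, min(start + width, len(trace))

rollouts = [np.array([0.1, 0.6, 0.8, 0.3]), np.array([0.2, 0.7, 0.5, 0.1])]
d = mine_bound(rollouts)                  # inferred bound: 0.85
print(critical_interval(rollouts[0], d))  # attack window: (2, 4)
```

A perturbation search (gradient-based or random) restricted to that interval would then drive the robustness of the mined formula below zero.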
  2. Cyber-Physical Systems (CPS) integrate sensing, control, computation, and networking with physical components and infrastructure connected through the internet. Their autonomy and reliability have been enhanced by recent developments in safe reinforcement learning (safe RL). However, the vulnerability of safe RL to adversarial conditions has received minimal exploration. To truly ensure safety in physical-world applications, it is crucial to understand and address these potential safety weaknesses in learned control policies. In this work, we demonstrate a novel safety-violation attack that induces unsafe behaviors via adversarial models trained with reversed safety constraints. The experimental results show that the proposed method is more effective than existing approaches.
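A minimal sketch of the "reversed safety constraints" idea as item 2 states it: the adversary is trained with the victim's safety cost as its reward, so maximizing adversarial return directly pushes the system toward unsafe states. The wrapper interface and weighting below are assumptions for illustration, not the paper's training setup.

```python
class ReversedConstraintReward:
    """Reward shaping for the adversarial model: the safety cost the
    victim was trained to minimize becomes the adversary's reward."""

    def __init__(self, weight=1.0):
        self.weight = weight

    def __call__(self, task_reward, safety_cost):
        # The task reward is ignored: the adversary only cares about
        # violating the (reversed) safety constraint.
        return self.weight * safety_cost
```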
  3. Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. Model-based RL algorithms hold promise for reducing unsafe real-world actions: they may synthesize policies that obey all constraints using simulated samples from a learned model. However, imperfect models can result in real-world constraint violations even for actions that are predicted to satisfy all constraints. We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncertainty and adaptively exploiting it to balance the reward and the cost objectives. First, CAP inflates predicted costs using an uncertainty-based penalty. Theoretically, we show that policies that satisfy this conservative cost constraint are guaranteed to also be feasible in the true environment, and that this in turn guarantees the safety of all intermediate solutions during RL training. Second, CAP adaptively tunes this penalty during training using true cost feedback from the environment. We evaluate this conservative and adaptive penalty-based approach for model-based safe RL extensively on state- and image-based environments. Our results demonstrate substantial gains in sample efficiency while incurring fewer violations than prior safe RL algorithms. Code is available at: https://github.com/Redrew/CAP
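A minimal sketch of the two mechanisms the CAP abstract describes, assuming scalar costs; the function names and the exact update rule are illustrative assumptions, not the released implementation at the linked repository.

```python
def conservative_cost(predicted_cost, model_uncertainty, kappa):
    """Pessimistic cost: the model's prediction inflated by an
    uncertainty-based penalty, so planning errs on the safe side."""
    return predicted_cost + kappa * model_uncertainty

def adapt_kappa(kappa, true_episode_cost, cost_budget, lr=0.1):
    """Tighten the penalty after real violations, relax it when the
    true cost feedback from the environment stays within budget."""
    return max(0.0, kappa + lr * (true_episode_cost - cost_budget))
```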
  4. While robust optimal control theory provides a rigorous framework to compute robot control policies that are provably safe, it struggles to scale to high-dimensional problems, leading to increased use of deep learning for tractable synthesis of robot safety. Unfortunately, existing neural safety synthesis methods often lack convergence guarantees and solution interpretability. In this paper, we present Minimax Actors Guided by Implicit Critic Stackelberg (MAGICS), a novel adversarial reinforcement learning (RL) algorithm that guarantees local convergence to a minimax equilibrium solution. We then build on this approach to provide local convergence guarantees for a general deep RL-based robot safety synthesis algorithm. Through both simulation studies on OpenAI Gym environments and hardware experiments with a 36-dimensional quadruped robot, we show that MAGICS can yield robust control policies outperforming the state-of-the-art neural safety synthesis methods.
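The minimax structure MAGICS builds on can be illustrated with a toy gradient descent-ascent loop: the controller descends a safety loss that the adversarial disturbance simultaneously ascends. The quadratic objective below is a hypothetical stand-in, chosen only so that the iterates visibly approach the (0, 0) equilibrium; it is not the MAGICS algorithm.

```python
# Toy minimax objective f(u, d) = u*d + 0.1*u**2 - 0.1*d**2:
# strongly convex in the controller u, strongly concave in the
# adversarial disturbance d, with bilinear coupling.
def grad_u(u, d): return d + 0.2 * u   # df/du
def grad_d(u, d): return u - 0.2 * d   # df/dd

u, d, lr = 1.0, -1.0, 0.05
for _ in range(500):
    # Simultaneous descent (controller) / ascent (adversary) step.
    u, d = u - lr * grad_u(u, d), d + lr * grad_d(u, d)
print(round(u, 3), round(d, 3))  # both near 0: the minimax equilibrium
```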
  5. Matni, Nikolai; Morari, Manfred; Pappas, George J. (Ed.)
    Safe reinforcement learning (RL) with assured satisfaction of hard state constraints during training has recently received a lot of attention. Safety filters, e.g., based on control barrier functions (CBFs), provide a promising way for safe RL by modifying the unsafe actions of an RL agent on the fly. Existing safety filter-based approaches typically involve learning of uncertain dynamics and quantifying the learned model error, which leads to conservative filters before a large amount of data is collected to learn a good model, thereby preventing efficient exploration. This paper presents a method for safe and efficient RL using disturbance observers (DOBs) and CBFs. Unlike most existing safe RL methods that deal with hard state constraints, our method does not involve model learning, and leverages DOBs to accurately estimate the pointwise value of the uncertainty, which is then incorporated into a robust CBF condition to generate safe actions. The DOB-based CBF can be used as a safety filter with model-free RL algorithms by minimally modifying the actions of an RL agent whenever necessary to ensure safety throughout the learning process. Simulation results on a unicycle and a 2D quadrotor demonstrate that the proposed method outperforms a state-of-the-art safe RL algorithm using CBFs and Gaussian process-based model learning, in terms of safety violation rate, and sample and computational efficiency.
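A minimal sketch of the safety-filter step item 5 describes, for a single control-affine CBF constraint where the minimal-modification QP has a closed-form projection. The dynamics terms, the bound d_hat_bound standing in for the disturbance-observer output, and the gain alpha are hypothetical placeholders, not the paper's formulation.

```python
import numpy as np

def cbf_safety_filter(u_rl, Lf_h, Lg_h, h, d_hat_bound, alpha=1.0):
    """Minimally modify the RL action:  min ||u - u_rl||^2  subject to
    Lf_h + Lg_h @ u - d_hat_bound + alpha * h >= 0  (robust CBF)."""
    a = np.asarray(Lg_h, dtype=float)
    slack = a @ u_rl + Lf_h - d_hat_bound + alpha * h
    if slack >= 0.0:
        return u_rl                      # action already safe: pass through
    return u_rl - (slack / (a @ a)) * a  # project onto constraint boundary

u_safe = cbf_safety_filter(u_rl=np.array([1.0, 0.0]), Lf_h=-1.5,
                           Lg_h=[1.0, 0.5], h=0.2, d_hat_bound=0.1)
print(u_safe)  # [1.32, 0.16]: the smallest change that restores safety
```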