Title: Vulnerability Exploration of Safe Reinforcement Learning in Cyber-Physical Systems via STL Mining
Safe Reinforcement Learning (safe RL) has been widely used in safety-critical cyber-physical systems (CPS) to achieve task goals while satisfying safety constraints. Analyzing vulnerabilities that can be exploited to violate safety (i.e., safety-violated vulnerabilities) is crucial for understanding and improving the robustness of safe RL policies in CPS. However, existing works are inadequate for addressing such vulnerabilities, as they either focus on vulnerabilities that merely degrade task performance (rather than causing safety violations) or rely on strong assumptions about an adversary’s capability (e.g., requiring explicit knowledge of the safety constraints). This paper aims to bridge this gap by studying safety-violated vulnerabilities of safe RL in CPS without requiring prior knowledge of the underlying safety constraints. To this end, we propose a novel adversarial framework based on Signal Temporal Logic (STL) mining. The framework first mines STL formulas to uncover the implicit safety constraints of a safe RL policy, and then synthesizes perturbation attacks that violate these constraints. The generated attacks can effectively and efficiently induce safety violations by adapting perturbations and identifying critical time intervals for applying them. We conduct extensive experiments across multiple CPS environments, and the results demonstrate the effectiveness and efficiency of our method.
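The mine-then-attack idea in the abstract can be sketched minimally in Python. The `G(x > c)` formula template, the trace data, and the thresholding rule below are illustrative assumptions for a single signal dimension, not the paper's actual STL mining algorithm:

```python
# Hedged sketch: mine a simple STL "always (x > c)" safety threshold from
# traces of a safe policy, then check a perturbed trace against it.

def robustness_always_gt(trace, threshold):
    """Robustness of G(x > threshold): min over time of (x_t - threshold).
    Positive means satisfied; negative means violated."""
    return min(x - threshold for x in trace)

def mine_threshold(safe_traces, margin=0.0):
    """Tightest threshold c such that every safe trace satisfies G(x > c)."""
    return min(min(trace) for trace in safe_traces) - margin

safe_traces = [
    [1.0, 0.9, 1.2, 1.1],
    [0.8, 1.0, 0.95, 1.3],
]
c = mine_threshold(safe_traces)        # tightest constraint consistent with the data
perturbed = [1.0, 0.75, 1.1, 1.2]      # an observation perturbation dips below c

assert robustness_always_gt(safe_traces[0], c) >= 0   # safe trace satisfies mined formula
assert robustness_always_gt(perturbed, c) < 0         # perturbed trace violates it
```

The negative robustness value also indicates *where* the violation occurs, which is the kind of signal an attacker can use to target critical time intervals.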
Award ID(s):
2442914 2333980
PAR ID:
10670646
Publisher / Repository:
ACM
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Safe reinforcement learning (safe RL) has been applied to synthesize control policies that maximize task rewards while adhering to safety constraints within simulated secure cyber-physical systems. However, the vulnerability of safe RL to adversarial attacks remains largely unexplored. We argue that understanding the safety vulnerabilities of learned control policies is crucial for ensuring true safety in real-world scenarios. To address this gap, we first formally define the safe RL problem using the formal language Signal Temporal Logic (STL), and demonstrate that even optimal policies are susceptible to observation perturbations. We then introduce novel safety violation attacks that exploit adversarial models trained with reversed safety constraints to induce unsafe behaviors. Lastly, through both theoretical analysis and experimental results, we demonstrate that our approach is more effective at violating safety constraints than existing adversarial RL methods, which primarily focus on reducing task rewards rather than compromising safety.
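The "reversed safety constraint" idea above can be illustrated with a toy sketch: the adversary's objective is the victim's safety cost, so maximizing the adversary's return drives the system toward unsafe states. The dynamics, cost function, and exhaustive search below are illustrative stand-ins, not the trained adversarial model from the papers:

```python
# Hedged sketch of a reversed-safety-constraint adversary on a toy system.

def safety_cost(state):
    # Assumed safety constraint: |state| <= 0.25; cost is the violation amount.
    return max(0.0, abs(state) - 0.25)

def victim_policy(obs):
    # Toy stabilizing victim policy: drive the observed state toward zero.
    return -0.5 * obs

def rollout_cost(perturbations):
    """Accumulated safety cost when the victim acts on perturbed observations."""
    state, total = 0.0, 0.0
    for delta in perturbations:
        state += victim_policy(state + delta)   # bounded observation attack
        total += safety_cost(state)
    return total

# Reversed objective: the adversary maximizes, rather than minimizes, safety cost.
candidates = [[d] * 20 for d in (-0.3, -0.15, 0.0, 0.15, 0.3)]
attack = max(candidates, key=rollout_cost)

assert rollout_cost([0.0] * 20) == 0.0   # victim is safe without the attack
assert rollout_cost(attack) > 0.0        # reversed objective induces violations
```

Note the contrast with reward-focused adversarial RL: here the attack objective never references the task reward, only the (reversed) safety cost.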
  2. Cyber-physical systems (CPS) integrate sensing, control, computation, and networking with physical components and infrastructure connected through the Internet. Their autonomy and reliability have been enhanced by recent developments in safe reinforcement learning (safe RL). However, the vulnerability of safe RL to adversarial conditions has received minimal exploration. To truly ensure safety in physical-world applications, it is crucial to understand and address these potential safety weaknesses in learned control policies. In this work, we demonstrate a novel safety-violation attack that induces unsafe behaviors using adversarial models trained with reversed safety constraints. The experimental results show that the proposed method is more effective than existing works.
  3. Cyber-physical systems (CPS) are required to satisfy safety constraints in various application domains such as robotics, industrial manufacturing systems, and power systems. Faults and cyber attacks have been shown to cause safety violations, which can damage the system and endanger human lives. Resilient architectures have been proposed to ensure safety of CPS under such faults and attacks via methodologies including redundancy and restarting from safe operating conditions. The existing resilient architectures for CPS utilize different mechanisms to guarantee safety, and currently, there is no common framework to compare them. Moreover, the analysis and design undertaken for CPS employing one architecture is not readily extendable to another. In this article, we propose a timing-based framework for CPS employing various resilient architectures and develop a common methodology for safety analysis and computation of control policies and design parameters. Using the insight that the cyber subsystem operates in one out of a finite number of statuses, we first develop a hybrid system model that captures CPS adopting any of these architectures. Based on the hybrid system, we formulate the problem of joint computation of control policies and associated timing parameters for CPS to satisfy a given safety constraint and derive sufficient conditions for the solution. Utilizing the derived conditions, we provide an algorithm to compute control policies and timing parameters relevant to the employed architecture. We also note that our solution can be applied to a wide class of CPS with polynomial dynamics and also allows incorporation of new architectures. We verify our proposed framework by performing a case study on adaptive cruise control of vehicles. 
  4. Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. Model-based RL algorithms hold promise for reducing unsafe real-world actions: they may synthesize policies that obey all constraints using simulated samples from a learned model. However, imperfect models can result in real-world constraint violations even for actions that are predicted to satisfy all constraints. We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncertainty and adaptively exploiting it to balance the reward and cost objectives. First, CAP inflates predicted costs using an uncertainty-based penalty. Theoretically, we show that policies satisfying this conservative cost constraint are guaranteed to also be feasible in the true environment, which in turn guarantees the safety of all intermediate solutions during RL training. Second, CAP adaptively tunes this penalty during training using true cost feedback from the environment. We evaluate this conservative and adaptive penalty-based approach extensively on state- and image-based environments. Our results demonstrate substantial gains in sample efficiency while incurring fewer violations than prior safe RL algorithms. Code is available at: https://github.com/Redrew/CAP
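The two CAP mechanisms described above, conservative cost inflation and adaptive penalty tuning, can be sketched as follows. The functional forms, the update rule, and the numbers are illustrative assumptions, not CAP's actual implementation:

```python
# Hedged sketch of CAP-style conservative cost inflation with an adaptive penalty.

def conservative_cost(predicted_cost, uncertainty, kappa):
    """Inflate the model's predicted cost by an uncertainty-scaled penalty,
    so feasibility is judged against a pessimistic estimate."""
    return predicted_cost + kappa * uncertainty

def adapt_kappa(kappa, true_cost, cost_budget, lr=0.1):
    """Raise kappa when the true environment cost exceeds the budget;
    lower it (down to zero) when the policy is over-conservative."""
    return max(0.0, kappa + lr * (true_cost - cost_budget))

kappa = 1.0
# Feasibility checks during planning use the inflated, pessimistic cost.
assert abs(conservative_cost(0.4, 0.2, kappa) - 0.6) < 1e-9

# True cost 0.9 exceeds the budget 0.5, so the penalty grows (more conservative).
kappa = adapt_kappa(kappa, true_cost=0.9, cost_budget=0.5)
assert kappa > 1.0
```

The key design choice is that the penalty is driven by model uncertainty, so the pessimism shrinks automatically as the learned model improves.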
  5. Cyber-Physical Systems (CPS) have been increasingly subject to cyber-attacks, including code injection attacks. Zero-day attacks further exacerbate the threat landscape by requiring a shift to defense-in-depth approaches. Given the tightly coupled nature of cyber components with the physical domain, these attacks have the potential to cause significant damage if safety-critical applications such as automobiles are compromised. Moving target defense techniques such as instruction set randomization (ISR) have been commonly proposed to address these types of attacks. However, under current implementations an attack can result in a system crash, which is unacceptable in CPS. As such, CPS necessitate proper control reconfiguration mechanisms to prevent a loss of availability in system operation. This paper addresses the problem of maintaining system and security properties of a CPS under attack by integrating ISR, detection, and recovery capabilities that ensure safe, reliable, and predictable system operation. Specifically, we consider the problem of detecting code injection attacks and reconfiguring the controller in real time. The developed framework is demonstrated with an autonomous vehicle case study.