Title: Vulnerability Analysis for Safe Reinforcement Learning in Cyber-Physical Systems
Safe reinforcement learning (safe RL) has been applied to synthesize control policies that maximize task rewards while adhering to safety constraints in simulated secure cyber-physical systems. However, the vulnerability of safe RL to adversarial attacks remains largely unexplored. We argue that understanding the safety vulnerabilities of learned control policies is crucial for ensuring true safety in real-world scenarios. To address this gap, we first formally define the safe RL problem using a formal specification language, Signal Temporal Logic (STL), and demonstrate that even optimal policies are susceptible to observation perturbations. We then introduce novel safety-violation attacks that exploit adversarial models trained with reversed safety constraints to induce unsafe behaviors. Lastly, through both theoretical analysis and experimental results, we demonstrate that our approach is more effective at violating safety constraints than existing adversarial RL methods, which primarily focus on reducing task rewards rather than compromising safety.
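As a minimal illustration of the STL framing (the formula template, trace, and perturbation below are hypothetical, not the paper's benchmarks): safety is written as a temporal-logic formula over the state signal, and its quantitative robustness is positive exactly when the trace satisfies the formula. A policy whose traces sit close to the constraint boundary can then be flipped into violation by a small observation perturbation.

```python
# A minimal sketch, assuming the safety constraint is the STL formula
# G (|x_t| <= d), i.e., "the state always stays within the bound d".
import numpy as np

def robustness_always_bounded(signal, d):
    """Quantitative robustness of G (|x_t| <= d): positive iff the
    trace is safe; its magnitude is the margin to violation."""
    return float(np.min(d - np.abs(signal)))

# A trace that satisfies the constraint with only a small margin...
trace = np.array([0.2, 0.5, 0.93, 0.4])
print(robustness_always_bounded(trace, d=1.0))      # 0.07 > 0: safe

# ...is flipped into violation by a tiny observation perturbation.
perturbed = trace + np.array([0.0, 0.0, 0.1, 0.0])
print(robustness_always_bounded(perturbed, d=1.0))  # -0.03 < 0: unsafe
```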
Award ID(s): 2442914, 2333980
PAR ID: 10670644
Author(s) / Creator(s): ; ;
Publisher / Repository: ACM
Date Published:
Journal Name: ACM Transactions on Cyber-Physical Systems
ISSN: 2378-962X
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Safe Reinforcement Learning (safe RL) has been widely used in safety-critical cyber-physical systems (CPS) to achieve task goals while satisfying safety constraints. Analyzing vulnerabilities that can be exploited to violate safety (i.e., safety-violated vulnerabilities) is crucial for understanding and improving the robustness of safe RL policies in CPS. However, existing works are inadequate for addressing such vulnerabilities, as they either focus on vulnerabilities that merely degrade task performance (rather than causing safety violations) or rely on strong assumptions about an adversary’s capability (e.g., requiring explicit knowledge of the safety constraints). This paper aims to bridge this gap by studying safety-violated vulnerabilities of safe RL in CPS without requiring prior knowledge of the underlying safety constraints. To this end, we propose a novel adversarial framework based on Signal Temporal Logic (STL) mining. The framework first mines STL formulas to uncover the implicit safety constraints of a safe RL policy, and then synthesizes perturbation attacks that violate these constraints. The generated attacks can effectively and efficiently induce safety violations by adapting perturbations and identifying critical time intervals for applying them. We conduct extensive experiments across multiple CPS environments, and the results demonstrate the effectiveness and efficiency of our method. 
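A rough sketch of the two-stage idea in item 1, under the simplifying assumption that the implicit constraint fits the one-parameter template G (|x_t| <= d): first mine the tightest bound consistent with observed safe rollouts, then locate the critical interval where that mined constraint is closest to violation. The mining rule and window heuristic are illustrative assumptions, not the paper's mining algorithm.

```python
import numpy as np

def mine_bound(safe_traces, margin=0.05):
    """Mine the tightest bound d such that G (|x_t| <= d) holds on all
    observed safe rollouts (a one-parameter STL template)."""
    return max(float(np.abs(t).max()) for t in safe_traces) + margin

def critical_interval(trace, d, width=2):
    """Locate the window where the mined constraint is closest to being
    violated -- the natural place to concentrate perturbations."""
    start = int(np.argmin(d - np.abs(trace)))
    return start, min(start + width, len(trace))

rollouts = [np.array([0.1, 0.6, 0.8, 0.3]), np.array([0.2, 0.7, 0.5, 0.1])]
d = mine_bound(rollouts)                  # inferred bound: 0.85
print(critical_interval(rollouts[0], d))  # attack window: (2, 4)
```

A perturbation search (gradient-based or random) restricted to that interval would then drive the robustness of the mined formula below zero.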
  2. Cyber-Physical Systems (CPS) integrate sensing, control, computation, and networking with physical components and infrastructure connected through the internet. Their autonomy and reliability have been enhanced by recent developments in safe reinforcement learning (safe RL). However, the vulnerability of safe RL to adversarial conditions has received minimal exploration. To truly ensure safety in physical-world applications, it is crucial to understand and address these potential safety weaknesses in learned control policies. In this work, we demonstrate a novel safety-violation attack that induces unsafe behaviors via adversarial models trained with reversed safety constraints. The experimental results show that the proposed method is more effective than existing approaches.
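A minimal sketch of the "reversed safety constraints" idea as item 2 states it: the adversary is trained with the victim's safety cost as its reward, so maximizing adversarial return directly pushes the system toward unsafe states. The wrapper interface and weighting below are assumptions for illustration, not the paper's training setup.

```python
class ReversedConstraintReward:
    """Reward shaping for the adversarial model: the safety cost the
    victim was trained to minimize becomes the adversary's reward."""

    def __init__(self, weight=1.0):
        self.weight = weight

    def __call__(self, task_reward, safety_cost):
        # The task reward is ignored: the adversary only cares about
        # violating the (reversed) safety constraint.
        return self.weight * safety_cost
```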
  3. Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. Model-based RL algorithms hold promise for reducing unsafe real-world actions: they may synthesize policies that obey all constraints using simulated samples from a learned model. However, imperfect models can result in real-world constraint violations even for actions that are predicted to satisfy all constraints. We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncertainty and adaptively exploiting it to balance the reward and the cost objectives. First, CAP inflates predicted costs using an uncertainty-based penalty. Theoretically, we show that policies that satisfy this conservative cost constraint are guaranteed to also be feasible in the true environment, and that this in turn guarantees the safety of all intermediate solutions during RL training. Second, CAP adaptively tunes this penalty during training using true cost feedback from the environment. We evaluate this conservative and adaptive penalty-based approach for model-based safe RL extensively on state- and image-based environments. Our results demonstrate substantial gains in sample efficiency while incurring fewer violations than prior safe RL algorithms. Code is available at: https://github.com/Redrew/CAP
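A minimal sketch of the two mechanisms the CAP abstract describes, assuming scalar costs; the function names and the exact update rule are illustrative assumptions, not the released implementation at the linked repository.

```python
def conservative_cost(predicted_cost, model_uncertainty, kappa):
    """Pessimistic cost: the model's prediction inflated by an
    uncertainty-based penalty, so planning errs on the safe side."""
    return predicted_cost + kappa * model_uncertainty

def adapt_kappa(kappa, true_episode_cost, cost_budget, lr=0.1):
    """Tighten the penalty after real violations, relax it when the
    true cost feedback from the environment stays within budget."""
    return max(0.0, kappa + lr * (true_episode_cost - cost_budget))
```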
  4. While robust optimal control theory provides a rigorous framework to compute robot control policies that are provably safe, it struggles to scale to high-dimensional problems, leading to increased use of deep learning for tractable synthesis of robot safety. Unfortunately, existing neural safety synthesis methods often lack convergence guarantees and solution interpretability. In this paper, we present Minimax Actors Guided by Implicit Critic Stackelberg (MAGICS), a novel adversarial reinforcement learning (RL) algorithm that guarantees local convergence to a minimax equilibrium solution. We then build on this approach to provide local convergence guarantees for a general deep RL-based robot safety synthesis algorithm. Through both simulation studies on OpenAI Gym environments and hardware experiments with a 36-dimensional quadruped robot, we show that MAGICS can yield robust control policies outperforming the state-of-the-art neural safety synthesis methods.
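The minimax structure MAGICS builds on can be illustrated with a toy gradient descent-ascent loop: the controller descends a safety loss that the adversarial disturbance simultaneously ascends. The quadratic objective below is a hypothetical stand-in, chosen only so that the iterates visibly approach the (0, 0) equilibrium; it is not the MAGICS algorithm.

```python
# Toy minimax objective f(u, d) = u*d + 0.1*u**2 - 0.1*d**2:
# strongly convex in the controller u, strongly concave in the
# adversarial disturbance d, with bilinear coupling.
def grad_u(u, d): return d + 0.2 * u   # df/du
def grad_d(u, d): return u - 0.2 * d   # df/dd

u, d, lr = 1.0, -1.0, 0.05
for _ in range(500):
    # Simultaneous descent (controller) / ascent (adversary) step.
    u, d = u - lr * grad_u(u, d), d + lr * grad_d(u, d)
print(round(u, 3), round(d, 3))  # both near 0: the minimax equilibrium
```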
  5. Matni, Nikolai; Morari, Manfred; Pappas, George J. (Ed.)
    Safe reinforcement learning (RL) with assured satisfaction of hard state constraints during training has recently received a lot of attention. Safety filters, e.g., based on control barrier functions (CBFs), provide a promising way for safe RL by modifying the unsafe actions of an RL agent on the fly. Existing safety filter-based approaches typically involve learning of uncertain dynamics and quantifying the learned model error, which leads to conservative filters before a large amount of data is collected to learn a good model, thereby preventing efficient exploration. This paper presents a method for safe and efficient RL using disturbance observers (DOBs) and CBFs. Unlike most existing safe RL methods that deal with hard state constraints, our method does not involve model learning, and leverages DOBs to accurately estimate the pointwise value of the uncertainty, which is then incorporated into a robust CBF condition to generate safe actions. The DOB-based CBF can be used as a safety filter with model-free RL algorithms by minimally modifying the actions of an RL agent whenever necessary to ensure safety throughout the learning process. Simulation results on a unicycle and a 2D quadrotor demonstrate that the proposed method outperforms a state-of-the-art safe RL algorithm using CBFs and Gaussian process-based model learning, in terms of safety violation rate, and sample and computational efficiency.
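A minimal sketch of the safety-filter step item 5 describes, for a single control-affine CBF constraint where the minimal-modification QP has a closed-form projection. The dynamics terms, the bound d_hat_bound standing in for the disturbance-observer output, and the gain alpha are hypothetical placeholders, not the paper's formulation.

```python
import numpy as np

def cbf_safety_filter(u_rl, Lf_h, Lg_h, h, d_hat_bound, alpha=1.0):
    """Minimally modify the RL action:  min ||u - u_rl||^2  subject to
    Lf_h + Lg_h @ u - d_hat_bound + alpha * h >= 0  (robust CBF)."""
    a = np.asarray(Lg_h, dtype=float)
    slack = a @ u_rl + Lf_h - d_hat_bound + alpha * h
    if slack >= 0.0:
        return u_rl                      # action already safe: pass through
    return u_rl - (slack / (a @ a)) * a  # project onto constraint boundary

u_safe = cbf_safety_filter(u_rl=np.array([1.0, 0.0]), Lf_h=-1.5,
                           Lg_h=[1.0, 0.5], h=0.2, d_hat_bound=0.1)
print(u_safe)  # [1.32, 0.16]: the smallest change that restores safety
```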