Title: Automated Adversary-in-the-Loop Cyber-Physical Defense Planning
Security of cyber-physical systems (CPS) continues to pose new challenges due to the tight integration and operational complexity of the cyber and physical components. To address these challenges, this article presents a domain-aware, optimization-based approach to determine an effective defense strategy for CPS in an automated fashion by emulating a strategic adversary in the loop that exploits system vulnerabilities, interconnection of the CPS, and the dynamics of the physical components. Our approach builds on an adversarial decision-making model based on a Markov Decision Process (MDP) that determines the optimal cyber (discrete) and physical (continuous) attack actions over a CPS attack graph. The defense planning problem is modeled as a non-zero-sum game between the adversary and defender. We use a model-free reinforcement learning method to solve the adversary’s problem as a function of the defense strategy. We then employ Bayesian optimization (BO) to find an approximate best response for the defender to harden the network against the resulting adversary policy. This process is iterated multiple times to improve the strategy for both players. We demonstrate the effectiveness of our approach on a ransomware-inspired graph with a smart building system as the physical process. Numerical studies show that our method converges to a Nash equilibrium for various defender-specific costs of network hardening.
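The alternating loop described in the abstract (learn an adversary policy for a fixed defense, then compute a defender best response to that policy, and repeat) can be illustrated with a minimal Python sketch. It is not the article's implementation. The five-edge attack graph, all probabilities, rewards, and costs, and the function names are hypothetical. The sketch uses only discrete cyber actions, assumes a failed exploit ends the episode, and replaces Bayesian optimization with exhaustive search over the 32 hardening choices so that it runs with the standard library alone.

```python
import itertools
import random

# Hypothetical attack graph: node -> list of (next_node, edge_id).
GRAPH = {
    "entry": [("web", 0), ("mail", 1)],
    "web": [("ctrl", 2)],
    "mail": [("ctrl", 3)],
    "ctrl": [("hvac", 4)],
    "hvac": [],  # goal: the adversary reaches the physical process
}
BASE_SUCCESS = [0.9, 0.8, 0.7, 0.9, 0.8]  # assumed exploit success rates
HARDEN_COST = [2.0, 2.0, 1.5, 1.5, 1.0]   # assumed per-edge hardening costs
HARDEN_FACTOR = 0.2   # hardening scales an edge's success rate by this
GOAL_REWARD = 30.0    # assumed adversary payoff at the goal
STEP_COST = 1.0       # assumed adversary cost per exploit attempt
DAMAGE = 10.0         # assumed defender loss if the goal is reached


def success_prob(edge, defense):
    return BASE_SUCCESS[edge] * (HARDEN_FACTOR if defense[edge] else 1.0)


def q_learning(defense, episodes=2000, eps=0.2, seed=0):
    """Tabular Q-learning (gamma = 1, step size 1/n) for a fixed defense."""
    rng = random.Random(seed)
    q = {n: [0.0] * len(e) for n, e in GRAPH.items()}
    visits = {n: [0] * len(e) for n, e in GRAPH.items()}
    for _ in range(episodes):
        node = "entry"
        while GRAPH[node]:
            n_act = len(GRAPH[node])
            if rng.random() < eps:  # epsilon-greedy exploration
                a = rng.randrange(n_act)
            else:
                a = max(range(n_act), key=lambda i: q[node][i])
            nxt, edge = GRAPH[node][a]
            success = rng.random() < success_prob(edge, defense)
            target = -STEP_COST
            if success:
                target += max(q[nxt]) if GRAPH[nxt] else GOAL_REWARD
            visits[node][a] += 1
            q[node][a] += (target - q[node][a]) / visits[node][a]
            if not success:  # assumption: a failed exploit ends the episode
                break
            node = nxt
    # Greedy policy: chosen action index at every non-terminal node.
    return {n: max(range(len(q[n])), key=lambda i: q[n][i])
            for n in GRAPH if GRAPH[n]}


def defender_loss(policy, defense):
    """Expected damage under a fixed adversary policy plus hardening cost."""
    node, prob = "entry", 1.0
    while GRAPH[node]:
        node, edge = GRAPH[node][policy[node]]
        prob *= success_prob(edge, defense)
    cost = sum(c for c, hardened in zip(HARDEN_COST, defense) if hardened)
    return DAMAGE * prob + cost


def best_response(policy):
    """Exhaustive search over hardening choices (stand-in for BO)."""
    candidates = itertools.product((0, 1), repeat=len(BASE_SUCCESS))
    return min(candidates, key=lambda d: defender_loss(policy, d))


def plan_defense(max_rounds=10):
    defense = (0,) * len(BASE_SUCCESS)
    policy = None
    for rnd in range(max_rounds):
        policy = q_learning(defense, seed=rnd)
        new_defense = best_response(policy)
        if new_defense == defense:  # neither player wants to deviate
            break
        defense = new_defense
    return defense, policy


if __name__ == "__main__":
    print(plan_defense())
```

In this toy instance the loop stops after two rounds with only the choke-point edge (ctrl to hvac) hardened, because that choice is the defender's best response to either entry path. The defender's loss (expected damage plus hardening cost) is not the negative of the adversary's return, which keeps the toy game non-zero-sum, as in the abstract.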
Award ID(s):
2134076
PAR ID:
10487747
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
Journal Name:
ACM Transactions on Cyber-Physical Systems
Volume:
7
Issue:
3
ISSN:
2378-962X
Page Range / eLocation ID:
1 to 25
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Securing cyber-physical systems (CPS) like the Smart Grid against cyber attacks is making it imperative for the system defenders to plan for investing in the cybersecurity resources of cyber-physical critical infrastructure. Given the constraint of limited resources that can be invested in the cyber layer of the cyber-physical smart grid, optimal allocation of these resources has become a priority for the defenders of the grid. This paper proposes a methodology for optimizing the allocation of resources for the cybersecurity infrastructure in a smart grid using attack-defense trees and game theory. The proposed methodology uses attack-defense trees (ADTs) for analyzing the cyber-attack paths (attacker strategies) within the grid and possible defense strategies to prevent those attacks. The attack-defense strategy space (ADSS) provides a comprehensive list of interactions between the attacker and the defender of the grid. The proposed methodology uses the ADSS from the ADT analysis for a game-theoretic formulation (GTF) of attacker-defender interaction. The GTF allows us to obtain strategies for the defender in order to optimize cybersecurity resource allocation in the smart grid. The implementation of the proposed methodology is validated using a synthetic smart grid model equipped with cyber and physical components depicting the feasibility of the methodology for real-world implementation. 
  2. Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary’s uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system as well as the adversary’s observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender’s actions. In this paper, we propose a multi-agent partially-observable Markov Decision Process model of MTD and formulate a two-player general-sum game between the adversary and the defender. To solve this game, we propose a multi-agent reinforcement learning framework based on the double oracle algorithm. Finally, we provide experimental results to demonstrate the effectiveness of our framework in finding optimal policies.
  3. The increasing penetration of cyber systems into smart grids has made these grids more vulnerable to cyber-physical attacks. The central challenge of higher-order cyber-physical contingency analysis is the exponential blow-up of the attack surface due to a large number of attack vectors. This gives rise to computational challenges in devising efficient attack mitigation strategies. However, a system operator can leverage private information about the underlying network to maintain a strategic advantage over an adversary equipped with superior computational capability and situational awareness. In this work, we examine the following scenario: A malicious entity intrudes into the cyber-layer of a power network and trips the transmission lines. The objective of the system operator is to deploy security measures in the cyber-layer to minimize the impact of such attacks. Due to budget constraints, the attacker and the system operator have limits on the maximum number of transmission lines they can attack or defend. We model this adversarial interaction as a resource-constrained attacker-defender game. The computational intractability of solving large security games is well known. However, we exploit the approximately modular behavior of an impact metric known as the disturbance value to arrive at a linear-time algorithm for computing an optimal defense strategy. We validate the efficacy of the proposed strategy against attackers of various capabilities and provide an algorithm for a real-time implementation.
  4. This paper studies the satisfaction of a class of temporal properties for cyber-physical systems (CPSs) over a finite-time horizon in the presence of an adversary, in an environment described by discrete-time dynamics. The temporal logic specification is given in safe-LTL_F, a fragment of linear temporal logic over traces of finite length. The interaction of the CPS with the adversary is modeled as a two-player zero-sum discrete-time dynamic stochastic game with the CPS as defender. We formulate a dynamic-programming-based approach to determine a stationary defender policy that maximizes the probability of satisfaction of a safe-LTL_F formula over a finite time horizon under any stationary adversary policy. We introduce secure control barrier certificates (S-CBCs), a generalization of barrier certificates and control barrier certificates that accounts for the presence of an adversary, and use S-CBCs to provide a lower bound on the above satisfaction probability. When the dynamics of the evolution of the system state have a specific underlying structure, we present a way to determine an S-CBC as a polynomial in the state variables using sum-of-squares optimization. An illustrative example demonstrates our approach.
  5. Growing multi-stage attacks in computer networks impose significant security risks and necessitate the development of effective defense schemes that are able to autonomously respond to intrusions during vulnerability windows. However, the defender faces several real-world challenges, e.g., unknown likelihoods and unknown impacts of successful exploits. In this article, we leverage reinforcement learning to develop an innovative adaptive cyber defense to maximize the cost-effectiveness subject to the aforementioned challenges. In particular, we use Bayesian attack graphs to model the interactions between the attacker and networks. Then we formulate the defense problem of interest as a partially observable Markov decision process problem where the defender maintains belief states to estimate system states, leverages Thompson sampling to estimate transition probabilities, and utilizes reinforcement learning to choose optimal defense actions using measured utility values. The algorithm performance is verified via numerical simulations based on real-world attacks.