Most cybersecurity research focuses on either presenting a specific vulnerability or hacking technique, or proposing a specific defense algorithm against a well-defined attack scheme. Although such research is important, few studies have paid attention to the dynamic interactions between attackers and defenders, where both sides are intelligent and dynamically change their attack or defense strategies to gain the upper hand over their opponents. This 'cyberwar' phenomenon underlies most real-world cybersecurity incidents and warrants dedicated research and analysis. In this paper, we propose a dynamic game-theoretic framework (i.e., hyper defense) to analyze the interactions between the attacker and the defender as a non-cooperative security game. The key idea is to model attackers and defenders as having multiple levels of attack and defense strategies that differ in effectiveness, strategy cost, and attack gain or damage. Each player adjusts his strategy based on the strategy's cost, potential attack gain or damage, and effectiveness in anticipation of the opponent's strategy. We study the achievable Nash equilibrium for the attacker-defender security game, in which the players employ an efficient strategy according to the obtained equilibrium. Furthermore, we present case studies of three different types of network attacks and show how our hyper defense system can successfully model them. Simulation results show that the proposed game-theoretic system outperforms two other fixed-strategy defense systems.
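As an illustration of the kind of attacker-defender game such a framework reasons about (not the paper's hyper defense implementation; the strategy levels and payoff numbers below are hypothetical, with each strategy's cost already netted into its payoff), the following sketch enumerates the pure-strategy Nash equilibria of a small matrix game:

```python
# Hypothetical pure-strategy Nash equilibrium search for a small
# attacker-defender game; payoff values are illustrative only.
import numpy as np

# Rows: defender strategy levels (light/medium/heavy defense).
# Columns: attacker strategy levels (low/medium/high effort).
defender_payoff = np.array([[-4.0, -7.0, -9.0],
                            [-2.0, -3.0, -6.0],
                            [-3.0, -2.0, -2.5]])
attacker_payoff = np.array([[3.0, 6.0, 7.0],
                            [1.0, 2.0, 4.0],
                            [0.5, 1.0, 1.5]])

def pure_nash(def_pay, att_pay):
    """Return all (defender, attacker) strategy pairs that are mutual best responses."""
    equilibria = []
    for d in range(def_pay.shape[0]):
        for a in range(att_pay.shape[1]):
            defender_cannot_improve = def_pay[d, a] >= def_pay[:, a].max()
            attacker_cannot_improve = att_pay[d, a] >= att_pay[d, :].max()
            if defender_cannot_improve and attacker_cannot_improve:
                equilibria.append((d, a))
    return equilibria

print(pure_nash(defender_payoff, attacker_payoff))   # -> [(2, 2)] for these numbers
```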
QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game
A rise in Advanced Persistent Threats (APTs) has introduced a need for robustness against long-running, stealthy attacks that circumvent existing cryptographic security guarantees. FlipIt is a security game that models attacker-defender interactions in advanced scenarios such as APTs. Previous work extensively analyzed non-adaptive strategies in FlipIt, but adaptive strategies arise naturally in practical interactions as players receive feedback during the game. We model the FlipIt game as a Markov Decision Process and introduce QFlip, an adaptive strategy for FlipIt based on temporal difference reinforcement learning. We prove theoretical results on the convergence of our new strategy against an opponent playing with a Periodic strategy. We confirm our analysis experimentally by extensive evaluation of QFlip against specific opponents. QFlip converges to the optimal adaptive strategy for Periodic and Exponential opponents using associated state spaces. Finally, we introduce a generalized QFlip strategy with a composite state space that outperforms a Greedy strategy for several distributions, including Periodic and Uniform, without prior knowledge of the opponent's strategy. We also release an OpenAI Gym environment for FlipIt to facilitate future research.
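As a rough, hedged illustration of temporal-difference learning in a FlipIt-like setting (a simplified discrete-time stand-in, not the authors' QFlip code or their released Gym environment; all constants are made up), the sketch below runs tabular Q-learning against a stealthy Periodic opponent, with the state being the time since the opponent's last move as revealed whenever the agent flips:

```python
# Simplified discrete-time FlipIt-style loop with tabular Q-learning against a
# stealthy Periodic opponent. Not the authors' QFlip implementation or their
# released Gym environment; all constants are illustrative.
import random

TICKS, PERIOD, MOVE_COST = 200_000, 10, 4.0
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
MAX_STATE = 3 * PERIOD        # ticks since the opponent's last move known to the agent

Q = [[0.0, 0.0] for _ in range(MAX_STATE + 1)]   # actions: 0 = wait, 1 = flip

owner, last_opp_flip, state = "opponent", 0, 0
for t in range(1, TICKS):
    if t % PERIOD == 0:                      # opponent flips, unseen by the agent
        owner, last_opp_flip = "opponent", t
    greedy = max((0, 1), key=lambda a: Q[state][a])
    action = random.randrange(2) if random.random() < EPSILON else greedy
    reward = 0.0
    if action == 1:                          # flipping costs MOVE_COST, grabs control,
        owner = "agent"                      # and reveals the opponent's last move time
        reward -= MOVE_COST
        next_state = min(t - last_opp_flip, MAX_STATE)
    else:
        next_state = min(state + 1, MAX_STATE)
    reward += 1.0 if owner == "agent" else 0.0   # benefit accrues while in control
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state

# The greedy policy learned from Q tends toward "flip shortly after the
# opponent's expected move," the intuitive best response to a Periodic player.
```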
- Award ID(s): 1717634
- PAR ID: 10176641
- Date Published:
- Journal Name: Lecture Notes in Computer Science
- Volume: 11836
- ISSN: 0302-9743
- Page Range / eLocation ID: 364-384
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Guruswami, Venkatesan (Ed.)
Gameplay under various forms of uncertainty has been widely studied. Feldman et al. [Michal Feldman et al., 2010] studied a particularly low-information setting in which one observes the opponent's actions but no payoffs, not even one's own, and introduced an algorithm that guarantees one's payoff nonetheless approaches the minimax optimal value (i.e., zero) in a symmetric zero-sum game. Against an opponent playing a minimax-optimal strategy, approaching the value of the game is the best one can hope to guarantee. However, a wealth of research in behavioral economics shows that people often do not make perfectly rational, optimal decisions. Here we consider whether it is possible to actually win in this setting if the opponent is behaviorally biased. We model several deterministic, biased opponents and show that even without knowing the game matrix in advance or observing any payoffs, it is possible to take advantage of each bias in order to win nearly every round (so long as the game has the property that each action beats and is beaten by at least one other action). We also provide a partial characterization of the kinds of biased strategies that can be exploited to win nearly every round, and provide algorithms for beating some kinds of biased strategies even when we don't know which strategy the opponent uses.
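A toy sketch of the underlying idea follows, with one important simplification: the "beats" relation is assumed known here, whereas the cited work shows how to win even without the game matrix or any payoff feedback. The opponent's copy-our-last-move bias is hypothetical.

```python
# Toy rock-paper-scissors exploit of a deterministic, biased opponent using only
# observed actions (no payoff feedback). Unlike the paper's setting, the "beats"
# relation is assumed known; the copy-our-last-move bias is hypothetical.
from collections import Counter, defaultdict
import random

ACTIONS = ["rock", "paper", "scissors"]
BEATEN_BY = {"rock": "paper", "paper": "scissors", "scissors": "rock"}  # value beats key

def opponent(our_last):
    """Deterministic bias: copy whatever we played in the previous round."""
    return our_last if our_last is not None else "rock"

response_to = defaultdict(Counter)    # our last move -> what the opponent played next
our_last, wins = None, 0
for _ in range(1000):
    seen = response_to[our_last]
    if seen:                          # predict the opponent's next move from history
        predicted = seen.most_common(1)[0][0]
        our_move = BEATEN_BY[predicted]
    else:
        our_move = random.choice(ACTIONS)
    opp_move = opponent(our_last)
    response_to[our_last][opp_move] += 1
    if BEATEN_BY[opp_move] == our_move:
        wins += 1
    our_last = our_move

print(f"won {wins} of 1000 rounds")   # nearly every round after a brief learning phase
```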
-
Although security games have attracted intensive research attention over the past years, few existing works consider how information from local communities would affect the game. In this paper, we introduce a new player -- a strategic informant, who can observe and report upcoming attacks -- to the defender-attacker security game setting. Characterized by a private type, the informant has his own utility structure, which leads to his strategic behavior. We model the game as a 3-player extensive-form game and propose a novel solution concept of Strong Stackelberg-perfect Bayesian equilibrium. To compute the optimal defender strategy, we first show that although the informant can have infinitely many types in general, the optimal defense plan can only include a finite (exponential) number of different patrol strategies. We then prove that there exists a defense plan with only a linear number of patrol strategies that achieves the optimal defender utility, which significantly reduces the computational burden and allows us to solve the game in polynomial time using linear programming. Finally, we conduct extensive experiments to show the effect of the strategic informant and demonstrate the effectiveness of our algorithm.
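For context on how linear programs enter such computations, here is a hedged sketch of the classic multiple-LPs approach to a basic two-player Stackelberg security game without an informant; the per-target utilities and single defensive resource are illustrative, and this is not the paper's 3-player formulation.

```python
# Hedged sketch: "multiple LPs" computation of a Strong Stackelberg equilibrium
# for a basic two-player security game (no informant). Utilities are illustrative.
import numpy as np
from scipy.optimize import linprog

# Columns: [target covered, target uncovered] utilities.
Ud = np.array([[0.0, -5.0], [0.0, -8.0], [0.0, -3.0]])   # defender
Ua = np.array([[-2.0, 5.0], [-1.0, 8.0], [-3.0, 3.0]])   # attacker
n_targets, resources = len(Ud), 1

best_value, best_coverage = -np.inf, None
for t in range(n_targets):            # assume the attacker best-responds with target t
    # Defender utility at t is (Ud[t,0] - Ud[t,1]) * c_t + Ud[t,1]; linprog minimizes.
    objective = np.zeros(n_targets)
    objective[t] = -(Ud[t, 0] - Ud[t, 1])
    A_ub, b_ub = [], []
    for s in range(n_targets):        # attacker must weakly prefer t over every s
        if s == t:
            continue
        row = np.zeros(n_targets)
        row[t] = -(Ua[t, 0] - Ua[t, 1])
        row[s] = Ua[s, 0] - Ua[s, 1]
        A_ub.append(row)
        b_ub.append(Ua[t, 1] - Ua[s, 1])
    A_ub.append(np.ones(n_targets))   # limited defensive resources
    b_ub.append(resources)
    res = linprog(objective, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0.0, 1.0)] * n_targets)
    if res.success and Ud[t, 1] - res.fun > best_value:
        best_value, best_coverage = Ud[t, 1] - res.fun, res.x

print("defender value %.2f with coverage %s" % (best_value, np.round(best_coverage, 2)))
```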
-
Deep brain stimulation (DBS) is a commonly used treatment for medication-resistant Parkinson's disease and is an emerging treatment for other neurological disorders. More recently, phase-specific adaptive DBS (aDBS), whereby the application of stimulation is locked to a particular phase of tremor, has been proposed as a strategy to improve therapeutic efficacy and decrease side effects. In this work, in the context of these phase-specific aDBS strategies, we investigate the dynamical behavior of large populations of coupled neurons in response to near-periodic stimulation, namely, stimulation that is periodic except for a slowly changing amplitude and phase offset that can be used to coordinate the timing of applied input with a specified phase of model oscillations. Using an adaptive phase-amplitude reduction strategy, we illustrate that for a large population of oscillatory neurons, the temporal evolution of the associated phase distribution in response to near-periodic forcing can be captured using a reduced-order model with four state variables. Subsequently, we devise and validate a closed-loop control strategy to disrupt synchronization caused by coupling. Additionally, we identify strategies for implementing the proposed control strategy in situations where underlying model equations are unavailable by estimating the necessary terms of the reduced-order equations in real time from observables.
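As a loose illustration of the kind of population-level simulation involved (not the authors' adaptive phase-amplitude reduction or their four-state reduced-order model; all parameters are invented), the sketch below integrates a forced Kuramoto-style population under near-periodic stimulation and tracks the order parameter that quantifies synchrony:

```python
# Toy forced Kuramoto-style population under near-periodic stimulation (slowly
# varying amplitude and phase offset). Not the authors' reduced-order model or
# control strategy; all parameters are invented.
import numpy as np

rng = np.random.default_rng(0)
N, K, dt, steps = 500, 0.8, 1e-3, 50_000
omega = rng.normal(2 * np.pi * 6.0, 0.5, N)      # tremor-band natural frequencies (rad/s)
theta = rng.uniform(0, 2 * np.pi, N)
stim_freq = 2 * np.pi * 6.0

synchrony = []
for k in range(steps):
    t = k * dt
    amp = 1.0 + 0.3 * np.sin(2 * np.pi * 0.05 * t)   # slowly varying amplitude
    offset = 0.5 * np.sin(2 * np.pi * 0.02 * t)      # slowly drifting phase offset
    z = np.exp(1j * theta).mean()                    # Kuramoto order parameter
    coupling = K * np.abs(z) * np.sin(np.angle(z) - theta)
    stimulation = amp * np.sin(stim_freq * t + offset - theta)
    theta += dt * (omega + coupling + stimulation)
    if k % 500 == 0:
        synchrony.append(np.abs(z))                  # |z| near 1 means fully synchronized

print("order parameter: start %.2f, end %.2f" % (synchrony[0], synchrony[-1]))
```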
-
The increasing penetration of cyber systems into smart grids has made these grids more vulnerable to cyber-physical attacks. The central challenge of higher-order cyber-physical contingency analysis is the exponential blow-up of the attack surface due to the large number of attack vectors, which makes devising efficient attack mitigation strategies computationally challenging. However, a system operator can leverage private information about the underlying network to maintain a strategic advantage over an adversary equipped with superior computational capability and situational awareness. In this work, we examine the following scenario: a malicious entity intrudes into the cyber layer of a power network and trips transmission lines. The objective of the system operator is to deploy security measures in the cyber layer to minimize the impact of such attacks. Due to budget constraints, the attacker and the system operator have limits on the maximum number of transmission lines they can attack or defend. We model this adversarial interaction as a resource-constrained attacker-defender game. The computational intractability of solving large security games is well known. However, we exploit the approximately modular behavior of an impact metric known as the disturbance value to arrive at a linear-time algorithm for computing an optimal defense strategy. We validate the efficacy of the proposed strategy against attackers of various capabilities and provide an algorithm for a real-time implementation.
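A hedged sketch of the intuition, assuming the disturbance value is exactly additive across lines (the paper only assumes approximate modularity; line names and values are hypothetical): under modularity, the defender's best use of its budget is to protect the highest-value lines, and the attacker best-responds on the unprotected ones.

```python
# Greedy defense under an (assumed) additive per-line disturbance value.
# Line names and values are hypothetical; budgets are illustrative.
import heapq

disturbance = {
    "line_1": 9.2, "line_2": 7.5, "line_3": 6.1,
    "line_4": 3.4, "line_5": 2.8, "line_6": 1.1,
}
k_defend, k_attack = 2, 2

# Defender secures the k_defend lines with the largest disturbance values
# (heapq.nlargest avoids a full sort; a selection algorithm gives linear time).
defended = set(heapq.nlargest(k_defend, disturbance, key=disturbance.get))

# Attacker best-responds by tripping the most damaging unprotected lines.
remaining = {line: v for line, v in disturbance.items() if line not in defended}
attacked = heapq.nlargest(k_attack, remaining, key=remaining.get)

impact = sum(disturbance[line] for line in attacked)
print("defended:", sorted(defended), "attacked:", attacked, "impact: %.1f" % impact)
```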