Title: A Machine Learning and Optimization Framework for Efficient Alert Management in a Cybersecurity Operations Center
Cybersecurity operations centers (CSOCs) protect organizations by monitoring network traffic and detecting suspicious activities in the form of alerts. The security response team within a CSOC is responsible for investigating and mitigating alerts. However, an imbalance between alert volume and available analysts creates a backlog, putting the network at risk of exploitation. Recent research has focused on improving the alert-management process by triaging alerts, optimizing analyst scheduling, and reducing analyst workload through systematic discarding of alerts. However, these works overlook the delays in alert investigations caused by several factors, including: (i) false or benign alerts contributing to the backlog; (ii) analysts experiencing cognitive burden from repeatedly reviewing unrelated alerts; and (iii) analysts being assigned to alerts that do not match their expertise well. We propose a novel framework that considers these factors and uses machine learning and mathematical optimization methods to dynamically improve throughput during work shifts. The framework achieves efficiency by automating the identification and removal of a portion of benign alerts, forming clusters of similar alerts, and assigning analysts to alerts with matching attributes. Experiments conducted using real-world CSOC data demonstrate a 60.16% reduction in the alert backlog for an 8-h work shift compared to the currently employed approach.
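The last step named in the abstract, assigning analysts to alerts with matching attributes, can be illustrated with a small assignment problem. The following is a minimal sketch and not the paper's formulation: the function name `assign_analysts`, the integer `match_score` matrix, and the brute-force search are all assumptions made for illustration, and the record does not describe the actual optimization model.

```python
from itertools import permutations


def assign_analysts(match_score):
    """Pick one alert cluster per analyst so the total match score is maximal.

    match_score[i][j] is a hypothetical expertise-match score between
    analyst i and alert cluster j (higher is better). The search is brute
    force over all one-to-one assignments, so it only suits tiny instances
    and assumes there are at least as many clusters as analysts.
    """
    n_analysts = len(match_score)
    n_clusters = len(match_score[0])
    best_total, best_assignment = float("-inf"), None
    # Each permutation maps analyst i to cluster perm[i], with no cluster reused.
    for perm in permutations(range(n_clusters), n_analysts):
        total = sum(match_score[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_assignment = total, perm
    return best_total, best_assignment


# Illustrative scores only: 3 analysts (rows) by 3 alert clusters (columns).
scores = [
    [9, 2, 4],
    [3, 8, 5],
    [6, 7, 1],
]
total, assignment = assign_analysts(scores)
print(total, assignment)
```

In this toy instance the best total does not give every analyst their individually best cluster, which is why the step is posed as a joint optimization rather than a per-analyst greedy choice. A realistic instance would need a polynomial-time assignment solver instead of enumeration.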
Award ID(s):
1822094
PAR ID:
10605871
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Association for Computing Machinery (ACM)
Date Published:
Journal Name:
Digital Threats: Research and Practice
Volume:
5
Issue:
2
ISSN:
2692-1626
Format(s):
Medium: X Size: p. 1-23
Size(s):
p. 1-23
Sponsoring Org:
National Science Foundation
More Like this
  1. Large enterprises increasingly rely on threat detection software (e.g., Intrusion Detection Systems) to spot suspicious activities. This software generates alerts that cyber analysts must investigate to determine whether they are true attacks. Unfortunately, in practice, there are more alerts than cyber analysts can properly investigate. This leads to a “threat alert fatigue” or information overload problem where cyber analysts miss true attack alerts in the noise of false alarms. In this paper, we present NoDoze to combat this challenge using contextual and historical information about the threat alerts generated in an enterprise. NoDoze first generates a causal dependency graph of an alert event. Then, it assigns an anomaly score to each event in the dependency graph based on the frequency with which related events have happened before in the enterprise. NoDoze then propagates those scores along the edges of the graph using a novel network diffusion algorithm and generates a subgraph with an aggregate anomaly score, which is used to triage alerts. Evaluation on our dataset of 364 threat alerts shows that NoDoze decreases the volume of false alarms by 86%, saving more than 90 hours of analysts’ time that would otherwise be required to investigate those false alarms. Furthermore, NoDoze-generated dependency graphs of true alerts are 2 orders of magnitude smaller than those generated by traditional tools, without sacrificing the vital information needed for the investigation. Our system has a low average runtime overhead and can be deployed with any threat detection software.
  2. Many cyber attack actions can be observed, but the observables often exhibit intricate feature dependencies, non-homogeneity, and potential for rare yet critical samples. This work tests the ability to model and synthesize cyber intrusion alerts through Generative Adversarial Networks (GANs), which explore the feature space by reconciling randomly generated samples with the given data that reflects a mixture of diverse attack behaviors. Through a comprehensive analysis using Jensen-Shannon Divergence (JSD), conditional and joint entropy, and mode drops and additions, we show that the Wasserstein-GAN with Gradient Penalty and Mutual Information (WGAN-GPMI) is more effective in learning to generate realistic alerts than models without Mutual Information constraints. The added Mutual Information constraint pushes the model to explore the feature space more thoroughly and increases the generation of low-probability yet critical alert features. By mapping alerts to a set of attack stages, it is shown that the output of these low-probability alerts has a direct contextual meaning for cybersecurity analysts. Overall, our results show the promising novel use of GANs to learn from limited yet diverse intrusion alerts to generate synthetic ones that emulate critical dependencies, opening the door to data-driven network threat models.
  3. Pharmacogenomic (PGx) biomarkers integrated using machine learning can be embedded within the electronic health record (EHR) to provide clinicians with individualized predictions of drug treatment outcomes. Currently, however, drug alerts in the EHR are largely generic (not patient‐specific) and contribute to increased clinician stress and burnout. Improving the usability of PGx alerts is an urgent need. Therefore, this work aimed to identify principles for optimal PGx alert design through a health‐system‐wide, mixed‐methods study. Clinicians representing multiple practices and care settings (N = 1062) in urban, rural, and underserved regions were invited to complete an electronic survey comparing the usability of three drug alerts for citalopram, as a case study. Alert 1 contained a generic warning of pharmacogenomic effects on citalopram metabolism. Alerts 2 and 3 provided patient‐specific predictions of citalopram efficacy with varying depth of information. Primary outcomes included the System Usability Scale score (0–100 points) of each alert, the perceived impact of each alert on stress and decision‐making, and clinicians' suggestions for alert improvement. Secondary outcomes included the assessment of alert preference by clinician age, practice type, and geographic setting. Qualitative information was captured to provide context to quantitative information. The final cohort comprised 305 geographically and clinically diverse clinicians. A simplified, individualized alert (Alert 2) was perceived as beneficial for decision‐making and stress compared with a more detailed version (Alert 3) and the generic alert (Alert 1) regardless of age, practice type, or geographic setting. Findings emphasize the need for clinician‐guided design of PGx alerts in the era of digital medicine.
  4. Cyber intrusion alerts are commonly collected by corporations to analyze network traffic and glean information about attacks perpetrated against the network. However, datasets of true malignant alerts are rare and generally show only one potential attack scenario out of many possible ones. Furthermore, it is difficult to expand the analysis of these alerts through artificial means due to the complexity of feature dependencies within an alert and the lack of rare yet critical samples. This work proposes the use of a Mutual Information constrained Generative Adversarial Network as a means to synthesize new alerts from historical data. Histogram Intersection and Conditional Entropy are used to show the performance of this model as well as its ability to learn intricate feature dependencies. The proposed models are able to capture a much wider domain of alert feature values than standard Generative Adversarial Networks. Finally, we show that when looking at alerts from the perspective of attack stages, the proposed models are able to capture critical attacker behavior, providing direct semantic meaning to generated samples.
  5. Detection of malicious behavior is a fundamental problem in security. One of the major challenges in using detection systems in practice is in dealing with an overwhelming number of alerts that are triggered by normal behavior (the so-called false positives), obscuring alerts resulting from actual malicious activity. While numerous methods for reducing the scope of this issue have been proposed, ultimately one must still decide how to prioritize which alerts to investigate, and most existing prioritization methods are heuristic, for example, based on suspiciousness or priority scores. We introduce a novel approach for computing a policy for prioritizing alerts using adversarial reinforcement learning. Our approach assumes that the attackers know the full state of the detection system and dynamically choose an optimal attack as a function of this state, as well as of the alert prioritization policy. The first step of our approach is to capture the interaction between the defender and attacker in a game theoretic model. To tackle the computational complexity of solving this game to obtain a dynamic stochastic alert prioritization policy, we propose an adversarial reinforcement learning framework. In this framework, we use neural reinforcement learning to compute best response policies for both the defender and the adversary to an arbitrary stochastic policy of the other. We then use these in a double-oracle framework to obtain an approximate equilibrium of the game, which in turn yields a robust stochastic policy for the defender. Extensive experiments using case studies in fraud and intrusion detection demonstrate that our approach is effective in creating robust alert prioritization policies. 
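Item 2 in the list above compares generated and real alerts using Jensen-Shannon Divergence (JSD). As a hedged illustration of that metric only, the sketch below computes JSD between two discrete distributions over the same alert-feature values. The function names and the toy histograms are assumptions for illustration; none of this is code or data from the cited work.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    p and q are probability vectors over the same categories. With base-2
    logarithms the result lies in [0, 1]: 0 for identical distributions
    and 1 for distributions with disjoint support.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical normalized histograms of one alert feature (e.g. an alert category).
real_alerts = [0.5, 0.25, 0.25]
synthetic_alerts = [0.25, 0.5, 0.25]
print(js_divergence(real_alerts, synthetic_alerts))
```

A lower value means the synthetic feature distribution is closer to the real one. Because the mixture `m` is nonzero wherever `p` or `q` is nonzero, the metric stays finite even when one distribution assigns zero probability to a category.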