Large enterprises are increasingly relying on threat detection softwares (e.g., Intrusion Detection Systems) to allow them to spot suspicious activities. These softwares generate alerts which must be investigated by cyber analysts to figure out if they are true attacks. Unfortunately, in practice, there are more alerts than cyber analysts can properly investigate. This leads to a “threat alert fatigue” or information overload problem where cyber analysts miss true attack alerts in the noise of false alarms. In this paper, we present NoDoze to combat this challenge using contextual and historical information of generated threat alert in an enterprise. NoDoze first generates a causal dependency graph of an alert event. Then, it assigns an anomaly score to each event in the dependency graph based on the frequency with which related events have happened before in the enterprise. NoDoze then propagates those scores along the edges of the graph using a novel network diffusion algorithm and generates a subgraph with an aggregate anomaly score which is used to triage alerts. Evaluation on our dataset of 364 threat alerts shows that NoDoze decreases the volume of false alarms by 86%, saving more than 90 hours of analysts’ time, which was required to investigate those false alarms. Furthermore, NoDoze generated dependency graphs of true alerts are 2 orders of magnitude smaller than those generated by traditional tools without sacrificing the vital information needed for the investigation. Our system has a low average runtime overhead and can be deployed with any threat detection software. 
                        more » 
                        « less   
                    
                            
                            CloudShield: Real-time Anomaly Detection in the Cloud
                        
                    
    
            In cloud computing, it is desirable if suspicious activities can be detected by automatic anomaly detection systems. Although anomaly detection has been investigated in the past, it remains unsolved in cloud computing. Challenges are: characterizing the normal behavior of a cloud server, distinguishing between benign and malicious anomalies (attacks), and preventing alert fatigue due to false alarms. We propose CloudShield, a practical and generalizable real-time anomaly and attack detection system for cloud computing. Cloudshield uses a general, pretrained deep learning model with different cloud workloads, to predict the normal behavior and provide real-time and continuous detection by examining the model reconstruction error distributions. Once an anomaly is detected, to reduce alert fatigue, CloudShield automatically distinguishes between benign programs, known attacks, and zero-day attacks, by examining the prediction error distributions. We evaluate the proposed CloudShield on representative cloud benchmarks. Our evaluation shows that CloudShield, using model pretraining, can apply to a wide scope of cloud workloads. Especially, we observe that CloudShield can detect the recently proposed speculative execution attacks, e.g., Spectre and Meltdown attacks, in milliseconds. Furthermore, we show that CloudShield accurately differentiates and prioritizes known attacks, and potential zero-day attacks, from benign programs. Thus, it significantly reduces false alarms by up to 99.0%. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 1814190
- PAR ID:
- 10392200
- Date Published:
- Journal Name:
- arXiv Computer Science Cryptography and Security
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            null (Ed.)Controllers of security-critical cyber-physical systems, like the power grid, are a very important class of computer systems. Attacks against the control code of a power-grid system, especially zero-day attacks, can be catastrophic. Earlier detection of the anomalies can prevent further damage. However, detecting zero-day attacks is extremely challenging because they have no known code and have unknown behavior. Furthermore, if data collected from the controller is transferred to a server through networks for analysis and detection of anomalous behavior, this creates a very large attack surface and also delays detection. In order to address this problem, we propose Reconstruction Error Distribution (RED) of Hardware Performance Counters (HPCs), and a data-driven defense system based on it. Specifically, we first train a temporal deep learning model, using only normal HPC readings from legitimate processes that run daily in these power-grid systems, to model the normal behavior of the power-grid controller. Then, we run this model using real-time data from commonly available HPCs. We use the proposed RED to enhance the temporal deep learning detection of anomalous behavior, by estimating distribution deviations from the normal behavior with an effective statistical test. Experimental results on a real power-grid controller show that we can detect anomalous behavior with high accuracy (>99.9%), nearly zero false positives and short (<360ms) latency.more » « less
- 
            null (Ed.)Network intrusion detection systems (IDS) has efficiently identified the profiles of normal network activities, extracted intrusion patterns, and constructed generalized models to evaluate (un)known attacks using a wide range of machine learning approaches. In spite of the effectiveness of machine learning-based IDS, it has been still challenging to reduce high false alarms due to data misclassification. In this paper, by using multiple decision mechanisms, we propose a new classification method to identify misclassified data and then to classify them into three different classes, called a malicious, benign, and ambiguous dataset. In other words, the ambiguous dataset contains a majority of the misclassified dataset and is thus the most informative for improving the model and anomaly detection because of the lack of confidence for the data classification in the model. We evaluate our approach with the recent real-world network traffic data, Kyoto2006+ datasets, and show that the ambiguous dataset contains 77.2% of the previously misclassified data. Re-evaluating the ambiguous dataset effectively reduces the false prediction rate with minimal overhead and improves accuracy by 15%.more » « less
- 
            Anomaly-based attack detection methods that rely on learning the benign profile of operation are commonly used for identifying data falsification attacks and faults in cyber-physical systems. However, most works do not assume the presence of attacks while training the anomaly detectors- and their impact on eventual anomaly detection performance during the test set. Some robust learning methods overcompensate mitigation which leads to increased false positives in the absence of attacks/threats during training. To achieve this balance, this paper proposes a framework to enhance the robustness of previous anomaly detection frameworks in smart living applications, by introducing three profound design changes for threshold learning of time series anomaly detectors:(1) Tukey bi-weight loss function instead of square loss function (2) adding quantile weights to regression errors of Tukey (3) modifying the definition of empirical cost function from MSE to the harmonic mean of quantile weighted Tukey losses. We show that these changes mitigate performance degradation in anomaly detectors caused by untargeted poisoning attacks during training- while is simultaneously able to prevent false alarms in the absence of such training set attacks. We evaluate our work using a proof of concept that uses state-of-the-art anomaly detection in smart living CPS that detects false data injection in smart metering.more » « less
- 
            Smart water metering (SWM) infrastructure collects real-time water usage data that is useful for automated billing, leak detection, and forecasting of peak periods. Cyber/physical attacks can lead to data falsification on water usage data. This paper proposes a learning approach that converts smart water meter data into a Pythagorean mean-based invariant that is highly stable under normal conditions but deviates under attacks. We show how adversaries can launch deductive or camouflage attacks in the SWM infrastructure to gain benefits and impact the water distribution utility. Then, we apply a two-tier approach of stateless and stateful detection, reducing false alarms without significantly sacrificing the attack detection rate. We validate our approach using real-world water usage data of 92 households in Alicante, Spain for varying attack scales and strengths and prove that our method limits the impact of undetected attacks and expected time between consecutive false alarms. Our results show that even for low-strength, low-scale deductive attacks, the model limits the impact of an undetected attack to only 0.2199375 pounds and for high-strength, low-scale camouflage attack, the impact of an undetected attack was limited to 1.434375 pounds.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    