skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: NODLINK: An Online System for Fine-Grained APT Attack Detection and Investigation
Advanced Persistent Threats (APT) attacks have plagued modern enterprises, causing significant financial losses. To counter these attacks, researchers propose techniques that capture the complex and stealthy scenarios of APT attacks by using provenance graphs to model system entities and their dependencies. Particularly, to accelerate attack detection and reduce financial losses, online provenance-based detection systems that detect and investigate APT attacks under the constraints of timeliness and limited resources are in dire need. Unfortunately, existing online systems usually sacrifice detection granularity to reduce computational complexity and produce provenance graphs with more than 100,000 nodes, posing challenges for security admins to interpret the detection results. In this paper, we design and implement NODLINK, the first online detection system that maintains high detection accuracy without sacrificing detection granularity. Our insight is that the APT attack detection process in online provenance-based detection systems can be modeled as a Steiner Tree Problem (STP), which has efficient online approximation algorithms that recover concise attack-related provenance graphs with a theoretically bounded error. To utilize the frameworks of the STP approximation algorithm for APT attack detection, we propose a novel design of in-memory cache, an efficient attack screening method, and a new STP approximation algorithm that is more efficient than the conventional one in APT attack detection while maintaining the same complexity. We evaluate NODLINK in a production environment. The openworld experiment shows that NODLINK outperforms two state-ofthe- art (SOTA) online provenance analysis systems by achieving magnitudes higher detection and investigation accuracy while having the same or higher throughput.  more » « less
Award ID(s):
2438197
PAR ID:
10643355
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Publisher / Repository:
Internet Society
Date Published:
ISBN:
1-891562-93-2
Format(s):
Medium: X
Location:
San Diego, CA, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Advanced Persistent Threats (APTs) are difficult to detect due to their “low-and-slow” attack patterns and frequent use of zero-day exploits. We present UNICORN, an anomaly-based APT detector that effectively leverages data provenance analysis. From modeling to detection, UNICORN tailors its design specifically for the unique characteristics of APTs. Through extensive yet time-efficient graph analysis, UNICORN explores provenance graphs that provide rich contextual and historical information to identify stealthy anomalous activities without pre-defined attack signatures. Using a graph sketching technique, it summarizes long-running system execution with space efficiency to combat slow-acting attacks that take place over a long time span. UNICORN further improves its detection capability using a novel modeling approach to understand long-term behavior as the system evolves. Our evaluation shows that UNICORN outperforms an existing state-of-the-art APT detection system and detects real-life APT scenarios with high accuracy. 
    more » « less
  2. Intrusion detection systems are a commonly deployed defense that examines network traffic, host operations, or both to detect attacks. However, more attacks bypass IDS defenses each year, and with the sophistication of attacks increasing as well, we must examine new perspectives for intrusion detection. Current intrusion detection systems focus on known attacks and/or vulnerabilities, limiting their ability to identify new attacks, and lack the visibility into all system components necessary to confirm attacks accurately, particularly programs. To change the landscape of intrusion detection, we propose that future IDSs track how attacks evolve across system layers by adapting the concept of attack graphs. Attack graphs were proposed to study how multi-stage attacks could be launched by exploiting known vulnerabilities. Instead of constructing attacks reactively, we propose to apply attack graphs proactively to detect sequences of events that fulfill the requirements for vulnerability exploitation. Using this insight, we examine how to generate modular attack graphs automatically that relate adversary accessibility for each component, called its attack surface, to flaws that provide adversaries with permissions that create threats, called attack states, and exploit operations from those threats, called attack actions. We evaluate the proposed approach by applying it to two case studies: (1) attacks on file retrieval, such as TOCTTOU attacks, and (2) attacks propagated among processes, such as attacks on Shellshock vulnerabilities. In these case studies, we demonstrate how to leverage existing tools to compute attack graphs automatically and assess the effectiveness of these tools for building complete attack graphs. While we identify some research areas, we also find several reasons why attack graphs can provide a valuable foundation for improving future intrusion detection systems. 
    more » « less
  3. Recent advances in causality analysis have enabled investigators to trace multi-stage attacks using provenance graphs. Based on system-layer audit logs (e.g., syscalls), these approaches omit vital sources of application context (e.g., email addresses, HTTP response codes) that can be found in higher layers of the system. Although such information is often essential to understanding attack behaviors, it is difficult to incorporate this evidence into causal analysis engines because of the semantic gap that exists between system layers. To address that shortcoming, we propose the notion of universal provenance, which encodes all forensically relevant causal dependencies regardless of their layer of origin. To transparently realize that vision on commodity systems, we present OmegaLog, a provenance tracker that bridges the semantic gap between system and application logging contexts. OmegaLog analyzes program binaries to identify and model application-layer logging behaviors, enabling accurate reconciliation of application events with system-layer accesses. OmegaLog then intercepts applications’ runtime logging activities and grafts those events onto the system-layer provenance graph, allowing investigators to reason more precisely about the nature of attacks. We demonstrate that our system is widely applicable to existing software projects and can transparently facilitate execution partitioning of provenance graphs without any training or developer intervention. Evaluation on real-world attack scenarios shows that our technique generates concise provenance graphs with rich semantic information relative to the state-of-the-art, with an average runtime overhead of 4% 
    more » « less
  4. Abstract Early attack detection is essential to ensure the security of complex networks, especially those in critical infrastructures. This is particularly crucial in networks with multi-stage attacks, where multiple nodes are connected to external sources, through which attacks could enter and quickly spread to other network elements. Bayesian attack graphs (BAGs) are powerful models for security risk assessment and mitigation in complex networks, which provide the probabilistic model of attackers’ behavior and attack progression in the network. Most attack detection techniques developed for BAGs rely on the assumption that network compromises will be detected through routine monitoring, which is unrealistic given the ever-growing complexity of threats. This paper derives the optimal minimum mean square error (MMSE) attack detection and monitoring policy for the most general form of BAGs. By exploiting the structure of BAGs and their partial and imperfect monitoring capacity, the proposed detection policy achieves the MMSE optimality possible only for linear-Gaussian state space models using Kalman filtering. An adaptive resource monitoring policy is also introduced for monitoring nodes if the expected predictive error exceeds a user-defined value. Exact and efficient matrix-form computations of the proposed policies are provided, and their high performance is demonstrated in terms of the accuracy of attack detection and the most efficient use of available resources using synthetic Bayesian attack graphs with different topologies. 
    more » « less
  5. Recent advances in causality analysis have enabled investigators to trace multi-stage attacks using whole- system provenance graphs. Based on system-layer audit logs (e.g., syscalls), these approaches omit vital sources of application context (e.g., email addresses, HTTP response codes) that can found in higher layers of the system. Although this information is often essential to understanding attack behaviors, incorporating this evidence into causal analysis engines is difficult due to the semantic gap that exists between system layers. To address this shortcoming, we propose the notion of universal provenance, which encodes all forensically-relevant causal dependencies regardless of their layer of origin. To transparently realize this vision on commodity systems, we present ωLOG (“Omega Log”), a provenance tracking mechanism that bridges the semantic gap between system and application logging contexts. ωLOG analyzes program binaries to identify and model application-layer logging behaviors, enabling application events to be accurately reconciled with system-layer accesses. ωLOG then intercepts applications’ runtime logging activities and grafts those events onto the system-layer provenance graph, allowing investigators to reason more precisely about the nature of attacks. We demonstrate that ωLOG is widely-applicable to existing software projects and can transparently facilitate execution partitioning of dependency graphs without any training or developer intervention. Evaluation on real-world attack scenarios shows that universal provenance graphs are concise and rich with semantic information as compared to the state-of-the-art, with 12% average runtime overhead. 
    more » « less