skip to main content


Title: A Novel AI-based Methodology for Identifying Cyber Attacks in Honey Pots
e present a novel AI-based methodology that identifies phases of a host-level cyber attack simply from system call logs. System calls emanating from cyber attacks on hosts such as honey pots are often recorded in audit logs. Our methodology first involves efficiently loading, caching, processing, and querying system events contained in audit logs in support of computer forensics. Output of queries remains at the system call level and is difficult to process. The next step is to infer a sequence of abstracted actions, which we colloquially call a storyline, from the system calls given as observations to a latent-state probabilistic model. These storylines are then accurately identified with class labels using a learned classifier. We qualitatively and quantitatively evaluate methods and models for each step of the methodology using 114 different attack phases collected by logging the attacks of a red team on a server, on some likely benign sequences containing regular user activities, and on traces from a recent DARPA project. The resulting end-to-end system, which we call Cyberian, identifies the attack phases with a high level of accuracy illustrating the benefit that this machine learning-based methodology brings to security forensics.  more » « less
Award ID(s):
1916550
NSF-PAR ID:
10298208
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
ISSN:
2374-3468
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Provenance-based causal analysis of audit logs has proven to be an invaluable method of investigating system intrusions. However, it also suffers from dependency explosion, whereby long-running processes accumulate many dependencies that are hard to unravel. Execution unit partitioning addresses this by segmenting dependencies into units of work, such as isolating the events that processed a single HTTP request. Unfortunately, we discover that current designs have a semantic gap problem due to how system calls and application log messages are used to infer complex internal program states. We demonstrate how attackers can modify existing code exploits to control event partitioning, breaking links in the attack and framing innocent users. We also show how our techniques circumvent existing program and log integrity defenses. We then propose a new design for execution unit partitioning that leverages additional runtime data to yield verified partitions that resist manipulation. Our design overcomes the technical challenges of minimizing additional overhead while accurately connecting low level code instructions to high level audit events, in part with the use of commodity hardware processor tracing. We implement a prototype of our design for Linux, MARSARA, and extensively evaluate it on 14 real-world programs, targeted with expertly crafted exploits. MARSARA's verified partitions successfully capture all the attack provenances while only reintroducing 2.82% of false dependencies, in the worst case, with an average overhead of 8.7%. Using a new metric called Partitioning Attack Surface, we show that MARSARA eliminates 47,642 more repartitioning gadgets per program than integrity defenses like CFI, demonstrating our prototype's effectiveness and the novelty of the attacks it prevents. 
    more » « less
  2. Investigating the nature of system intrusions in large distributed systems remains a notoriously difficult challenge. While monitoring tools (e.g., Firewalls, IDS) provide preliminary alerts through easy-to-use administrative interfaces, attack reconstruction still requires that administrators sift through gigabytes of system audit logs stored locally on hundreds of machines. At present, two fundamental obstacles prevent synergy between system-layer auditing and modern cluster monitoring tools: 1) the sheer volume of audit data generated in a data center is prohibitively costly to transmit to a central node, and 2) system- layer auditing poses a “needle-in-a-haystack” problem, such that hundreds of employee hours may be required to diagnose a single intrusion. This paper presents Winnower, a scalable system for audit-based cluster monitoring that addresses these challenges. Our key insight is that, for tasks that are replicated across nodes in a distributed application, a model can be defined over audit logs to succinctly summarize the behavior of many nodes, thus eliminating the need to transmit redundant audit records to a central monitoring node. Specifically, Winnower parses audit records into provenance graphs that describe the actions of individual nodes, then performs grammatical inference over individual graphs using a novel adaptation of Deterministic Finite Automata (DFA) Learning to produce a behavioral model of many nodes at once. This provenance model can be efficiently transmitted to a central node and used to identify anomalous events in the cluster. We have implemented Winnower for Docker Swarm container clusters and evaluate our system against real-world applications and attacks. We show that Winnower dramatically reduces storage and network overhead associated with aggregating system audit logs, by as much as 98%, without sacrificing the important information needed for attack investigation. Winnower thus represents a significant step forward for security monitoring in distributed systems. 
    more » « less
  3. null (Ed.)
    Defense mechanisms against network-level attacks are commonly based on the use of cryptographic techniques, such as lengthy message authentication codes (MAC) that provide data integrity guarantees. However, such mechanisms require significant resources (both computational and network bandwidth), which prevents their continuous use in resource-constrained cyber-physical systems (CPS). Recently, it was shown how physical properties of controlled systems can be exploited to relax these stringent requirements for systems where sensor measurements and actuator commands are transmitted over a potentially compromised network; specifically, that merely intermittent use of data authentication (i.e., at occasional time points during system execution), can still provide strong Quality-of-Control (QoC) guarantees even in the presence of false-data injection attacks, such as Man-in-the-Middle (MitM) attacks. Consequently, in this work, we focus on integrating security into existing resource-constrained CPS, in order to protect against MitM attacks on a system where a set of control tasks communicates over a real-time network with system sensors and actuators. We introduce a design-time methodology that incorporates requirements for QoC in the presence of attacks into end-to-end timing constraints for real-time control transactions, which include data acquisition and authentication, real-time network messages, and control tasks. This allows us to formulate a mixed integer linear programming-based method for direct synthesis of schedulable tasks and message parameters (i.e., deadlines and offsets) that do not violate timing requirements for the already deployed controllers, while adding a sufficient level of protection against network-based attacks; specifically, the synthesis method also provides suitable intermittent authentication policies that ensure the desired QoC levels under attack. To additionally reduce the security-related bandwidth overhead, we propose the use of cumulative message authentication at time instances when the integrity of messages from subsets of sensors should be ensured. Furthermore, we introduce a method for the opportunistic use of the remaining resources to further improve the overall QoC guarantees while ensuring system (i.e., task and message) schedulability. Finally, we demonstrate applicability and scalability of our methodology on synthetic automotive systems as well as a real-world automotive case-study. 
    more » « less
  4. System auditing is a central concern when investigating and responding to security incidents. Unfortunately, attackers regularly engage in anti-forensic activities after a break-in, covering their tracks from the system logs in order to frustrate the efforts of investigators. While a variety of tamper-evident logging solutions have appeared throughout the industry and the literature, these techniques do not meet the operational and scalability requirements of system-layer audit frameworks. In this work, we introduce Custos, a practical framework for the detection of tampering in system logs. Custos consists of a tamper-evident logging layer and a decentralized auditing protocol. The former enables the verification of log integrity with minimal changes to the underlying logging framework, while the latter enables near real-time detection of log integrity violations within an enterprise-class network. Custos is made practical by the observation that we can decouple the costs of cryptographic log commitments from the act of creating and storing log events, without trading off security, leveraging features of off-the-shelf trusted execution environments. Supporting over one million events per second, we show that Custos' tamper-evident logging protocol is three orders of magnitude (1000×) faster than prior solutions and incurs only between 2% and 7% runtime overhead over insecure logging on intensive workloads. Further, we show that Custos' auditing protocol can detect violations in near real-time even in the presence of a powerful distributed adversary and with minimal (3%) network overhead. Our case study on a real-world APT attack scenario demonstrates that Custos forces anti-forensic attackers into a "lose-lose" situation, where they can either be covert and not tamper with logs (which can be used for forensics), or erase logs but then be detected by Custos. 
    more » « less
  5. Recent advances in causality analysis have enabled investigators to trace multi-stage attacks using whole- system provenance graphs. Based on system-layer audit logs (e.g., syscalls), these approaches omit vital sources of application context (e.g., email addresses, HTTP response codes) that can found in higher layers of the system. Although this information is often essential to understanding attack behaviors, incorporating this evidence into causal analysis engines is difficult due to the semantic gap that exists between system layers. To address this shortcoming, we propose the notion of universal provenance, which encodes all forensically-relevant causal dependencies regardless of their layer of origin. To transparently realize this vision on commodity systems, we present ωLOG (“Omega Log”), a provenance tracking mechanism that bridges the semantic gap between system and application logging contexts. ωLOG analyzes program binaries to identify and model application-layer logging behaviors, enabling application events to be accurately reconciled with system-layer accesses. ωLOG then intercepts applications’ runtime logging activities and grafts those events onto the system-layer provenance graph, allowing investigators to reason more precisely about the nature of attacks. We demonstrate that ωLOG is widely-applicable to existing software projects and can transparently facilitate execution partitioning of dependency graphs without any training or developer intervention. Evaluation on real-world attack scenarios shows that universal provenance graphs are concise and rich with semantic information as compared to the state-of-the-art, with 12% average runtime overhead. 
    more » « less