skip to main content


Title: This is Why We Can’t Cache Nice Things: Lightning-Fast Threat Hunting using Suspicion-Based Hierarchical Storage
Recent advances in the causal analysis can accelerate incident response time, but only after a causal graph of the attack has been constructed. Unfortunately, existing causal graph generation techniques are mainly offline and may take hours or days to respond to investigator queries, creating greater opportunity for attackers to hide their attack footprint, gain persistency, and propagate to other machines. To address that limitation, we present Swift, a threat investigation system that provides high-throughput causality tracking and real-time causal graph generation capabilities. We design an in-memory graph database that enables space-efficient graph storage and online causality tracking with minimal disk operations. We propose a hierarchical storage system that keeps forensically-relevant part of the causal graph in main memory while evicting rest to disk. To identify the causal graph that is likely to be relevant during the investigation, we design an asynchronous cache eviction policy that calculates the most suspicious part of the causal graph and caches only that part in the main memory. We evaluated Swift on a real-world enterprise to demonstrate how our system scales to process typical event loads and how it responds to forensic queries when security alerts occur. Results show that Swift is scalable, modular, and answers forensic queries in real-time even when analyzing audit logs containing tens of millions of events.  more » « less
Award ID(s):
1750024 1657534
NSF-PAR ID:
10232058
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Annual Computer Security Applications Conference
Page Range / eLocation ID:
165 to 178
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent advances in causality analysis have enabled investigators to trace multi-stage attacks using whole- system provenance graphs. Based on system-layer audit logs (e.g., syscalls), these approaches omit vital sources of application context (e.g., email addresses, HTTP response codes) that can found in higher layers of the system. Although this information is often essential to understanding attack behaviors, incorporating this evidence into causal analysis engines is difficult due to the semantic gap that exists between system layers. To address this shortcoming, we propose the notion of universal provenance, which encodes all forensically-relevant causal dependencies regardless of their layer of origin. To transparently realize this vision on commodity systems, we present ωLOG (“Omega Log”), a provenance tracking mechanism that bridges the semantic gap between system and application logging contexts. ωLOG analyzes program binaries to identify and model application-layer logging behaviors, enabling application events to be accurately reconciled with system-layer accesses. ωLOG then intercepts applications’ runtime logging activities and grafts those events onto the system-layer provenance graph, allowing investigators to reason more precisely about the nature of attacks. We demonstrate that ωLOG is widely-applicable to existing software projects and can transparently facilitate execution partitioning of dependency graphs without any training or developer intervention. Evaluation on real-world attack scenarios shows that universal provenance graphs are concise and rich with semantic information as compared to the state-of-the-art, with 12% average runtime overhead. 
    more » « less
  2. Recent advances in causality analysis have enabled investigators to trace multi-stage attacks using provenance graphs. Based on system-layer audit logs (e.g., syscalls), these approaches omit vital sources of application context (e.g., email addresses, HTTP response codes) that can be found in higher layers of the system. Although such information is often essential to understanding attack behaviors, it is difficult to incorporate this evidence into causal analysis engines because of the semantic gap that exists between system layers. To address that shortcoming, we propose the notion of universal provenance, which encodes all forensically relevant causal dependencies regardless of their layer of origin. To transparently realize that vision on commodity systems, we present OmegaLog, a provenance tracker that bridges the semantic gap between system and application logging contexts. OmegaLog analyzes program binaries to identify and model application-layer logging behaviors, enabling accurate reconciliation of application events with system-layer accesses. OmegaLog then intercepts applications’ runtime logging activities and grafts those events onto the system-layer provenance graph, allowing investigators to reason more precisely about the nature of attacks. We demonstrate that our system is widely applicable to existing software projects and can transparently facilitate execution partitioning of provenance graphs without any training or developer intervention. Evaluation on real-world attack scenarios shows that our technique generates concise provenance graphs with rich semantic information relative to the state-of-the-art, with an average runtime overhead of 4% 
    more » « less
  3. System logs are invaluable to forensic audits, but grow so large that in practice fine-grained logs are quickly discarded – if captured at all – preventing the real-world use of the provenance-based investigation techniques that have gained popularity in the literature. Encouragingly, forensically-informed methods for reducing the size of system logs are a subject of frequent study. Unfortunately, many of these techniques are designed for offline reduction in a central server, meaning that the up-front cost of log capture, storage, and transmission must still be paid at the endpoints. Moreover, to date these techniques exist as isolated (and, often, closed-source) implementations; there does not exist a comprehensive framework through which the combined benefits of multiple log reduction techniques can be enjoyed. In this work, we present FAuST, an audit daemon for performing streaming audit log reduction at system endpoints. After registering with a log source (e.g., via Linux Audit’s audisp utility), FAuST incrementally builds an in-memory provenance graph of recent system activity. During graph construction, log reduction techniques that can be applied to local subgraphs are invoked immediately using event callback handlers, while techniques meant for application on the global graph are invoked in periodic epochs. We evaluate FAuST, loaded with eight different log reduction modules from the literature, against the DARPA Transparent Computing datasets. Our experiments demonstrate the efficient performance of FAuST and identify certain subsets of reduction techniques that are synergistic with one another. Thus, FAuST dramatically simplifies the evaluation and deployment of log reduction techniques. 
    more » « less
  4. null (Ed.)
    Auditing is an increasingly essential tool for the defense of computing systems, but the unwieldy nature of log data imposes significant burdens on administrators and analysts. To address this issue, a variety of techniques have been proposed for approximating the contents of raw audit logs, facilitating efficient storage and analysis. However, the security value of these approximated logs is difficult to measure—relative to the original log, it is unclear if these techniques retain the forensic evidence needed to effectively investigate threats. Unfortunately, prior work has only investigated this issue anecdotally, demonstrating sufficient evidence is retained for specific attack scenarios. In this work, we address this gap in the literature through formalizing metrics for quantifying the forensic validity of an approximated audit log under differing threat models. In addition to providing quantifiable security arguments for prior work, we also identify a novel point in the approximation design space—that log events describing typical (benign) system activity can be aggressively approximated, while events that encode anomalous behavior should be preserved with lossless fidelity. We instantiate this notion of Attack-Preserving forensic validity in LogApprox, a new approximation technique that eliminates the redundancy of voluminous file I/O associated with benign process activities. We evaluate LogApprox alongside a corpus of exemplar approximation techniques from prior work and demonstrate that LogApprox achieves comparable log reduction rates while retaining 100% of attack-identifying log events. Additionally, we utilize this evaluation to illuminate the inherent trade-off between performance and utility within existing approximation techniques. This work thus establishes trustworthy foundations for the design of the next generation of efficient auditing frameworks. 
    more » « less
  5. Location information is critical to a wide variety of navigation and tracking applications. GPS, today's de-facto outdoor localization system has been shown to be vulnerable to signal spoofing attacks. Inertial Navigation Systems (INS) are emerging as a popular complementary system, especially in road transportation systems as they enable improved navigation and tracking as well as offer resilience to wireless signals spoofing and jamming attacks. In this paper, we evaluate the security guarantees of INS-aided GPS tracking and navigation for road transportation systems. We consider an adversary required to travel from a source location to a destination and monitored by an INS-aided GPS system. The goal of the adversary is to travel to alternate locations without being detected. We develop and evaluate algorithms that achieve this goal, providing the adversary significant latitude. Our algorithms build a graph model for a given road network and enable us to derive potential destinations an attacker can reach without raising alarms even with the INS-aided GPS tracking and navigation system. The algorithms render the gyroscope and accelerometer sensors useless as they generate road trajectories indistinguishable from plausible paths (both in terms of turn angles and roads curvature). We also design, build and demonstrate that the magnetometer can be actively spoofed using a combination of carefully controlled coils. To experimentally demonstrate and evaluate the feasibility of the attack in real-world, we implement a first real-time integrated GPS/INS spoofer that accounts for traffic fluidity, congestion, lights, and dynamically generates corresponding spoofing signals. Furthermore, we evaluate our attack on ten different cities using driving traces and publicly available city plans. Our evaluations show that it is possible for an attacker to reach destinations that are as far as 30 km away from the actual destination without being detected. We also show that it is possible for the adversary to reach almost 60--80% of possible points within the target region in some cities. Such results are only a lower-bound, as an adversary can adjust our parameters to spend more resources (e.g., time) on the target source/destination than we did for our performance evaluations of thousands of paths. We propose countermeasures that limit an attacker's ability, without the need for any hardware modifications. Our system can be used as the foundation for countering such attacks, both detecting and recommending paths that are difficult to spoof. 
    more » « less