Investigating attacks across multiple hosts is challenging. The true dependencies between security-sensitive files, network endpoints, or memory objects from different hosts can be easily concealed by dependency explosion or undefined program behavior (e.g., memory corruption). Dynamic information flow tracking (DIFT) is a potential solution to this problem, but, existing DIFT techniques only track information flow within a single host and lack an efficient mechanism to maintain and synchronize the data flow tags globally across multiple hosts.
In this paper, we propose RTAG, an efficient data flow tagging and tracking mechanism that enables practical cross-host attack investigations. RTAG is based on three novel techniques. First, by using a record-and-replay technique, it decouples the dependencies between different data flow tags from the analysis, enabling lazy synchronization between independent and parallel DIFT instances of different hosts. Second, it takes advantage of systemcall-level provenance information to calculate and allocate the optimal tag map in terms of memory consumption. Third, it embeds tag information into network packets to track cross-host data flows with less than 0.05% network bandwidth overhead. Evaluation results show that RTAG is able to recover the true data flows of realistic cross-host attack scenarios. Performance wise, RTAG reduces the memory consumption of DIFT-based analysis by up to 90% and decreases the overall analysis time by 60%-90% compared with previous investigation systems.
more »
« less
MITOS: Optimal Decisioning for the Indirect Flow Propagation Dilemma in Dynamic Information Flow Tracking Systems
Dynamic Information Flow Tracking (DIFT), also called Dynamic Taint Analysis (DTA), is a technique for tracking the information as it flows through a program's execution. Specifically, some inputs or data get tainted and then these taint marks (tags) propagate usually at the instruction-level. While DIFT has been a fundamental concept in computer and network security for the past decade, it still faces open challenges that impede its widespread application in practice; one of them being the indirect flow propagation dilemma: should the tags involved in an indirect flow, e.g., in a control or address dependency, be propagated?
Propagating all these tags, as is done for direct flows, leads to overtainting (all taintable objects become tainted), while not propagating them leads to undertainting (information flow becomes incomplete). In this paper, we analytically model that decisioning problem for indirect flows, by considering various tradeoffs including undertainting versus overtainting, importance of heterogeneous code semantics and context. Towards tackling this problem, we design MITOS, a distributed-optimization algorithm, that: decides about the propagation of indirect flows by properly weighting all these tradeoffs, is of low-complexity, is scalable, is able to flexibly adapt to different application scenarios and security needs of large distributed systems. Additionally, MITOS is applicable to most DIFT systems that consider an arbitrary number of tag types, and introduces the key properties of fairness and tag-balancing to the DIFT field. To demonstrate MITOS's applicability in practice, we implement and evaluate MITOS on top of an open-source DIFT, and we shed light on the open problem. We also perform a case-study scenario with a real in-memory only attack and show that MITOS improves simultaneously (i) system's spatio-temporal overhead (up to 40%), and (ii) system's fingerprint on suspected bytes (up to 167\%) compared to traditional DIFT, even though these metrics usually conflict.
more »
« less
- Award ID(s):
- 1801599
- PAR ID:
- 10164089
- Date Published:
- Journal Name:
- 40th IEEE International Conference on Distributed Computing Systems
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Dynamic taint tracking, a technique that traces relationships between values as a program executes, has been used to support a variety of software engineering tasks. Some taint tracking systems only consider data flows and ignore control flows. As a result, relationships between some values are not reflected by the analysis. Many applications of taint tracking either benefit from or rely on these relationships being traced, but past works have found that tracking control flows resulted in over-tainting, dramatically reducing the precision of the taint tracking system. In this article, we introduce Conflux , alternative semantics for propagating taint tags along control flows. Conflux aims to reduce over-tainting by decreasing the scope of control flows and providing a heuristic for reducing loop-related over-tainting. We created a Java implementation of Conflux and performed a case study exploring the effect of Conflux on a concrete application of taint tracking, automated debugging. In addition to this case study, we evaluated Conflux ’s accuracy using a novel benchmark consisting of popular, real-world programs. We compared Conflux against existing taint propagation policies, including a state-of-the-art approach for reducing control-flow-related over-tainting, finding that Conflux had the highest F1 score on 43 out of the 48 total tests.more » « less
-
Information flow tracking was proposed more than 40 years ago to address the limitations of access control mechanisms to guarantee the confidentiality and integrity of information flowing within a system, but has not yet been widely applied in practice for security solutions. Here, we survey and systematize literature on dynamic information flow tracking (DIFT) to discover challenges and opportunities to make it practical and effective for security solutions. We focus on common knowledge in the literature and lingering research gaps from two dimensions— (i) the layer of abstraction where DIFT is implemented (software, software/hardware, or hardware) and (ii) the security goal (confidentiality and/or integrity). We observe that two major limitations hinder the practical application of DIFT for on-the-fly security applications: (i) high implementation overhead and (ii) incomplete information flow tracking (low accuracy). We posit, after review of the literature, that addressing these major impedances via hardware parallelism can potentially unleash DIFT’s great potential for systems security, as it can allow security policies to be implemented in a built-in and standardized fashion. Furthermore, we provide recommendations for the next generation of practical and efficient DIFT systems with an eye towards hardware-supported implementations.more » « less
-
Dynamic information-flow tracking (DIFT) is useful for enforcing security policies, but rarely used in practice, as it can slow down a program by an order of magnitude. Static program analyses can be used to prove safe execution states and elide unnecessary DIFT monitors, but the performance improvement from these analyses is limited by their need to maintain soundness. In this paper, we present a novel optimistic hybrid analysis (OHA) to significantly reduce DIFT overhead while still guaranteeing sound results. It consists of a predicated whole-program static taint analysis, which assumes likely invariants gathered from profiles to dramatically improve precision. The optimized DIFT is sound for executions in which those invariants hold true, and recovers to a conservative DIFT for executions in which those invariants are false. We show how to overcome the main problem with using OHA to optimize live executions, which is the possibility of unbounded rollbacks. We eliminate the need for any rollback during recovery by tailoring our predicated static analysis to eliminate only safe elisions of noop monitors. Our tool, Iodine, reduces the overhead of DIFT for enforcing security policies to 9%, which is 4.4x lower than that with traditional hybrid analysis, while still being able to be run on live systems.more » « less
-
Dynamic taint tracking is an information flow analysis that can be applied to many areas of testing. Phosphor is the first portable, accurate and performant dynamic taint tracking system for Java. While previous systems for performing general-purpose taint tracking in the JVM required specialized research JVMs, Phosphor works with standard off-the-shelf JVMs (such as Oracle's HotSpot and OpenJDK's IcedTea). Phosphor also differs from previous portable JVM taint tracking systems that were not general purpose (e.g. tracked only tags on Strings and no other type), in that it tracks tags on all variables. We have also made several enhancements to Phosphor, to track taint tags through control flow (in addition to data flow), as well as to track an arbitrary number of relationships between taint tags (rather than be limited to only 32 tags). In this demonstration, we show how developers writing testing tools can benefit from Phosphor, and explain briefly how to interact with it.more » « less