Dynamic Information Flow Tracking (DIFT), also called Dynamic Taint Analysis (DTA), is a technique for tracking the information as it flows through a program's execution. Specifically, some inputs or data get tainted and then these taint marks (tags) propagate usually at the instruction-level. While DIFT has been a fundamental concept in computer and network security for the past decade, it still faces open challenges that impede its widespread application in practice; one of them being the indirect flow propagation dilemma: should the tags involved in an indirect flow, e.g., in a control or address dependency, be propagated? Propagating all these tags, as is done for direct flows, leads to overtainting (all taintable objects become tainted), while not propagating them leads to undertainting (information flow becomes incomplete). In this paper, we analytically model that decisioning problem for indirect flows, by considering various tradeoffs including undertainting versus overtainting, importance of heterogeneous code semantics and context. Towards tackling this problem, we design MITOS, a distributed-optimization algorithm, that: decides about the propagation of indirect flows by properly weighting all these tradeoffs, is of low-complexity, is scalable, is able to flexibly adapt to different application scenarios and security needs of large distributed systems. Additionally, MITOS is applicable to most DIFT systems that consider an arbitrary number of tag types, and introduces the key properties of fairness and tag-balancing to the DIFT field. To demonstrate MITOS's applicability in practice, we implement and evaluate MITOS on top of an open-source DIFT, and we shed light on the open problem. We also perform a case-study scenario with a real in-memory only attack and show that MITOS improves simultaneously (i) system's spatio-temporal overhead (up to 40%), and (ii) system's fingerprint on suspected bytes (up to 167\%) compared to traditional DIFT, even though these metrics usually conflict.
more »
« less
Dynamic taint tracking for Java with phosphor (demo)
Dynamic taint tracking is an information flow analysis that can be applied to many areas of testing. Phosphor is the first portable, accurate and performant dynamic taint tracking system for Java. While previous systems for performing general-purpose taint tracking in the JVM required specialized research JVMs, Phosphor works with standard off-the-shelf JVMs (such as Oracle's HotSpot and OpenJDK's IcedTea). Phosphor also differs from previous portable JVM taint tracking systems that were not general purpose (e.g. tracked only tags on Strings and no other type), in that it tracks tags on all variables. We have also made several enhancements to Phosphor, to track taint tags through control flow (in addition to data flow), as well as to track an arbitrary number of relationships between taint tags (rather than be limited to only 32 tags). In this demonstration, we show how developers writing testing tools can benefit from Phosphor, and explain briefly how to interact with it.
more »
« less
- PAR ID:
- 10112170
- Date Published:
- Journal Name:
- ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA)
- Page Range / eLocation ID:
- 409 to 413
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Dynamic taint tracking, a technique that traces relationships between values as a program executes, has been used to support a variety of software engineering tasks. Some taint tracking systems only consider data flows and ignore control flows. As a result, relationships between some values are not reflected by the analysis. Many applications of taint tracking either benefit from or rely on these relationships being traced, but past works have found that tracking control flows resulted in over-tainting, dramatically reducing the precision of the taint tracking system. In this article, we introduce Conflux , alternative semantics for propagating taint tags along control flows. Conflux aims to reduce over-tainting by decreasing the scope of control flows and providing a heuristic for reducing loop-related over-tainting. We created a Java implementation of Conflux and performed a case study exploring the effect of Conflux on a concrete application of taint tracking, automated debugging. In addition to this case study, we evaluated Conflux ’s accuracy using a novel benchmark consisting of popular, real-world programs. We compared Conflux against existing taint propagation policies, including a state-of-the-art approach for reducing control-flow-related over-tainting, finding that Conflux had the highest F1 score on 43 out of the 48 total tests.more » « less
-
Investigating attacks across multiple hosts is challenging. The true dependencies between security-sensitive files, network endpoints, or memory objects from different hosts can be easily concealed by dependency explosion or undefined program behavior (e.g., memory corruption). Dynamic information flow tracking (DIFT) is a potential solution to this problem, but, existing DIFT techniques only track information flow within a single host and lack an efficient mechanism to maintain and synchronize the data flow tags globally across multiple hosts. In this paper, we propose RTAG, an efficient data flow tagging and tracking mechanism that enables practical cross-host attack investigations. RTAG is based on three novel techniques. First, by using a record-and-replay technique, it decouples the dependencies between different data flow tags from the analysis, enabling lazy synchronization between independent and parallel DIFT instances of different hosts. Second, it takes advantage of systemcall-level provenance information to calculate and allocate the optimal tag map in terms of memory consumption. Third, it embeds tag information into network packets to track cross-host data flows with less than 0.05% network bandwidth overhead. Evaluation results show that RTAG is able to recover the true data flows of realistic cross-host attack scenarios. Performance wise, RTAG reduces the memory consumption of DIFT-based analysis by up to 90% and decreases the overall analysis time by 60%-90% compared with previous investigation systems.more » « less
-
Dynamic information-flow tracking (DIFT) is useful for enforcing security policies, but rarely used in practice, as it can slow down a program by an order of magnitude. Static program analyses can be used to prove safe execution states and elide unnecessary DIFT monitors, but the performance improvement from these analyses is limited by their need to maintain soundness. In this paper, we present a novel optimistic hybrid analysis (OHA) to significantly reduce DIFT overhead while still guaranteeing sound results. It consists of a predicated whole-program static taint analysis, which assumes likely invariants gathered from profiles to dramatically improve precision. The optimized DIFT is sound for executions in which those invariants hold true, and recovers to a conservative DIFT for executions in which those invariants are false. We show how to overcome the main problem with using OHA to optimize live executions, which is the possibility of unbounded rollbacks. We eliminate the need for any rollback during recovery by tailoring our predicated static analysis to eliminate only safe elisions of noop monitors. Our tool, Iodine, reduces the overhead of DIFT for enforcing security policies to 9%, which is 4.4x lower than that with traditional hybrid analysis, while still being able to be run on live systems.more » « less
-
Hardware enclaves are designed to execute small pieces of sensitive code or to operate on sensitive data, in isolation from larger, less trusted systems. Partitioning a large, legacy application requires significant effort. Partitioning an application written in a managed language, such as Java, is more challenging because of mutable language characteristics, extensive code reachability in class libraries, and the inevitability of using a heavyweight runtime. Civet is a framework for partitioning Java applications into enclaves. Civet reduces the number of lines of code in the enclave and uses language-level defenses, including deep type checks and dynamic taint-tracking, to harden the enclave interface. Civet also contributes a partitioned Java runtime design, including a garbage collection design optimized for the peculiarities of enclaves. Civet is efficient for data-intensive workloads; partitioning a Hadoop mapper reduces the enclave overhead from 10 to 16–22% without taint-tracking or 70–80% with taint-tracking.more » « less
An official website of the United States government

