Binary program dependence analysis determines dependence between instructions and hence is important for many applications that have to deal with executables without any symbol information. A key challenge is to identify if multiple memory read/write instructions access the same memory location. The state-of-the-art solution is the value set analysis (VSA) that uses abstract interpretation to determine the set of addresses that are possibly accessed by memory instructions. However, VSA is conservative and hence leads to a large number of bogus dependences and then substantial false positives in downstream analyses such as malware behavior analysis. Furthermore, existing public VSA implementations have difficulty scaling to complex binaries. In this paper, we propose a new binary dependence analysis called BDA enabled by a randomized abstract interpretation technique. It features a novel whole program path sampling algorithm that is not biased by path length, and a per-path abstract interpretation avoiding precision loss caused by merging paths in traditional analyses. It also provides probabilistic guarantees. Our evaluation on SPECINT2000 programs shows that it can handle complex binaries such as gcc whereas VSA implementations from the-state-of-art platforms have difficulty producing results for many SPEC binaries. In addition, the dependences reported by BDA are 75 and 6 times smaller than Alto, a scalable binary dependence analysis tool, and VSA, respectively, with only 0.19% of true dependences observed during dynamic execution missed (by BDA). Applying BDA to call graph generation and malware analysis shows that BDA substantially supersedes the commercial tool IDA in recovering indirect call targets and outperforms a state-of-the-art malware analysis tool Cuckoo by disclosing 3 times more hidden payloads.
more »
« less
Practical Program Modularization with Type-Based Dependence Analysis
Today’s software programs are bloating and have become extremely complex. As there is typically no internal isolation among modules in a program, a vulnerability can be exploited to corrupt the memory and take control of the whole program. Program modularization is thus a promising security mechanism that splits a complex program into smaller modules, so that memory-access instructions can be constrained from corrupting irrelevant modules. A general approach to realizing program modularization is dependence analysis which determines if an instruction is independent of specific code or data; and if so, it can be modularized. Unfortunately, dependence analysis in complex programs is generally considered infeasible, due to problems in data-flow analysis, such as unknown indirect-call targets, pointer aliasing, and path explosion. As a result, we have not seen practical automated program modularization built on dependence analysis. This paper presents a breakthrough---Type-based dependence analysis for Program Modularization (TyPM). Its goal is to determine which modules in a program can never pass a type of object (including references) to a memory-access instruction; therefore, objects of this type that are created by these modules can never be valid targets of the instruction. The idea is to employ a type-based analysis to first determine which types of data flows can take place between two modules, and then transitively resolve all dependent modules of a memory-access instruction, with respect to the specific type. Such an approach avoids the data-flow analysis and can be practical. We develop two important security applications based on TyPM: refining indirect-call targets and protecting critical data structures. We extensively evaluate TyPM with various system software, including an OS kernel, a hypervisor, UEFI firmware, and a browser. Results show that on average TyPM additionally refines indirect-call targets produced by the state of the art by 31%-91%. TyPM can also remove 99.9% of modules for memory-write instructions to prevent them from corrupting critical data structures in the Linux kernel.
more »
« less
- PAR ID:
- 10426461
- Date Published:
- Journal Name:
- 2023 IEEE Symposium on Security and Privacy (SP)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Indirect function calls are widely used in building system software like OS kernels for their high flexibility and performance. Statically resolving indirect-call targets has been known to be a hard problem, which is a fundamental requirement for various program analysis and protection tasks. The state-of-the-art techniques, which use type analysis, are still imprecise. In this paper, we present a new approach, TFA, that precisely identifies indirect-call targets. The intuition behind TFA is that type-based analysis and data-flow analysis are inherently complementary in resolving indirect-call targets. TFA incorporates a co-analysis system that makes the best use of both type information and data-flow information. The co-analysis keeps refining the global call graph iteratively, allowing us to achieve an optimal indirect call analysis. We have implemented TFA in LLVM and evaluated it against five famous large-scale programs. The experimental results show that TFA eliminates additional 24% to 59% of indirect-call targets compared with the state-of-the-art approaches, without introducing new false negatives. With the precise indirect-call analysis, we further developed a strengthened fine-grained forward-edge control-flow integrity scheme and applied it to the Linux kernel. We have also used the refined indirect-call analysis results in bug detection, where we found 8 deep bugs in the Linux kernel. As a generic technique, the precise indirect-call analysis of TFA can also benefit other applications such as compiler optimization and software debloating.more » « less
-
Balzarotti, Davide; Xu, Wenyuan (Ed.)Kernel privilege-escalation exploits typically leverage memory-corruption vulnerabilities to overwrite particular target locations. These memory corruption targets play a critical role in the exploits, as they determine which privileged resources (e.g., files, memory, and operations) the adversary may access and what privileges (e.g., read, write, and unrestricted) they may gain. While prior research has made important advances in discovering vulnerabilities and achieving privilege escalation, in practice, the exploits rely on the few memory corruption targets that have been discovered manually so far. We propose SCAVY, a framework that automatically discovers memory corruption targets for privilege escalation in the Linux kernel. SCAVY's key insight lies in broadening the search scope beyond the kernel data structures explored in prior work, which focused on function pointers or pointers to structures that include them, to encompass the remaining 90% of Linux kernel structures. Additionally, the search is bug-type agnostic, as it considers any memory corruption capability. To this end, we develop novel and scalable techniques that combine fuzzing and differential analysis to automatically explore and detect privilege escalation by comparing the accessibility of resources between executions with and without corruption. This allows SCAVY to determine that corrupting a certain field puts the system in an exploitable state, independently of the vulnerability exploited. SCAVY found 955 PoC, from which we identify 17 new fields in 12 structures that can enable privilege escalation. We utilize these targets to develop 6 exploits for 5 CVE vulnerabilities. Our findings show that new memory corruption targets can change the security implications of vulnerabilities, urging researchers to proactively discover memory corruption targets.more » « less
-
Indirect calls, while facilitating dynamic execution characteristics in C and C++ programs, impose challenges on precise construction of the control-flow graphs (CFG). This hinders effective program analyses for bug detection (e.g., fuzzing) and program protection (e.g., control-flow integrity). Solutions using data-tracking and type-based analysis are proposed for identifying indirect call targets, but are either time-consuming or imprecise for obtaining the analysis results. Multi-layer type analysis (MLTA), as the state-of-the-art approach, upgrades type-based analysis by leveraging multi-layer type hierarchy, but their solution to dealing with the information flow between multi-layer types introduces false positives. In this paper, we propose strong multi-layer type analysis (SMLTA) and implement the prototype, DEEPTYPE, to further refine indirect call targets. It adopts a robust solution to record and retrieve type information, avoiding information loss and enhancing accuracy. We evaluate DEEPTYPE on Linux kernel, 5 web servers, and 14 user applications. Compared to TypeDive, the prototype of MLTA, DEEPTYPE is able to narrow down the scope of indirect call targets by 43.11% on average across most benchmarks and reduce runtime overhead by 5.45% to 72.95%, which demonstrates the effectiveness, efficiency and applicability of SMLTA.more » « less
-
null (Ed.)The recent Spectre attack first showed how to inject incorrect branch targets into a victim domain by poisoning microarchitectural branch prediction history. In this paper, we generalize injection-based methodologies to the memory hierarchy by directly injecting incorrect, attacker-controlled values into a victim's transient execution. We propose Load Value Injection (LVI) as an innovative technique to reversely exploit Meltdown-type microarchitectural data leakage. LVI abuses that faulting or assisted loads, executed by a legitimate victim program, may transiently use dummy values or poisoned data from various microarchitectural buffers, before eventually being re-issued by the processor. We show how LVI gadgets allow to expose victim secrets and hijack transient control flow. We practically demonstrate LVI in several proof-of-concept attacks against Intel SGX enclaves, and we discuss implications for traditional user process and kernel isolation. State-of-the-art Meltdown and Spectre defenses, including widespread silicon-level and microcode mitigations, are orthogonal to our novel LVI techniques. LVI drastically widens the spectrum of incorrect transient paths. Fully mitigating our attacks requires serializing the processor pipeline with lfence instructions after possibly every memory load. Additionally and even worse, due to implicit loads, certain instructions have to be blacklisted, including the ubiquitous x86 ret instruction. Intel plans compiler and assembler-based full mitigations that will allow at least SGX enclave programs to remain secure on LVI-vulnerable systems. Depending on the application and optimization strategy, we observe extensive overheads of factor 2 to 19 for prototype implementations of the full mitigation.more » « less
An official website of the United States government

