<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>HeapExpo: Pinpointing Promoted Pointers to Prevent Use-After-Free Vulnerabilities</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>12/07/2020</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10250363</idno>
					<idno type="doi">10.1145/3427228.3427645</idno>
					<title level='j'>ACSAC '20: Annual Computer Security Applications Conference</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Zekun Shen</author><author>Brendan Dolan-Gavitt</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Use-after-free (UAF) vulnerabilities, in which dangling pointers remain after memory is released, remain a persistent problem for applications written in C and C++. In order to protect legacy code, prior work has attempted to track pointer propagation and invalidate dangling pointers at deallocation time, but this work has gaps in coverage, as it lacks support for tracking program variables promoted to CPU registers. Moreover, we find that these gaps can significantly hamper detection of UAF bugs: in a preliminary study with OSS-Fuzz, we found that more than half of the UAFs in real-world programs we examined (10/19) could not be detected by prior systems due to register promotion. In this paper, we introduce HeapExpo, a new system that fills this gap in coverage by parsimoniously identifying potential dangling pointer variables that may be lifted into registers by the compiler and marking them as volatile. In our experiments, we find that HeapExpo effectively detects UAFs missed by other systems with an overhead of 35% on the majority of SPEC CPU2006 and 66% when including two benchmarks that have high amounts of pointer propagation.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>the last year <ref type="bibr">[8]</ref>. Use-after-free bugs are found in a wide variety of projects, including web browsers, utility programs and libraries.</p><p>The root cause of use-after-free vulnerabilities is dangling pointers which point to freed memory without being correctly cleared. If a dangling pointer is dereferenced, it may access memory of another object, leading to either information leak vulnerabilities (if the access is a read) or memory corruption (if the access is a write).</p><p>Use-after-free bugs are hard to detect and debug because they may not cause a crash immediately. Manual review for use-afterfree vulnerabilities is time-consuming and does not scale to large programs. Analysis of heap-related code requires reviewers to understand the pointer propagation behavior throughout the entire code base. Although heap-related code is only a small portion of the whole program, there are many possible allocation and deallocation sequences, and use-after-free bugs may only manifest only under a small fraction of those sequences. As a result, it can take significant effort to manually uncover use-after-free vulnerabilities.</p><p>Prior work, such as FreeSentry <ref type="bibr">[28]</ref>, DangNull <ref type="bibr">[16]</ref>, DangSan <ref type="bibr">[26]</ref>, and pSweeper <ref type="bibr">[13]</ref> has sought to eliminate dangling pointers by invalidating them automatically when releasing dynamic memory. This approach incurs less overhead than other tools like Address Sanitizer <ref type="bibr">[25]</ref>, which checks the validity of every memory read or write. The downside the of invalidation approach is its unwanted false-negatives: previous work cannot track local variables and function parameters which are promoted to registers. Because promoting stack memory to registers is a common compiler optimization, this renders many potential dangling pointers untrackable. In our analysis of 19 bugs from the OSS-Fuzz <ref type="bibr">[9,</ref><ref type="bibr">23]</ref> project (described in Section 4.1), we found that local variables and function parameters appear often in use-after-free bugs: 10 of the 19 bugs we examined are caused by variables that were promoted to registers by standard compiler optimization and would have been missed by prior work.</p><p>To close this gap in coverage, in this paper we present HeapExpo, a dangling pointer sanitizer that also tracks pointers in local variables and function arguments. As with previous works, we achieve pointer tracking by using LLVM infrastructure <ref type="bibr">[15]</ref> to instrument pointer propagation instructions and provide a runtime library to track and manage metadata about allocations. Our analysis identifies pointer variables that may be optimized into registers by the compiler and marks them as volatile, forcing the compiler to keep them on the stack where they can be tracked by our runtime. However, because a na&#239;ve approach of marking every pointer variable adds prohibitive overhead, we additionally provide a static analysis that safely identifies pointer variables that can never be involved in use-after-free bugs, and allows these to be optimized freely.</p><p>Despite providing more comprehensive coverage, our optimized implementation has an overhead of 66% on the SPEC CPU2006 benchmark <ref type="bibr">[14]</ref>, only a modest increase compared to the 46% overhead from the state-of-the-art dangling pointer sanitizer, Dan-gSan <ref type="bibr">[26]</ref>.</p><p>We make the following contributions:</p><p>&#8226; We identified some major sources of dangling pointers that are not tracked by previous works due to register promotion by studying and categorizing UAF bugs discovered by the OSS-Fuzz project. &#8226; We present a novel approach to identify and track pointers promoted to registers by marking them as volatile. &#8226; We design an analysis that uses liveness and the program's function call graph to reduce the number of tracked stack variables, greatly reducing the overhead for comprehensive tracking.</p><p>&#8226; We provide a performance analysis that demonstrates our system incurs about 20% run-time overhead in addition to DangSan, with little extra memory overhead.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">BACKGROUND</head><p>In this section, we walk through how previous work solves the dangling pointer problem, their limitations and the LLVM tool chain we studied and used. We also discuss the implementation of a state of the art dangling pointer sanitizer, DangSan, which helps the understanding of our design as a whole. Additionally, we discuss two threading models that influence our optimization.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">LLVM Compiler</head><p>LLVM is a modular compiler tool chain. The front-end client, clang, translates the source code into an Intermediate Representation (LLVM IR) with a limited number of instructions. Optimizing passes process LLVM IR code and produce optimized LLVM IR. Common optimization techniques like dead code removal, function inlining, and alias analysis are applied in this phase. The LLVM IR is then handed to a target-dependent backend compiler to generate machine code for the host. Finally, the linker is invoked to combine all machine code files into a single executable. LLVM Optimization. The LLVM front end first generates unoptimized code as in Listing 3 from the C code in Listing 2. At this point, all local variables and function arguments live on the stack. a lives in the stack location indicated by the alloca instruction at line 2. Then, the LLVM optimizer processes the unoptimized code with the mem2reg pass <ref type="bibr">[10]</ref>, which promotes memory references to registers. This is one of the very first optimization passes to obtain Single Source Assignment form. The end result is shown in Listing 4. LLVM provides options to instrument before or after the mem2reg pass by choosing an appropriate optimization stage.</p><p>The algorithm to determine whether a stack variable can be promoted is shown in Listing 1. The algorithm shows that a regular non-volatile pointer, either as a local variable or a function argument, is indeed promotable. Later, our analysis of OSS-Fuzz bugs (Section 4.1) follows the cases in this algorithm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Temporal safety system design</head><p>Currently, there are four main approaches to ensure temporal safety: secure allocators, address-based checking, garbage collection, and dangling pointer invalidation. We compare the four approaches in  <ref type="table">1</ref>. Our system focuses on the pointer invalidation approach because it is fast, has low memory overhead, and is hard to bypass. HeapExpo closes a major gap in coverage of dangling pointer detectors with only a modest increase in overhead.</p><p>Secure allocator <ref type="bibr">[3,</ref><ref type="bibr">6,</ref><ref type="bibr">22]</ref>. This approach provides a custom allocator which restricts reuse of same memory region. This prevents a dangling pointer from pointing to other allocated objects, making it unexploitable. However, this approach can be bypassed if the memory reuse pattern can be learned by an attacker as discussed by Lee et al. <ref type="bibr">[16]</ref>.</p><p>Address-based checking <ref type="bibr">[4,</ref><ref type="bibr">19,</ref><ref type="bibr">20,</ref><ref type="bibr">25]</ref>. This type of temporal safety approach tries to invalidate the memory addresses of a heap object when it is released. Another often-used technique is to raise an alert when dangling pointers are used. This approach usually discourages memory reuse so that a dangling pointer does not point to a new object right away.</p><p>Garbage collection <ref type="bibr">[1,</ref><ref type="bibr">2,</ref><ref type="bibr">27]</ref>. This is a passive reference counting technique that scans memory for potential pointers. Dynamic memory is released only when there are no pointer references. Therefore, this method can only mitigate use-after-free, not detect it. The runtime overhead is tied to the number of memory scans performed, so it often trades memory for speed.</p><p>Dangling pointer invalidation <ref type="bibr">[13,</ref><ref type="bibr">16,</ref><ref type="bibr">26,</ref><ref type="bibr">28]</ref>. This approach tracks the propagation of pointers inside memory. The propagation is tracked by taint analysis or monitoring certain instructions. When a heap object is released, the pointers that reference the object also get invalidated. The invalidation can be performed by setting the dangling pointer to kernel space or null. Dereference of a kernel space dangling pointer results in a crash immediately. Setting the pointer to null can let the program execute normally if there is null pointer check.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Limitations of Prior Work</head><p>Prior work <ref type="bibr">[16,</ref><ref type="bibr">26,</ref><ref type="bibr">28]</ref> that keeps an active representation of memory is based on LLVM, with instrumentation done by LLVM passes. DangNull <ref type="bibr">[16]</ref>  and does not track pointers on the stack or in registers. FreeSentry <ref type="bibr">[28]</ref> and DangSan <ref type="bibr">[26]</ref> optionally support tracking of pointers on the stack, but do not track those in registers. The authors of all three systems mention this limitation, but they do not provide an estimation of how many false negative it causes. We have summarized coverage among previous works in Table <ref type="table">2</ref>.</p><p>We have covered LLVM's mem2reg <ref type="bibr">[10]</ref> optimization pass in the previous subsection. LLVM uses this pass to promote local variables from stack memory to registers. Since these works <ref type="bibr">[16,</ref><ref type="bibr">26,</ref><ref type="bibr">28]</ref> instrument optimized code, they do not track promoted local variables and function arguments, which leaves a large portion of pointer code uninstrumented.</p><p>The C code in Listing 2 has a use-after-free vulnerability with its local variable a. Compiling the code with -O0 or -O2 can result in the following IR code in Listings 3 and 4. Past works instrument the IR after mem2reg pass; thus, they process the IR code in Listing 4, where a has been promoted from stack location to LLVM register. FreeSentry and DangSan cannot track a in this context, because it does not have stack location. To ensure temporal safety with low time overhead, they sacrifice completeness. In our work, we target tracking of local variables and function arguments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">DangSan Implementation</head><p>DangSan is a dangling pointer sanitizer implemented using the LLVM compiler tool chain. It consists of compile-time LLVM passes and run-time code linked to the final binaries. The LLVM pass instruments pointer write instructions with calls to tracking code. The run-time code overloads all allocator functions including malloc, </p><p>call @free(i8* %1) 4 %2 = call @puts(i8* %1) 5 } Listing 4: LLVM IR with -O2, local variable lives in %1 as LLVM register calloc, free, realloc, memalign, aligned_alloc, valloc, pvalloc and posix_memalign to keep track of heap memory layout. The run-time code also provides functions to track pointer propagation so that instrumented calls can update the data structure.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.1">Heap Data</head><p>Structure. The heap is represented as a directed graph. Allocated memory blocks are nodes and pointers are directed edges that link memory blocks. Pointers to heap memory can exist in heap memory, the data section of the executable, stack memory, and registers. In DangSan's design, each dynamic memory block has associated metadata which tracks the sets of all inward pointers. DangSan chose not to record outward edges in the metadata for performance reasons.</p><p>The metadata of the heap objects is kept in a three-level shadow memory map. Querying the metadata of dynamic objects takes constant time with three memory reads.</p><p>The metadata consists of sets of pointers, and is implemented in the manner of a file system. When the direct log is filled up, an alternative hash map is used instead. The hash map is used to eliminate duplicates to prevent the log from growing without constraints. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Listing 5: Alloc and Dealloc function hook</head><p>DangSan modifies gperftools' tcmalloc and adds hook functions to memory allocation and deallocation. All allocator functions are mapped to basic malloc() and free() functions. Because memcpy() is not tracked by DangSan, a realloc() is treated as a malloc() followed by a free(). As shown in Listing 5, allocation and deallocation hooks have the job of managing the three-level shadow memory. The allocation hook marks the region of the memory object with the metadata, while the deallocation hook clears the metadata. The deallocation hook has the additional task of managing dangling pointers. For all inward pointers in the metadata record, a value check is performed. If the pointer still refers to the freed object, the pointer is invalidated by setting its most significant bit to 1.</p><p>The pointer records are created with regptr calls that are instrumented by the LLVM pass. The regptr function checks if the pointer points to a recorded object and, if so, records the pointer in the metadata of the object. The pseudo-code is shown in Listing 6.</p><p>DangSan also includes another LLVM pass that checks the data section of the program and inserts a global constructor function that registers global variables in data sections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.2">Tracking pointers in memory.</head><p>The compile-time instrumentation is designed as an LLVM pass that processes LLVM IR. Propagating pointers to data and heap sections guarantees a memory write instruction. LLVM IR uses the store instruction for memory writes, and it also indicates the type of stored values. Therefore, one only needs to instrument store instructions with pointer type: store PtrTy* ptr_loc, PtrTy ptr_val</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5">Threading Model</head><p>In this section, we discuss two essential threading models that affect our implementation: the global and thread-local assumptions. In the most general case, the global model assumes that a registered pointer may be invalidated by a free() in any thread. By contrast, thread-local model assumes that a registered pointer may only be invalidated by a free() in the same thread. Listing 6: Functions that provide pointer tracking Our pointer invalidation system follows the global model by maintaining a synchronous data structure. We ensure the correctness of the heap representation under multiple threads. A free() invalidates pointers stored on the stack for every thread.</p><p>However, to enable further optimizations for local variable tracking, we rely on the thread-local model. This model allows further static analysis and covers most common situations. We note that this assumption is shared by prior work; DangSan and FreeSentry both rely on it for their loop optimization pass.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">DESIGN AND IMPLEMENTATION</head><p>As shown in Figure <ref type="figure">1</ref>, we designed HeapExpo on top of DangSan, keeping its instrumentation and runtime libraries. DangSan's transformation passes run at link time, when stack variables have already been promoted to registers. We therefore add another step before the mem2reg pass to preserve relevant stack variables. When Dan-gSan's optimizations take place, these variables are still tracked as on stack. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Tracking Local Variable and Function Argument</head><p>As previously discussed, dangling pointers can be stored in the data section, on the heap, on the stack, and in registers. We have discussed how prior work achieved tracking of pointers in memory.</p><p>In this section, we discuss how we track pointers promoted to registers. Burden of register tracking. Computers have limited number of registers. Therefore, the LLVM back-end maps the unbounded set of LLVM registers to machine registers according to their liveness. Because LLVM passes operate at the IR level, our pass is not aware of machine registers and thus cannot easily track them. To ensure complete coverage, invalidating pointers in registers is required. To solve this problem, we propose a method that tracks local variables and function arguments without checking machine registers; instead, we identify pointer variables that need to be tracked and prevent them from being promoted to registers in the first place.</p><p>We run our LLVM pass before the mem2reg pass; at this point, local variables and function arguments are all stored on the stack. To track a local variable or function argument, we mark it as volatile, ensuring that further optimization will keep the variable on the stack. We do this by simply marking the relevant store and load instructions as volatile in our LLVM pass. We do not modify other variables and arguments and let the mem2reg pass promote them to registers.</p><p>When a memory block is released, we check all associated pointers, including volatilized pointers. If they still point to the region, we invalidate the pointers by making them point to invalid memory space. Dereference of the pointers thus causes a segmentation fault.</p><p>One downside of this strategy is that it results extra memory reads and writes, adding to the overhead to our system. To make HeapExpo practical, we must therefore develop optimization techniques that only track pointers that can potentially cause use-afterfree. Overall, the optimizations can reduce number of instrumented instructions by half, leading to a significant speedup. We describe these optimizations next.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Liveness Analysis</head><p>Tracking local variables by keeping them on the stack has a performance impact because the compiler cannot optimize them into registers. To reduce number of tracked variables, we apply liveness analysis to local pointers and pointer arguments to functions based on control flow within each function.</p><p>We say that a local pointer becomes a live dangling pointer when (1) a function call invalidates it, and (2) it may be dereferenced later in the function execution. There is no need to instrument local pointers that are not live. In this way, we reduce the number of instrumentation added and the number of stack variables tracked.</p><p>First, we find all definitions and uses of local pointer variables and function parameters. Our LLVM pass takes place before the mem2reg pass, so every store instruction is a definition, and every load instruction is a use. At each function call, we make every live stack variables volatile. Specifically, we mark as volatile the last store before the call instruction and the first load after it.</p><p>In Listing 7, suppose we perform a check for main at line 14. We want to examine the liveness of local pointers s1, s2, and s3 at call to f(). s1 is defined at line 11 and used at line 16. It is possible that the value of s1 is invalidated inside f(). Hence we make s1 volatile by making the store and load volatile. Also, we register the stack location of s1 for tracking. s2 is defined at line 13 but never used after f(). Thus, we do not need to track s2, as it cannot cause a use-after-free. Although s3 is used at line 18, it is always guarded by the definition at line 15. s3 may be invalidated in f(), but the definition will always overwrite the old address before use. We thus do not need to track s3 either.</p><p>In the example above, we should repeat the above check with every function call in main, but we will show next that the check at g() is not necessary.</p><p>With the control flow graph of the function, we can easily assess the liveness of a variable. If there is a path from the function entry point to some definition of a local pointer, and then to the call instruction, it means that the pointer is potentially defined at the call instruction. Similarly, if there is a path from the call instruction to a usage, without encountering any definitions, it means the local pointer is potentially dereferenceable after the call instruction. If the local pointer is both potentially defined before the function call, and potentially dereferenceable after the function call, we say the pointer variable is live. We repeat the process with every call instruction in the function to obtain a set of live local pointers to track. Any local variables and function parameters that are not live can be safely promoted to LLVM registers and optimized.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Call Graph Analysis</head><p>In the previous subsection, we examined the liveness of every local pointer variable at each function call, assuming that the variable may be freed inside the called function. Here, we introduce a call graph analysis that can reduce the number of liveness checks.</p><p>Under the thread-local assumption, a stack pointer can only be invalidated inside its thread, which means the function the pointer is in must call free() to release memory. A call to a function that does not release any memory cannot invalidate any stack variables. Therefore, we can avoid adding checks at those calls.</p><p>We conservatively gather this information about functions with the following algorithm. We conservatively regard external functions and indirect jumps as possibly calling free. In order to make library function calls precise, we can provide C/C++ library calls that never call free in our model. This way, we can eliminate some call instructions to instrument. In our running example, we do not need to add a check at the call to g() in Listing 7 because g() does not release any memory.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Volatile Dropping Optimization</head><p>Combined liveness and call graph analysis can help us identify stack variables that may introduce UAF. We then mark these stack variables as volatile for later instrumentation. The volatilization must occur before other passes like mem2reg optimize the code during compilation. However, because the each source file is compiled independently, library functions from other files are not available until link time. Therefore, we perform a second pass of call graph analysis at link time and drop unnecessary volatile instructions. This pass is implemented as a link time optimization pass. We explicitly invoke mem2reg after volatile instruction dropping to ensure newly optimized stack locations are lifted to registers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">EVALUATION</head><p>In this section, we evaluate the effectiveness of HeapExpo by analyzing OSS-Fuzz bugs <ref type="bibr">[9,</ref><ref type="bibr">23]</ref>, and its performance by benchmarking its time and space overheads compared to previous work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Survey of UAF Bugs Discovered by OSS-Fuzz</head><p>To estimate how many bugs may be missed by prior work that cannot track pointers through registers, we study use-after-free bugs found in the OSS-Fuzz database <ref type="bibr">[23]</ref>, and manually categorize the sources of dangling pointers. Our process is similar to debugging where we identify the source of each dangling pointer as a heap pointer, global pointer, local variable pointer, etc.</p><p>Reproducing older bugs from OSS-Fuzz is not completely straightforward, as individual projects in OSS-Fuzz are continually updated to their latest versions, and it is nontrivial to identify the correct revision for both the project and its dependencies that will allow the bug to manifest. Once the correct version is found, we can then build the fuzzer component and use the provided test case to reproduce the crash. To identify the correct versions and analyze the bug, we:</p><p>&#8226; Find a commit of project source that references the bug reported by OSS-Fuzz &#8226; Find corresponding commits of upstream dependencies &#8226; Find the commit of OSS-Fuzz at the time &#8226; Build the executable and libraries with the correct versions of their source code &#8226; Use the OSS-Fuzz test case to reproduce the crash &#8226; Rebuild with debugging flags "-O0 -g" &#8226; Perform dynamic analysis with a debugger We selected projects with reports of use-after-free bugs from OSS-Fuzz database. We chose projects with an eye toward ease of bug reproduction, based on the number of dependencies and the complexity involved in compiling the fuzzer. Large projects usually have complicated build scripts, making it difficult to turn off optimizations. Without disabling optimization, the source of dangling pointers can be ambiguous. In the end, we were able to reproduce 19 bugs from 11 projects as shown in Table <ref type="table">4</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Source of Dangling Pointer Occurrence</head><p>Global We note that our selection process may bias our corpus toward smaller, simpler projects, and may not be representative of all UAF bugs in the wild. However, we argue that even this relatively small convenience sample indicates the importance of tracking local variables and function arguments.</p><p>We categorize the reproduced bugs by the source of dereferenced dangling pointers, manually reviewing reproduced crashes. Referencing the source code with the crash logs, we are able to identify the source of each dangling pointer. The result of our manual review is shown in Table <ref type="table">3</ref>. Reference of pointer is the case where the function argument is a pointer (reference) to a pointer. If the compiler inlines the function and promotes the referenced pointer, prior work cannot track the pointer. Transformed pointer means that before the dangling pointer is used, it is stored as non-pointer data. In the case listed in the Table, the pointer is divided by two before being stored to memory to save one bit of space.</p><p>Even if a dereferenced dangling pointer is stored in local variables or function arguments, it is not necessarily the source of the dangling pointer. The underlying reason is that an invalidated dangling pointer can be propagated to these variables. To avoid such mis-categorization, we check whether recent pointer store in function control flow propagates a valid pointer or an invalidated dangling pointer. If the recent pointer store propagates an invalidated pointer, as in Listing 8, the source of the UAF in func1 is identified as a function argument, while that of func2 is identified as global variable, because the dangling global variable is propagated to func2's argument. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>OSS-Fuzz</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Effectiveness of UAF Detection</head><p>In this section, we test the effectiveness of HeapExpo against useafter-free bugs in a real-word program and our own crafted examples. We tested HeapExpo and DangSan with a QuickJS exploit <ref type="bibr">[5,</ref><ref type="bibr">24]</ref>. The fact that the dangling pointer comes from a local variable makes DangSan incapable of detecting the exploit. On the other hand, HeapExpo is able to track and invalidate the dangling pointer in the local variable and correctly triggers a crash when the invalidated pointer is dereferenced.</p><p>Besides manually crafted tests, we also created sample test code by extracting snippets of buggy code from the use-after-free bugs discovered by OSS-Fuzz, which makes it easier to build with Dan-gSan and HeapExpo. This sample code also allow us to easily compare between the two systems. The details can be found in our GitHub repository. The results from the sample code show that HeapExpo correctly tracks dangling pointers in local variables and function arguments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Source</head><p>As shown in Table <ref type="table">3</ref>, we observed that the major sources of dangling pointers are heap variables, function arguments and local variables. According to the result in Table <ref type="table">5</ref>, HeapExpo is able to track two of the major sources where DangSan fails. Neither DangSan nor HeapExpo can track the transformed pointer case in libaom <ref type="bibr">[18]</ref>. We closely studied the pointer reference case, and we believe that it is unlikely to be protected by DangSan, because the function is small and the referenced pointer is a function argument. By contrast, HeapExpo succeeds in this case by preserving the stack location of the function argument.</p><p>The exploitability of use-after-free vulnerability usually depends on how attackers could reliably arrange the heap. Limiting the attack within a function's lifespan makes it hard to align dynamic memory, but we have seen that the QuickJS exploit <ref type="bibr">[24]</ref> leverages a dangling local variable. Furthermore, we cross checked OSS-Fuzz and CVE database to search for use-after-free bugs caused by dangling local variables. Although only a few OSS-Fuzz bugs were also present in the CVE database, we found two such examples, CVE-2019-17534 (OSS-Fuzz id: 16796) and one of the CVE-2018-1000039 bugs (OSS-Fuzz id: 5492), with high risk scores of 8.8 and 7.8 respectively. This demonstrates that the bugs missed by prior systems can indeed be exploitable.</p><p>During the course of our evaluation we also found a minor bug in DangSan's pointer packing methods that led to it missing some stack variables. The sample code registers two stack pointer locations, but DangSan fails to register and invalidate the second registered location. By reviewing DangSan's packing code, we found that the packed value is never written back. After we fixed the bug, we are able to correctly track stack variables with DangSan.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Performance and Memory Overhead</head><p>We tested the performance of HeapExpo on CPU-intensive benchmarks, comparing the run time of HeapExpo and DangSan with the common unprotected baseline. We mainly used the SPEC CPU2006 benchmark <ref type="bibr">[14]</ref>, which was also used to evaluate DangSan. We found several benchmarks that executed correctly under DangSan but not HeapExpo (gcc, perlbench and libquantum). We believe that these failures are caused by bugs in the underlying benchmark programs, as DangSan's authors also reported having to patch some programs in the CPU2006 benchmark suite in order to fix UAFs. We compare the time overhead between HeapExpo and DangSan on the remaining benchmarks.</p><p>Our experiments were run on a server with two Intel Xeon X5690 CPUs @3.47GHz. We ran the benchmarks 3 times and report the median. The result of individual time overhead is shown in Figure <ref type="figure">2</ref>. Although we could not directly benchmark FreeSentry and DangNull because of unpublished optimizations and closed sourced code, we estimated and included the results reported in their papers. Aside from a few missing benchmarks, HeapExpo still performed as well as those conventional approach, let alone the extra coverage of local pointers. We computed the geometric means of the time overheads from DangSan and HeapExpo. Our geometric mean over common benchmarks is 1.66, with 66% overhead from baseline, while DangSan has an overhead of 46%. Our overhead on top of DangSan is therefore 20%. Without propagation intensive programs omnetpp and xalancbmk, HeapExpo has geometric mean of 35% run-time overhead from non-instrumented baseline among 12 benchmarks.</p><p>In the experiment, we found runtime overhead is moderately correlated to number of pointer propagations that occurred with a correlation score of r = 0.61. The extra pointer propagation from newly tracked sources adds extra overhead. Therefore, the runtime overhead of pointer propagation intense benchmarks increases moderately when there are many new propagations. For example, in xalancbmk, there are 7.2 trillion pointer propagations tracked by HeapExpo versus 2.4 trillion tracked by DangSan. Although this causes the runtime overhead to increase from 2.2x to 3.5x, about three times as many pointer propagations are tracked by HeapExpo. We also measured the memory overhead of HeapExpo and Dan-gSan under SPEC CPU2006 benchmarks. As shown in Figure <ref type="figure">3</ref>, the memory overhead of HeapExpo is not significantly different from DangSan. We did not change the underlying data structures of DangSan, but we did increase the number of source type to track, so a small increase in memory overhead is expected. Due to the drastically increased pointer propagation behavior in xalancbmk benchmark, the memory overhead is increased as well. Among the benchmarks shared by all approaches, HeapExpo has a geometric mean of 100%, while DangSan has 87% and DangNull has 137% overhead. Although it requires slightly more memory than DangSan, HeapExpo consumes less memory than DangNull, which offers less protection. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Impact of Optimization</head><p>Here, we show how our liveness analysis and call graph analysis affect the performance of HeapExpo. In Table <ref type="table">6</ref>, we compare the number of instrumented instructions among different builds. "HeapExpo (unopt)" has no static analysis optimization, and marks every stack pointer as volatile. The "HeapExpo" build has static analysis turned on and only marks stack pointers that may result in UAF bugs as volatile. Finally, the "DangSan" build does not mark any stack pointers volatile. Table <ref type="table">6</ref> indicates that HeapExpo eliminates about half the pointers that need instrumentation in the SPEC CPU2006 benchmarks compared to our base build, but still has about 1-2 times more instrumentation than DangSan as a result of our improved coverage. Figure <ref type="figure">4</ref> demonstrates that our optimizations greatly lower the amount of instrumentation and thus reduce runtime overhead.</p><p>According to the number of instructions, we can also see that stack pointers like local variables and function arguments make up a large portion of pointer propagations overall. Comparing the number of instrumented instructions between unoptimized HeapExpo build and the DangSan build, we argue that there are about 4 times more pointer propagation instructions present in the program than those tracked by DangSan. These local variables and function arguments are not trackable with any prior dangling pointer invalidation system. Promoted to registers, these pointers are pushed to the stack during function calls and popped when the function returns. Since no prior work tracks pushed stack locations, exploits that make use of these dangling pointers could evade detection.</p><p>The straightforward solution is to track all pointer variables. However, that would incur 413% more instrumentation, and make the overall overhead unacceptable. After applying our liveness and call graph analyses, we reduce the instrumentation overhead to 156%. This overhead is acceptable, but we can still do better by removing further unnecessary instrumentation using the call graph gathered during link time optimization. The number of instrumented instructions is consequently lowered to 86% more than DangSan, and we still guarantee the safety of all pointer variables under the thread-local assumption.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">LIMITATIONS</head><p>The design of our work follows earlier research including FreeSentry, DangNull and DangSan. For this reason, our work shares their common limitations. We discuss these limitations in this section.</p><p>Our system and other previous works are transformation passes based on LLVM tool chain, which means that source code is required. Recalling other approaches in literature, source code is often required to perform meaningful sanitizing with reasonable overhead. After compiler and linker optimizations, binary code is very different from its source, and meaningful information such as variable types is lost during the process. Although it is possible to retrieve a portion of the information through static analysis of binary code, the information would not be helpful enough for protection because of the lack of completeness. We agree with prior work that the limitation of requiring source code is an appropriate trade-off.</p><p>Like most sanitizers, HeapExpo can only detect temporal safety violations with instrumented code. That means the library source code is also required to detect use-after-free bugs in libraries. The linked libraries need to be compiled by the sanitizer as well. In practice, this is implemented in OSS-Fuzz where all library code as well as project code is compiled with the sanitizer flags. Finally like all current dangling pointer invalidation approaches, our work does not support custom pointer manipulation. Comparison between a regular pointer with an invalidated pointer is broken, but that between invalidated pointers is supported. Custom storage for pointers such as compacting pointers to smaller sizes is not trackable because type information is lost during the transformation. However, we believe this kind of direct pointer manipulation is relatively rare: we only see one such case (in libaom <ref type="bibr">[18]</ref>) among our reproduced bugs from OSS-Fuzz.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">RELATED WORK</head><p>In this section, we discuss how our work relates to prior work in the literature. HeapExpo is an improvement on the dangling pointer invalidation approach for detecting use-after-free vulnerabilities. Although that is not the only approach, its combination of low overhead and strong protection compared with other methods makes it attractive.</p><p>Dangling Pointer Invalidation. The previous works using this approach are DangNull <ref type="bibr">[16]</ref>, FreeSentry <ref type="bibr">[28]</ref>, DangSan <ref type="bibr">[26]</ref> and pSweeper <ref type="bibr">[13]</ref>. Based on the LLVM toolchain, they all instrument pointer propagation instructions at the IR level. The earliest works, DangNull and FreeSentry, are conceptually similar but implemented with different data structures. They both instrument store instructions in the LLVM IR as an LLVM optimization pass. DangNull keeps a more complete data structure which records both inward and outward pointers, but it only tracks pointers in global data and on the heap. FreeSentry only registers inward pointers of an object, but it extends dangling pointer detection to pointers stored on the stack as well. Finally, DangSan extends FreeSentry by improving the data structures used and supporting multi-threading; DangSan uses a log-like data structure inspired by log-structured filesystems. They also support multithreading by managing thread-local data structures. In terms of overhead, DangSan is currently the state of the art.</p><p>These three works all manage pointer propagation inline, while pSweeper dedicates an extra thread to to manage the pointer metadata structures. It bookmarks the location of pointers and keeps a list of pointers as a simple data structure. Concurrently, it runs a separate thread that cycles through the bookmarked location to register pointers and to clean out dangling ones. The use of extra threads leverages idle CPU cycles in multicore systems to speed up dangling pointer invalidation. However, it uses more computing power overall due to extra worker threads.</p><p>The instrumentation strategy has remained the same across the four previous works, so none of them are able to track stack pointers promoted to registers. We studied a number of bugs in OSS-Fuzz database and found local variables and function arguments are the sources of more than half of the use-after-free bugs. Previous works cannot deal with such cases, so we target these false negatives in this paper.</p><p>Secure allocators. To ensure temporal safety, some prior work focuses on creating a more sophisticated allocator that can detects access to freed memory. Dhurjati and Adve <ref type="bibr">[11]</ref> put each allocation on a different virtual page, so that reads and writes to released pages raise an error. DieHarder <ref type="bibr">[22]</ref> marks freed chunks of memory and suppresses the reuse of those memory blocks so that dereferencing dangling pointers is likely to cause an exception. Cling <ref type="bibr">[3]</ref> is an allocator that introduces type-safe memory reuse that prevents corruption of control addresses by inferring object types from the allocation stack trace, so that dangling pointers will end up pointing to the same type of object even if the allocation is reused.</p><p>Address-based checking. In this approach, metadata of pointers and memory addresses are stored and checked at dereference time. Previous projects in the category include CETS <ref type="bibr">[12]</ref> and SafeC <ref type="bibr">[4]</ref>. CETS is another compiler-based use-after-free detection. It also supports checking for local variables. CETS associates pointers with the root object by pointing them to a key that indicates the validity of the memory object. When memory objects are released, these keys are invalidated. Access of the dangling pointer would raise an alert during a key check. When combined with Soft-Bound <ref type="bibr">[19]</ref>, it can achieve both spatial and temporal safety. As von der Kouwe et al. <ref type="bibr">[26]</ref> discussed, however, this approach has more run-time overhead and compatibility issues than dangling pointer invalidation approach. Lee et al. <ref type="bibr">[16]</ref> also pointed that it has high false positive rate, raising false alarms in 5 of 16 tested programs.</p><p>Safe C <ref type="bibr">[4]</ref> is source-level instrumentation that adds extra metadata to raw pointers, including the actual raw pointer, size, and the memory section it points to. It is able to perform bounds checking and check temporal safety using these attributes. However, sourcelevel instrumentation can reduce the opportunities the compiler has to optimize the program, and has a significant time overhead.</p><p>Debugging tools for use-after-free. The most widely used tools for detecting memory errors are Valgrind <ref type="bibr">[21]</ref> and Address Sanitizer <ref type="bibr">[25]</ref>. They also detect at the time of pointer dereference. They are generally comprehensive, but come with high performance overhead. Valgrind is built on a dynamic binary translation (DBT) framework. It translates and instruments binary code one block at a time. However, at the machine language level, the type of a memory location or register is ambiguous. Although we know a referenced register value has to be a pointer, pointers may also be used in arithmetic. Thus, pointer checking can often require a search through all of program memory, which is inefficient. Another possible approach using DBT would be to use taint analysis to track all the pointers. However, we argue that this approach has an extra overhead from taint analysis, and is unsuitable for runtime protection.</p><p>Address Sanitizer <ref type="bibr">[25]</ref> is a very effective tool for detecting memoryrelated errors. The essence of Address Sanitizer is using shadow memory to mark the validity of every address. It postpones the reuse of freed memory, so that use of a dangling pointer produces access to memory marked invalid. Address Sanitizer overloads the allocator functions and adds padding to memory blocks so that access to the padding also raises alerts. Address Sanitizer shows the effectiveness of shadow memory, but it cannot prevent sophisticated attacks which force the reuse of memory. Moreover, the run-time and memory overhead make it unsuitable for online protection use. Hardware Assisted Address Sanitizer hwasan is an optimization for memory usage on AArch64 and SPARC architectures. With a low false-negative rate (99.61%), it probabilistically detects UAF. Furthermore, its optimization makes dynamic memory layout easier to predict, and thus the system is more exploitable if probabilistic checking misses a dereference.</p><p>Garbage collection. MarkUs <ref type="bibr">[2]</ref> introduces many optimization techniques on top of garbage collection to tailor it for use-after-free mitigation. During garbage sweeping, it scans memory for potential pointers. Dynamic memory blocks are freed only when there is no potential pointer referencing them. It is incapable of detecting use-after-free by design and works only as a mitigation technique. Unable to distinguish pointer types from raw data, conservative garbage collection adds extra attack vector for memory exhaustion where an attacker can put random heap addresses in controlled buffers. Li and Tang <ref type="bibr">[17]</ref> launched such attack against MemGC <ref type="bibr">[1,</ref><ref type="bibr">27]</ref> to exploit Microsoft Edge.</p><p>Dangling pointer detection. Undangle <ref type="bibr">[7]</ref> focuses on dangling pointer detection that may result in use-after-free vulnerabilities. Undangle works offline by processing the execution trace and allocation log of the program. It applies taint analysis to track pointer propagation during execution. Taint analysis significantly increases the time overhead because every instruction must be instrumented to propagate taint information, meaning that this approach can only be used offline. We can provide similar functionality online by recording and reporting invalidated pointers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">CONCLUSION</head><p>In conclusion, we improve on prior work on temporal safety analysis. Focusing on dangling pointer invalidation, we support tracking of a broader range of pointers including local variables and function arguments. Through a review of 19 real-world UAF bugs from the OSS-Fuzz database, we found that 10 are ultimately caused by dangling pointers stored in local variables and function arguments, indicating that existing systems have significant gaps in coverage.</p><p>To close this gap, we introduced a novel approach that forces such pointers to be stored on the stack, allowing them to be tracked. However, tracking all such pointer variables and arguments can introduce unacceptably high overheads, so we applied a static analysis that identifies pointers that can never be involved in use-after-free bugs and excludes them from instrumentation.</p><p>HeapExpo successfully closes an important gap in detection of dangling pointers in C/C++ programs, and does so with reasonable additional overhead compared to prior work.</p><p>To aid in future research, we choose to open source our prototype HeapExpo at <ref type="url">https://github.com/messlabnyu/heap-expo</ref>.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>We share another limitation with DangSan, where we do not track pointers that are copied in type-unsafe ways. For example, memcpy and memmove in glibc are not tracked because instrumenting them can result in non-trivial overhead. Tracking these functions is achievable with some overhead by using an implementation that also records outwards pointers. Although our prototype does not implement this, we think the type information could be recovered more easily with a pass closer to the front-end where all type information remains.We also note that the fact that LLVM often optimizes the memcpy function to trackable instructions like store makes this false negative less significant. In Listing 9, LLVM interprets line 11 as a memcpy call which copies data from a local to a global, so global.data is not registered at line 11. Thus, when the memory is released at line 15, global.data is not invalidated by our instrumentation. The use of global.data at line 16 does not raise an alert. However, LLVM will transform the memcpy call to two simpler instructions shown in lines 13-14, where line 13 is instrumented with a call to regptr in line 12. This occurs because we have a relatively small structure, so the cost for calling memcpy exceeds the two simple instructions. The idea is that one can customize the memcpy optimization to recover more pointer propagation. We did not implement this because of the limited number of bugs in this case, as von der Kouwe et al. discuss<ref type="bibr">[26]</ref>.</p></note>
		</body>
		</text>
</TEI>
