<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Improving indirect-call analysis in LLVM with type and data-flow co-analysis</title></titleStmt>
			<publicationStmt>
				<publisher>USENIX</publisher>
				<date>08/11/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10618374</idno>
					<idno type="doi"></idno>
					
					<author>D Liu</author><author>Shouling Ji</author><author>K Lu</author><author>Q He</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Indirect function calls are widely used in building system software like OS kernels for their high flexibility and performance. Statically resolving indirect-call targets has been known to be a hard problem, which is a fundamental requirement for various program analysis and protection tasks. The state-of-the-art techniques, which use type analysis, are still imprecise. In this paper, we present a new approach, TFA, that precisely identifies indirect-call targets. The intuition behind TFA is that type-based analysis and data-flow analysis are inherently complementary in resolving indirect-call targets. TFA incorporates a co-analysis system that makes the best use of both type information and data-flow information. The co-analysis keeps refining the global call graph iteratively, allowing us to achieve an optimal indirect call analysis. We have implemented TFA in LLVM and evaluated it against five famous large-scale programs. The experimental results show that TFA eliminates additional 24% to 59% of indirect-call targets compared with the state-of-the-art approaches, without introducing new false negatives. With the precise indirect-call analysis, we further developed a strengthened fine-grained forward-edge control-flow integrity scheme and applied it to the Linux kernel. We have also used the refined indirect-call analysis results in bug detection, where we found 8 deep bugs in the Linux kernel. As a generic technique, the precise indirect-call analysis of TFA can also benefit other applications such as compiler optimization and software debloating.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>One of the powerful features of advanced programming languages like C/C++ is the function pointer, which allows developers to dynamically invoke different functions depending on the runtime context. Function pointers are widely used in modern software such as OS kernels to implement flexible Qinming He and Shouling Ji are the co-corresponding authors. and efficient functionalities. The invocation site of a function pointer is called an indirect-call (icall for short in this paper). However, the dynamic nature of icall makes it challenging to determine its possible targets. This challenge hinders the construction of a precise Control-Flow Graph (CFG), which is essential for many significant program analysis and protection tasks (e.g., bug detection, program optimization, and control-flow integrity).</p><p>Researchers have been actively proposing techniques to identify icall targets. There are two main categories of techniques for this problem. The first category is dynamic techniques, which record the icall targets at runtime and update the control flow graph accordingly <ref type="bibr">[7,</ref><ref type="bibr">23,</ref><ref type="bibr">27,</ref><ref type="bibr">51,</ref><ref type="bibr">53]</ref>. However, it has limitations in terms of soundness and performance, as it depends on the code coverage and the runtime support. Therefore, we focus on the second category: static techniques, which compute the possible icall targets without executing the program. Some static techniques rely on whole program pointer analysis to identify icall targets <ref type="bibr">[20,</ref><ref type="bibr">24,</ref><ref type="bibr">47,</ref><ref type="bibr">54]</ref>, but this approach faces challenges in balancing efficiency, soundness, and effectiveness <ref type="bibr">[29,</ref><ref type="bibr">42]</ref>. In particular, precise interprocedural pointer analysis is often infeasible for analyzing large programs such as OS kernels (e.g., Andersen <ref type="bibr">[21]</ref>). Moreover, the pointer analysis itself presupposes a global control flow graph, which implies knowing the icall targets beforehand.</p><p>As a result, most existing practical solutions rely on type analysis for identifying icall targets <ref type="bibr">[40,</ref><ref type="bibr">50,</ref><ref type="bibr">58,</ref><ref type="bibr">60,</ref><ref type="bibr">63,</ref><ref type="bibr">64]</ref>. Specifically, they consider all address-taken functions (i.e., functions whose addresses have been referenced) that have the same function type as the icall as possible targets. This is known as type matching or signature matching. This approach is common in CFI schemes [10, <ref type="bibr">48,</ref><ref type="bibr">50]</ref> and static program analysis <ref type="bibr">[28,</ref><ref type="bibr">60,</ref><ref type="bibr">63]</ref>. However, it suffers from high false positives, as many unrelated functions can also match the function type of the icall, especially for simple types (e.g., void (*)(int)). To address this limitation, two-layer type matching <ref type="bibr">[37,</ref><ref type="bibr">43,</ref><ref type="bibr">68]</ref> is proposed based on the observation that most icalls in large programs are stored and used in com-posite types like structures. By using the outer-layer struct type information as an additional filter, it can eliminate many invalid icall targets. TypeDive <ref type="bibr">[42]</ref> extends this idea and proposes multi-layer type matching (MLTA), where it tracks the outer-layer struct types recursively (i.e., one struct type is stored into another struct type) to obtain richer type information for icall analysis.</p><p>The state-of-the-art method, MLTA, has achieved great success in precisely identifying icall targets, and has been adopted in many static analysis tasks <ref type="bibr">[22,</ref><ref type="bibr">32,</ref><ref type="bibr">39,</ref><ref type="bibr">62,</ref><ref type="bibr">69]</ref>. However, it still suffers from several problems. <ref type="bibr">(1)</ref> It is ineffective for programs that rarely use composite types. For instance, we found that only 25% of icalls in the OpenSSL library involve composite types, and thus most icalls in OpenSSL cannot benefit from MLTA. <ref type="bibr">(2)</ref> Even for large programs that extensively use composite types, the impact of icalls that cannot benefit from MLTA (i.e., icalls that do not involve composite types) is highly significant. Our experimental results indicate that the minority (23%) icalls that do not involve composite types account for the majority of icall targets (77%) in the Linux kernel.</p><p>One fundamental problem of existing approaches is that they do not fully utilize type and data-flow information. In this paper, we present a new approach, TFA (Type and Flow co-Analysis), for precise and sound indirect-call resolution. The basic idea of TFA is to leverage the complementary advantages of type analysis and data-flow analysis to resolve indirectcall targets optimally. Specifically, for icalls with rich type information, type analysis (e.g., MLTA) could provide us an initial CFG, which enables the inter-procedural data-flow analysis. For icalls with limited or general type information (i.e., MLTA is ineffective), data-flow analysis kicks in to help eliminate unrelated targets. Given an initial CFG generated by MLTA, TFA iteratively improves it through type and dataflow co-analysis until convergence is reached. The refined CFG will make the subsequent analysis round more precise and more manageable because it narrows the scope of interprocedural analysis.</p><p>While the intuition of TFA is straightforward, implementing TFA needs to overcome two challenges. <ref type="bibr">(1)</ref> The type information used for icall analysis is not always accurate. Many type-based icall analysis methods <ref type="bibr">[37,</ref><ref type="bibr">42,</ref><ref type="bibr">43]</ref> are implemented based on LLVM <ref type="bibr">[13]</ref>, a popular static analysis framework. However, we discovered a series of LLVM composite types that lack essential information ( &#167;2.2). We call them broken types in this paper. Broken types affect the precision and soundness of type-based analysis and introduce both false positives and false negatives. (2) Directly performing inter-procedural data-flow analysis to track the data-flow relations between icalls and their targets is costly, especially for large programs like OS kernels. Furthermore, TFA needs to repeat the analysis iteratively, which poses a challenge to the efficiency of TFA.</p><p>To address the first challenge, we propose a type recov-ery technique that leverages the source code information to identify and reconstruct the missing or broken types. Our technique could enhance not only our icall analysis, but also general type-based analysis in LLVM. To address the second challenge, we present a two-dimensional data-flow analysis method customized for icall analysis. It captures both the data-flow dependencies between icalls and their targets, and the type information that is hidden behind complex data-flows.</p><p>The rich type information obtained by our data-flow analysis significantly improves the existing type analysis methods.</p><p>We have implemented TFA based on LLVM and evaluated TFA with five representative C/C++ programs: the Linux kernel, the FreeBSD kernel, the OpenSSL library, the OpenCV library, and the MongoDB database. TFA could analyze the complex Linux kernel in less than two hours. The experimental results show that TFA further eliminates 24% to 59% icall targets compared to the state-of-the-art multi-layer type analysis method. The evaluation also indicates that the type recovery method is effective in improving the soundness of type analysis. The data-flow analysis of TFA does not introduce any additional false negatives. We demonstrate the practical benefits of TFA in two applications: CFI and bug detection. We have implemented a prototype of a forward-edge CFI scheme leveraging TFA for the Linux kernel, where each icall's valid target set is computed according to its unique semantic. We also use TFA to detect bugs related to the misuse of device release call-back functions in the Linux kernel. We confirm 6 double-free bugs and 2 memory leak bugs that are hidden behind indirect-calls and would be missed by existing static bug detection tools. In summary, we make the following contributions:</p><p>&#8226; A new approach for icall target analysis. We propose a new approach, TFA, to precisely identify icall targets. TFA integrates the type analysis and data-flow analysis in such a way that they complement each other to compute icall targets optimally. The approach is designed to be practical for large programs, both kernel and user levels. We will open source TFA to facilitate further researches.</p><p>&#8226; New techniques. We propose multiple techniques to address challenges in implementing TFA. We propose a type recovery technique to fix the broken types in LLVM. We also design a customized two-dimensional data-flow analysis for icall target resolving, which not only precisely finds icall targets for cases where type analysis fails, but also identifies richer type information.</p><p>&#8226; Extensive evaluation and feasible applications. We extensively evaluate TFA's efficiency, effectiveness, and soundness on five large-scale programs. Compared with the stateof-the-art methods, TFA further eliminates 24% to 59% icall targets without introducing more false negatives. The evaluation results confirm that TFA's icall analysis is precise, scalable, and sound. We then implement a strengthened</p><p>1 typedef void (*f_ptr)(int a, int b); 2 struct S {f_ptr field1; f_ptr field2}; forward-edge CFI scheme based on our precise icall analysis results and apply it to the complex Linux kernel. We also apply TFA in bug detection and found 8 new bugs in the Linux kernel.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Background and Motivation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Type-Based Indirect-Call Analysis</head><p>We present an example of an icall in Figure <ref type="figure">1</ref> to compare different type analysis methods. This icall occurs at line 16 and has a function type defined at line 1. A one-layer type matching method would consider any address-taken function with the same function type as the icall as a possible target. Thus, it would include all four functions defined from line 4 to line 7 as valid targets. A two-layer type matching method would also examine if an address-taken function is assigned to a field of a structure, which is a common pattern in large software such as OS kernels. In this example, the icall is retrieved from the first field (.field1) of struct S. Only address_taken_func1() and address_taken_func3() are used to initialize this field. Therefore, two-layer type matching identifies these two functions as valid targets of the icall, which eliminates 50% of false positives compared to one-layer type matching.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Problems with the Type Analysis in LLVM</head><p>LLVM is one of the most widely used static analysis frameworks, which is essential for implementing various type-based icall analysis techniques. However, LLVM suffers from two issues that compromise the soundness of type analysis. In this paper, we call these issues broken types.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1">Unreliable Type Equality Checking</head><p>One essential requirement of the aforementioned type-based analysis is to determine the equality of different types, especially for struct types. However, even such a fundamental</p><p>1 /* source code */ 2 struct A { 3 int i; 4 int (*f)(int, struct A*); 5 int (*g)(char, struct A*); 6 }; 1 /* source code */ 2 struct dvb_usb_adapter_properties adapter[2]; 3 4 /*Expected LLVM type of variable adapter */ 5 [2 x %struct.dvb_usb_adapter_properties] 6 7 /*Actual LLVM type of variable adapter */ 8 &lt;{{i8, i8, i32 (%struct.dvb_usb_adapter*, i32)*, 9 i32 (%struct.dvb_usb_adapter*, i32, i16, i32)*, {i8, i8, i8, 10 {%struct.anon.163, [8 x i8]}}}, %struct.dvb_usb_adapter_properties}&gt; task is hard to accomplish in LLVM intermediate representations (IRs). The default type-pointer comparison is only applicable for types within the same LLVM module. Some previous works <ref type="bibr">[42,</ref><ref type="bibr">43]</ref> rely on the hashed type strings to perform cross-module type comparison, which is effective for primitive types (e.g., integers). However, this approach is unsound in comparing struct types due to the following anomalies in LLVM type representations:</p><p>&#8226; Omitting function pointer fields. In some cases, the function pointer field of a structure will be replaced as an empty pointer ({}*) rather than its point-to type, as shown in Figure <ref type="figure">2</ref>. This issue has been reported and discussed in several different technical forums <ref type="bibr">[2]</ref><ref type="bibr">[3]</ref><ref type="bibr">[4]</ref><ref type="bibr">[5]</ref>, but it still persists in LLVM 15. Some developers suggest this is a known bug in Clang <ref type="bibr">[1,</ref><ref type="bibr">2]</ref>.</p><p>&#8226; Missing struct names. A common programming practice is to initialize a global variable at the point of its declaration (e.g., line 9 and line 11 in Figure <ref type="figure">1</ref>). In our analysis on the Linux kernel, we found that 17.8% of global struct variables are declared without specifying their struct type names in LLVM IRs. This poses a challenge for the twolayer type matching algorithm, as we cannot determine that address_taken_func1() is used to initialize a field of struct S in Figure <ref type="figure">1</ref>, which occurs for 13.5% of outer-layer struct types in the Linux kernel.</p><p>&#8226; Type unfolding. One challenge in analyzing the content of a struct's fields is that they may be partially unfolded, as shown in Figure <ref type="figure">3</ref>. Sometimes, an i64 type could be split into two i32 types. Based on our manual analysis, this scenario is likely to occur when a structure has a union field.</p><p>These type representations make type comparisons based on type strings unreliable. LLVM provides a built-in type comparison method <ref type="bibr">[9]</ref>, which recursively verifies every struct field of the two struct types. LLVM linker also employs another type comparison method based on isomorphism checking <ref type="bibr">[8]</ref>. Denisov et al. proposed a tree-automata-based type comparison method for LLVM <ref type="bibr">[6]</ref>. However, none of these methods can guarantee the soundness of type comparison.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2">Type Information Omission in Optimized Code</head><p>One of the key tasks in type-based analysis is to determine the type information of different layers (i.e., identify which field of a structure a variable belongs to). In LLVM IRs, this is done by using the getelementptr instructions, which are used to access struct fields. For instance, given a GEP instruction like %ptr = getelementptr inbounds %struct.S* %0, i64 0, i32 5, we can infer that the pointer %ptr originates from the fifth field of struct S.</p><p>Previous works on icall analysis have focused on programs compiled with no optimization (O0) <ref type="bibr">[42,</ref><ref type="bibr">43]</ref>, and the impact of code optimization on type analysis has not been well studied. We discovered that many GEP instructions in optimized code (e.g., O2 optimization level), which is often preferred by static analysis techniques for its advantages on data-flow analysis <ref type="bibr">[52,</ref><ref type="bibr">60,</ref><ref type="bibr">69]</ref>, would lose crucial type information. For example, the GEP instruction above would be transformed into %ptr = getelementptr inbounds i8, i8* %0, i64 28 after optimization. The struct pointer type is replaced by a primitive pointer type i8*. The field index is replaced by a byte offset based on the primitive pointer. In this scenario, existing methods are unable to retrieve the nested struct types or fields.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.3">Causes of Broken Types</head><p>We argue that the above abnormal cases should not be regarded as mere bugs, since they do not cause any observable runtime errors in our experiments. The optimization that eliminates type information in &#167;2.2.2 is justified, as it simplifies the address calculation and improves the runtime performance. Previous work also shows that types in LLVM IRs are not always consistent with the source code (i.e., cases in &#167;2.2.1) for the sake of optimal machine code generation <ref type="bibr">[36]</ref>. However, this poses a challenge for researchers who want to perform type-based analysis on LLVM IRs. Therefore, instead of modifying LLVM's type system directly, we propose a program analysis oriented type recovery approach to address the issues caused by broken types.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Type Recovery and Co-Analysis</head><p>As discussed in &#167;2.2, broken types can exhibit various patterns, but they all share a common problem: the lack of type information. To address this challenge, we present a novel type recovery technique ( &#167;4.1) that can infer the missing type information for broken types, enabling more sound and precise type-based analyses (e.g., icall analysis).</p><p>The icall in Figure <ref type="figure">1</ref> has only one possible target: address_taken_func1(), which cannot be determined by type analysis alone. However, if we can leverage additional data-flow analysis to identify that the icall only involves variable s1, we can reduce the set of potential targets to address_taken_func1() and address_taken_func2(). By combining the results of two-layer type matching and dataflow analysis, we can achieve the exact ground-truth result in this example. This observation inspires us to design and implement TFA.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Overview</head><p>The goal of TFA is to resolve icall targets soundly and precisely. The key idea of TFA is to combine multi-layer type analysis and data-flow analysis in an iterative manner to refine the control-flow graph (CFG) progressively. The overview of TFA is shown in Figure <ref type="figure">4</ref>, which consists of two analysis phases.</p><p>In the first phase, TFA takes the LLVM IR files as inputs and performs type recovery on them. TFA scans all struct variables and GEP instructions of the program in the LLVM IRs to detect and fix broken types. Specifically, TFA leverages the source code information and recovers the missing type information layer by layer. Based on the recovered type information, TFA then conducts multi-layer type analysis to generate an initial CFG of the program, which enables interprocedural data-flow analysis in the next phase. TFA also records the detailed analysis results of each icall in this phase to guide the subsequent data-flow analysis.</p><p>In the second phase, TFA refines the icall targets iteratively using a two-dimensional data-flow analysis. This analysis establishes the data-flow relationships between the icalls and their possible targets, and also extracts layered type information from inter-procedural context. These two analysis flows enhance the icall analysis directly and indirectly, respectively.</p><p>The icall target analysis produces a refined CFG in each round, and stops when the CFGs of two consecutive rounds are sufficiently similar. The final CFG is then stored in a database to facilitate further static analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">System Design</head><p>In this section, we first introduce our type recovery approach, which aims to improve existing type analysis. Next, we present the customized alias analysis method we used. Finally, we introduce the two-dimensional data-flow analysis of TFA, which cooperates with the type analysis on refining icall targets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Env Preparation Type Recovery</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Mlti-layer Type Analysis</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Indirect-call analysis results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Icall Analysis Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Mlti-layer Type Info</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Global Call Graph LLVM IRs</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Indirect-Call Target Analysis Two-Dimentional Data-Flow Analysis</head><p>-Icall Data-flow Relation Analysis -Icall-Oriented Type Mining Mlti-layer Type Analysis Converging? N Y Broken Type Checking Address-taken funcs Indirect-calls Outer-layer types Undecidable pointers Indirect-calls Address-taken funcs CFI Bug detection ... Figure 4: The overview of TFA. It has a preparation phase and an iterative icall analysis phase. It takes the LLVM IRs as inputs and outputs the icall analysis results to a local database. Broken Struct Types Optimized GEP Instructions Iterative Struct Field Recovery Struct Field Index Recovery Recovered Type Storage Type Name Recovery </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Type Recovery</head><p>Type recovery aims to recover the missing struct type information to support precise and sound type analysis. The overview of type recovery is shown in Figure <ref type="figure">5</ref>. It takes two types of inputs that have incomplete or distorted type information: (1) the broken struct types (i.e., cases in &#167;2.2.1), and (2) the optimized LLVM GEP instructions (i.e., cases in &#167;2.2.2). For the broken struct types, TFA will reconstruct the erased struct type names and fields. For the optimized GEP instructions, TFA will infer the struct types and field indices from the optimized types (i.e., i8*) and byte offsets.</p><p>For both types of inputs, the first step of type recovery is to recover the corresponding struct type names. For the broken struct types, we then recursively examine and restore their possible broken field types. For the optimized GEP instructions, we additionally recover the field indices that the instructions originally accessed. TFA maintains a global map to store the recovered types for facilitating subsequent type-based static analysis in LLVM.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.1">Type Name Recovery</head><p>We present a case study from the Linux kernel to demonstrate the process of type name recovery, as shown in Figure <ref type="figure">6</ref>. This case involves a global variable omap_rtc_driver with a struct type named platform_driver. Ideally, its representation in LLVM IR should include three essential elements: struct name, struct definition, and initializer. However, its struct name is absent. We examined the entire Linux kernel and discovered 64,457 global struct variables (17.8%) that lacked their struct type names in the definition.</p><p>Our approach is to leverage the source code information (stored in the LLVM metadata nodes) to recover the missing struct name. For instance, in Figure <ref type="figure">6</ref>, by tracking the metadata layer by layer (!4378&#8594;!4379&#8594;!4380), we can identify the type name of omap_rtc_driver and construct a complete struct type in LLVM IRs.</p><p>For an optimized LLVM GEP instruction, we first obtain the pointer operand of this instruction (e.g., i8* %0 in the example of &#167;2.2.2). We then inspect the use-chain of this value to locate the llvm.dbg.value instruction, which records the corresponding source code information. Finally, we extract the metadata node from this instruction and recover the type name of this value (Struct.S in the example of &#167;2.2.2) using the same process as in Figure <ref type="figure">6</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.2">Iterative Struct Field Recovery</head><p>Struct types that contain fields of other struct types are frequently used in programming. One challenge of the type recovery is that the broken struct types could propagate through nested struct fields, and we need to recover all possible broken field types accurately. For instance, if the name of a struct type is missing, its nested struct fields are also likely to have missing names. Usually, the incorrect types are mixed with correct types, which complicates the recovery process.</p><p>To address this problem, we propose the following recovery methods for a given broken struct type. We first recover the missing name with the workflow in &#167;4.1.1. We then search the type definition list of the current module and find the intact struct definition (i.e., identified struct type) of the broken type based on the recovered name. Finally, we iteratively match all fields of the broken and intact struct types to recover all potential broken field types. In the matching procedural, we apply three additional strategies to ensure soundness: <ref type="bibr">(1)</ref> We recursively unfold all composite types into primitive types to enable sound and effective type comparison. <ref type="bibr">(2)</ref> We split all integer types into multiple i8 types according to their bitwidth. (3) We conservatively allow void pointer type to cast to any function pointer type.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.3">Struct Field Index Recovery</head><p>For an optimized GEP instruction, we need to additionally recover the exact struct field index that the GEP instruction intends to access. Since we have recovered the original struct type names, we can use the LLVM API <ref type="foot">1</ref> to obtain the field index by computing the byte offset. However, this approach is not always reliable. We manually examined the corresponding LLVM IRs and discovered that the code optimization pipline could merge multiple GEP instructions. For instance, given a C code snippet s = a-&gt;b-&gt;c, LLVM would normally generate two GEP instructions to get the address of variable s: one GEP for computing the address of b from a, and another for computing the address of c from b. However, these two GEP instructions could be combined into one during the code optimization, which sums up the two address offsets and loses the information of struct b. Consequently, existing methods fail to analyze the struct accessing behaviors correctly.</p><p>To address this problem, we propose an algorithm to identify the types and indices for the merged fields, as illustrated in Algorithm 1. The basic idea of our algorithm is to recursively compute the address offset layer by layer until it matches the given offset. In particular, when we encounter a struct field (line 5), we find the nearest field index according to the current offset (line 6) and compute the delta between this field's offset and the given offset (line 7-8). If the delta is non-zero, then the GEP must access a subfield of the current struct field, and we use the delta as the new offset to locate the subfield. When we encounter a struct array field (line 12-14), we simply descend into the element struct type and use the remainder of the given offset and the element type size to locate the subfield (line 15-17). indexList: The index list of the fields in ST List</p><p>f set; 9 ST List.PUSH_BACK(CST ); 10 indexList.PUSH_BACK(index); 11 CST &#8592; CST.GETELEMENTTYPE(index); 12 else if CST is Array Type then 13 eleTy &#8592; GETELEMENTTYPE(CST ); 14 if eleTy is Struct Type then 15 type_size &#8592; GETTYPESIZE(eleTy); 16 CST &#8592; eleTy; 17 o f f set &#8592; o f f set mod type_size; 18 end 19 else 20 return NULL, NULL; 21 end 22 return ST List, indexList;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Alias Analysis</head><p>To perform data-flow analysis of TFA, we need to track alias relationships among program variables. We adopt a data structure called alias graph <ref type="bibr">[35,</ref><ref type="bibr">38]</ref> to store these relationships. The alias analysis method is inter-procedural, flow-and fieldsensitive, and it achieves a good trade-off between efficiency and precision for large-scale programs. In this paper, we enhance the soundness of the alias analysis for icall analysis scenario by making the following modifications: Flow-insensitivity. Flow-sensitivity is a property of alias analysis that takes control-flow information into account. For our analysis, we require both forward and backward data-flow exploring, which makes the flow-sensitivity irrelevant. Therefore, we use a flow-insensitive approach that simplifies the computation and improves the soundness of our analysis. Field-insensitivity. Field-sensitivity is the ability to distinguish different fields of the same composite object. We do not support field-sensitivity in alias analysis for two reasons:</p><p>(1) Our analysis result is the intersection of data-flow analysis and multi-layer type analysis, which already provides fieldsensitivity. <ref type="bibr">(2)</ref> The Linux kernel has a mechanism, called container_of, to obtain the starting address of a struct value from its field value. This mechanism introduces soundness issue <ref type="bibr">[41]</ref> because it prevents us from getting the correct outer-layer struct type (which is missing) or the correct index of the current field (which is negative). Supporting May-Alias Query. Current design of the alias graph based alias analysis mainly supports must-alias query <ref type="bibr">[35]</ref>, which will introduce soundness issues in icall analysis. Therefore, we re-design the graph updating algorithms to support may-alias query.</p><p>We have reimplemented the alias analysis according to the above requirements. Our analysis is integrated with a dataflow analysis framework that performs alias and data-flow analysis simultaneously. Whenever a new instruction is processed by the data-flow analysis, the alias analysis updates the global alias graph accordingly. We provide more details about the alias graph structure and the graph updating algorithm in &#167;A.1 in the Appendix.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Two-Dimensional Data-Flow Analysis</head><p>Previous work has tried to utilize data-flow analysis or pointto analysis to resolve icall targets <ref type="bibr">[30,</ref><ref type="bibr">49,</ref><ref type="bibr">55,</ref><ref type="bibr">61]</ref>. These techniques attempt to establish the data-flow or alias relationships between icalls and address-taken functions directly. However, for large-scale programs (e.g., OS kernels), these relationships can be very complex and intricate, which leads to inefficiency, imprecision, and unsoundness of existing dataflow based analysis methods <ref type="bibr">[29,</ref><ref type="bibr">42]</ref>.</p><p>Our key observation is that existing multi-layer type analysis can effectively reduce the false positives of icall analysis by exploiting the rich layered struct type information, but such information is often obscured by complex data-flows. Therefore, improving icall analysis does not necessarily require a complete data-flow analysis between icalls and their targets, but a lightweight type information mining. Based on this insight, we present a two-dimensional data-flow analysis to improve icall analysis. Specifically, it aims to analyze not only the direct data-flow relationships between icalls and their targets, but also icall-related layered struct type information across inter-procedural contexts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.1">Icall Data-Flow Relation Analysis</head><p>Bidirectional Data-Flow Analysis. To obtain rich data-flow information, TFA analyzes the data-flows of icall traces interprocedurally from two directions, i.e., bidirectional data-flow analysis. (1) A backward data-flow analysis that starts from an icall and identifies the possible address-taken functions that can be its targets. After completing the backward analysis, TFA calculates the intersection of the alias set of the icall and the result of multi-layer type analysis to obtain the final icall targets. (2) A forward data-flow analysis that starts from an address-taken function and determines which icalls can have it as one of their targets. For icalls that are not in the alias set of the address-taken function, TFA can safely exclude the address-taken function from their target sets.</p><p>Fall Back Strategies. Occasionally, data-flow analysis may confront cases that compromise analysis soundness and result in false negatives. To cope with this problem, we have introduced several early termination conditions designed to preserve soundness while balancing effectiveness and efficiency. The data-flow analysis halts and the results are conservatively reverted to type analysis upon triggering any of the specified conditions:</p><p>&#8226; The number of the aliased value reaches the threshold.</p><p>TFA allows users to specify a maximum number (500 by default) of values that are aliased with the input value (e.g., an icall pointer). TFA terminates the data-flow analysis when the number of aliased values reaches the limit and falls back to type analysis.</p><p>&#8226; The input value is aliased with assembly code. Assembly code is currently out of the scope of our analysis. Therefore, when the input value is aliased with assembly calls, TFA terminates the data-flow analysis and falls back to type analysis. In addition, TFA also stops the data-flow analysis when the subgraph includes global variables named llvm.compiler.used (llvm.used in LLVM 14 and earlier versions). Usually, this could happen when the program initializes and updates global variables through assembly code (e.g., Linux static calls).</p><p>&#8226; The input value is aliased with arithmetic operated values. A pointer value may be modified by arithmetic operations in practice. Static analysis of its alias set is still an open challenge, and the soundness of the results is hard to guarantee. Therefore, TFA terminates the data-flow exploration in this case and falls back to type analysis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.2">Icall-Oriented Type Mining</head><p>The state-of-the-art multi-layer type analysis achieves more precise icall analysis by utilizing richer layered struct type information. However, in large programs (e.g., OS kernels), the storage relationships between structural types and icalls could be complex (e.g., involves inter-procedural data-flows). As a result, existing methods fail to resolve these cases, which limits the effectiveness of MLTA. TFA aims to resolve this problem through data-flow based type mining. However, type mining can be costly and impractical for programs with a large number of structural types. To address this problem, we present icall-oriented type mining, a technique that focuses on tracking and extracting type information that is relevant to indirect call analysis from two perspectives. Type Mining for Icalls. We found that many indirect function calls with nested type information are not handled correctly in multi-layer type analysis. A common case in the Linux kernel is illustrated in Figure <ref type="figure">7</ref>. Multi-layer type analysis assumes that the indirect call at line 4 lacks any enclosing struct type information and resorts to one-layer type analysis.</p><p>1 /* sound/core/pcm_lib.c */ 2 static int interleaved_copy(..., pcm_transfer_f transfer) { 3 ... 4 return transfer(...); //No outer-layer type in MLTA 5 } 6 7 snd_pcm_sframes_t __snd_pcm_lib_xfer(...) { 8 ... 9 pcm_copy_f writer; 10 pcm_transfer_f transfer; However, there is a caller of interleaved_copy() at line 16, which passes a function pointer transfer as an argument. The origin of transfer reveals that it does have an enclosing type substream-&gt;ops-&gt;copy_kernel at line 16.</p><p>To make these icalls benefit from multi-layer type analysis, we perform a backward data-flow analysis starting from them and track their origins. TFA conducts an inter-procedural analysis to determine if an icall is derived from a struct field, which indicates aliasing with a struct field. Unlike the previous data-flow analysis, this phase terminates early when an aliased value with nested type information (i.e., GEP instructions in LLVM IRs) is encountered during data-flow propagation. Type Mining for Undecidable Targets. Multi-layer type analysis requires recording the confined targets for a struct type. However, this could be infeasible when a struct field has an undecidable target (i.e., type-escaping <ref type="bibr">[42]</ref>). Figure <ref type="figure">8</ref> illustrates a typical case in the Linux kernel. A function pointer is assigned to the detach field of the struct pointer ap at line 7, which has the type struct drm_aperture. Since the function pointer originates from the argument (detach at line 3), its potential identities cannot be determined by type analysis alone. Multi-layer type analysis maintains a global map to record the fields of structs that encounter type-escaping (e.g., drm_aperture-&gt;detach in this case). When an indirect call has an escaped outer-layer type (e.g., the indirect call at line 16), multi-layer type analysis will fall back to one-layer type matching conservatively to ensure soundness.</p><p>TFA addresses this problem through backward data-flow tracking. In this example, we choose the function pointer whose source is undecidable (i.e., detach at line 3) as the starting point of data-flow analysis. TFA precisely identifies the origin of the input pointer (drm_aperture_detach_firmware() at line <ref type="bibr">22)</ref>. This allows us to eliminate the escaped layered types (drm_aperture-&gt;detach) from the global escaping map. As a result, more icalls could benefit from MLTA. Mutual Enhancement. Our type mining approach for indirect calls has a remarkable feature: its two analysis components can mutually enhance each other in the iterative data-1 /* drivers/gpu/drm/drm_aperture.c */ 2 static int devm_aperture_acquire(struct drm_device *dev, ...</p><p>3 void (*detach)(struct drm_device *)) { 4 ... 5 struct drm_aperture *ap; 6 ...  flow analysis process. The type mining for indirect calls enables more indirect calls to benefit from the multi-level type analysis, but it also generates more undecidable targets. The type mining for undecidable targets effectively addresses this issue. This forms a positive feedback loop that improves the precision of indirect call resolution.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">TFA Implementation</head><p>We have implemented TFA on LLVM of version 15-init as several analysis passes, which contains 12k lines of C++ code (including 1.7k LoC of the implementation<ref type="foot">foot_1</ref> of TypeDive <ref type="bibr">[42]</ref> for multi-layer type analysis). It accepts unlinked LLVM bitcode files as inputs, and outputs the analysis results into a local MYSQL database.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">LLVM Bitcode Generation</head><p>We use an independent LLVM pass for bitcode generation, which uses LLVM API WriteBitcodeToFile() to dump bitcode files. We modify the Makefiles of our target programs to load this pass. We add -g flag into the CFLAGS of Makefiles to retain debug information for type recovery and icall information storage. Additionally, we have observed instances where the same struct type is compiled as a packed struct in certain modules and as a normal struct elsewhere, resulting in disparate struct layouts that affect the type recovery in TFA.</p><p>To resolve this inconsistency, we employ the Clang compilation option -mms-bitfields to ensure uniform struct layouts across different modules.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Type Comparison</head><p>In TFA, we use struct type names to compare different struct types. For struct types with the same name in different modules, we include the struct's definition file as part of the struct name. The missed type names would be recovered through type recovery. However, developers could define struct or union types without names in C/C++. LLVM will automatically generate type names for them (e.g., %union.anon.157, %struct.anon.8). The type recovery of TFA conservatively regards them as invalid because different LLVM modules could generate different names for the same nameless struct field. This affects the type analysis when a function pointer is stored in a nameless field. In such cases, we fall back to one-layer type matching, which means that any icall that has the same function type as the function pointer can potentially target it. Fortunately, this limitation only affects 270 (0.4%) address-taken functions in the Linux kernel.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">Multi-Layer Type Analysis</head><p>We adopt the implementation of existing state-of-the-art multilayer type analysis as a part of TFA with necessary modifications. Specifically, we use the implementation 3 of TypeDive <ref type="bibr">[42]</ref> with the following modifications: (1) We set the outer-layer number as two, so TypeDive executes twolayer type matching for icall analysis. Considering that we need to iteratively execute multi-layer type analysis, two-layer type matching is the most cost-effective type analysis for us.</p><p>(2) We rewrite its type recording system to apply our type recovery approach.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.4">Kernel Macro Optimization</head><p>The Linux kernel macro EXPORT_SYMBOL is frequently utilized to make functions from one kernel module available in other modules. Theoretically, it should not affect the analysis of icalls. However, the function pointer associated with this macro emits an LLVM IR pattern identical to that of assembly code, encompassing llvm.used for data-flow propagation and potentially causing premature termination of the analysis. To mitigate this issue, we meticulously examine the source code of each address-taken function by checking the corresponding LLVM metadata. Should a function be engaged with EXPORT_SYMBOL, TFA continues the data-flow propagation without interruption.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.5">Supporting C++</head><p>In C++ programs, virtual function calls constitute a significant proportion of icalls. According to our manual analysis, such icalls are often devoid of sufficient outer layer type information. Furthermore, the analysis of data-flows associated 3 <ref type="url">https://github.com/umnsec/mlta</ref> with these icalls tends to be resource-intensive. Consequently, existing approaches for type or data-flow analysis tend to be ineffective for such icalls. To address this challenge, we employ the strategy outlined in TypeDive <ref type="bibr">[42]</ref>. Specifically, we analyze the constructors of all C++ classes to catalog their virtual function tables (VTables). Upon the invocation of a virtual function call, we first identify the class issuing the call and subsequently pinpoint the precise callee function by referencing the corresponding VTable.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.6">Convergence Checking</head><p>Previous works <ref type="bibr">[30,</ref><ref type="bibr">42,</ref><ref type="bibr">68]</ref> evaluate the effectiveness of icall analysis based on the number of icall targets. In this paper, we adopt the same metric to compare the effectiveness of different analysis rounds of TFA. We consider the analysis to have reached convergence and terminate the analysis when the difference of icall targets between adjacent analysis rounds is less than 10,000.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Evaluation</head><p>We aim to answer the following research questions (RQs) in our evaluation.</p><p>&#8226; RQ1:Compared to existing methods, how effective and efficient is TFA in refining icall targets?</p><p>&#8226; RQ2: Could TFA guarantee the soundness after introducing data-flow analysis?</p><p>&#8226; RQ3: How do broken types affect icall analysis? Moreover, how effective is the type recovery in addressing this challenge?</p><p>&#8226; RQ4: Does TFA benefit real-world applications?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Experimental Setup</head><p>We evaluate TFA on top of a Linux server (Ubuntu 20.04.1) with 126GB RAM and an Intel Xeon Silver 4316 CPU at 2.30GHz (80 cores). We choose five widely used programs as the analysis targets: the Linux kernel (v5.18), the FreeBSD kernel (v12.4), the OpenSSL library (v3.0.6), the OpenCV library (v4.9.0), and MongoDB (v8.0.0). Additionally, for the Linux kernel, we built it with two different configurations: the allyesconfig to evaluate the scalability of TFA, and the localmodeconfig to emulate the most common kernel usage scenario.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Performance on Eliminating Icall Targets</head><p>TFA aims to eliminate false positives for icall analysis and provide a more fine-grained icall target result. Similar to existing icall analysis works <ref type="bibr">[30,</ref><ref type="bibr">42,</ref><ref type="bibr">68]</ref>, we use the average icall target number to evaluate the performance of TFA on eliminating icall targets. We leverage the OpenMP library <ref type="bibr">[15]</ref> to enable multi-core parallel acceleration for TFA, with a concurrency of 32 threads in our evaluation. Even for the Linux kernel, which comprises nearly 28 million lines of code, TFA could complete the entire icall analysis in two hours. This result is promising, given the complexity of the analysis task.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.1">Performance Comparison with Existing Tools</head><p>Comparison with Type Analysis. We present the comparison of three icall target analysis methods in Table <ref type="table">1</ref>: signature (the most commonly used method), MLTA (the state-of-the-art method), and TFA. MLTA effectively reduces false positives compared with the signature matching method. In MLTA's paper <ref type="bibr">[42]</ref>, only icalls that have multi-layer type information are included in the evaluation. Therefore, it reports a higher precision. In comparison, our evaluation includes all icalls.</p><p>The results indicate that TFA could further eliminate 24% to 59% of icall targets compared with state-of-the-art type-based MLTA method.</p><p>For C++ programs, multi-layer type analysis is suboptimal in eliminating redundant icall targets. The virtual function analysis method presented in &#167;5.5 demonstrates superior performance. Even when both virtual function analysis and MLTA are applied, our system, TFA, succeeds in eliminating 44% to 56% of icall targets in C++ programs, thereby confirming its effectiveness. Comparison with Program Modularization. We further compare TFA with TyPM <ref type="bibr">[41]</ref>, a program modularization based icall analysis approach. The basic idea of TyPM is to refine icall analysis by performing type matching locally within modules that have type dependencies, rather than globally as in existing techniques. We evaluate TyPM on the Linux kernel compiled with the localmodeconfig and measure its effectiveness in reducing icall targets. TyPM further eliminates 19.8% of icall targets compared with the results of MLTA. On comparison, TFA eliminates additional 55.4% of icall targets compared with MLTA for the same program, demonstrating the benefits of type and data-flow co-analysis of TFA. Comparison with Abstract Interpretation. Abstract interpretation is a technique that projects a program's behavior into an abstract domain, systematically over-approximating the possible states of the program, and by extension, the potential targets of icalls. We investigated two representative tools: Frama-C <ref type="bibr">[18]</ref> and Ikos <ref type="bibr">[25]</ref>. Frama-C actually necessitates developers to explicitly annotate potential icall targets via specialized comments, which implies that precise and automated icall analysis remains unaddressed by this tool. Furthermore, an attempt to apply Ikos to the Linux kernel did not culminate within 12 hours, and a significant portion of the analysis was marred by substantial runtime errors arising from unsupported LLVM types, affecting approximately 40% of the modules.</p><p>These observations indicate that for extensive codebases, abstract interpretation-based methods for icall analysis may lack scalability. TFA, in contrast, aims to provide a more feasible solution.</p><p>Comparison with Other Co-Analysis Method. KELP <ref type="bibr">[26]</ref> shares a similar observation with TFA: a multitude of function pointers in icall analysis are straightforward and could be resolved through affordable data-flow analysis. KELP conducts localized pointer analysis for such functions and integrates this analysis with MLTA. As the source code of KELP was unavailable at the time of this paper's writing, and its icall analysis metrics appear to differ from those used by TFA, a high-level comparison is provided here for clarity and completeness. The primary distinctions in design between KELP and TFA are twofold: (1) KELP makes data-flow analysis as a separate preliminary stage prior to MLTA, whereas TFA integrates data-flow analysis and type analysis into a unified co-analysis framework, allowing for mutual enhancement. (2) KELP is designed for a single execution with an emphasis on efficiency, whereas TFA employs an iterative analysis to achieve greater accuracy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.2">Performance of Analysis Rounds and Phases</head><p>In this subsection, we conduct an in-depth evaluation of the performance across different analysis rounds within our twodimensional data-flow analysis, as well as its distinct phases. Due to page constraints, we have selected the Linux kernel (kernel space C program), OpenSSL (user space C program), and MongoDB (user space C++ program) as representative targets for evaluation.</p><p>Analysis Rounds. Table <ref type="table">2</ref> shows the numbers of the total icall targets that remain after each analysis round of TFA. The two-dimensional data-flow analysis utilizes the results of MLTA as its initial inputs. For MongoDB, the initial results incorporate virtual function analysis subsequent to MLTA. We found that the first analysis round eliminates the most icall targets. The convergence speed of our analysis depends on the size and complexity of the target software. Smaller software could reach convergence faster than larger software. However, even for OS kernel-level software, our analysis converges within three rounds. This demonstrates the effectiveness and efficiency of TFA for resolving indirect calls in real-world software.</p><p>Analysis Phases. We studied the impact of each analysis phase in our two-dimensional data-flow analysis, as presented in Table <ref type="table">3</ref>. The bidirectional data-flow analysis has the most significant effects on eliminating the number of icall targets for all the three programs. The two components of icalloriented type mining also contribute substantially to the precision of our analysis. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3">False Negative Analysis</head><p>One important concern for TFA is to ensure the soundness of the data-flow analysis results. In this subsection, we present the false negative analysis of TFA and the method we used to collect icall traces as the ground-truth for evaluation. We selected the Linux kernel and OpenSSL library as the evaluation targets in this section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.1">Trace Collection</head><p>We collect icall traces through LLVM instrumentation for the Linux kernel. Specifically, we compile the kernel with Clang and insert a hook function that records the icall information before each icall site. This function logs the source location and the callee name of each icall to the system log. We implement the function in the kernel source code and make it globally visible. It utilizes the kernel API sprint_symbol() to retrieve the callee names from their addresses. We also filter out any duplicate traces before logging them to simplify the subsequent trace analysis. To obtain a sufficient number of icall traces, we run the Linux Test Project [12] on the Linux kernel. We finally collect 6,929 unique traces. Our instrumentation is implemented as an LLVM pass with 230 lines of C++ code. The tracing strategy of OpenSSL is similar to the OS kernels, where we also instrument the program to collect traces. To obtain the name of the callee function from the symbol table, we initially attempted to use the Linux API dladdr(), but this API often failed and returned empty results. Hence, we leveraged the LLVM prefix data to identify callees. In particular, we assigned a fixed-size function name as the prefix data for every address-taken function, which could be accessed with a fixed offset from the function's entry point. We compiled the hook function as a static library and linked it to OpenSSL by modifying its Makefile. We stored the icall traces in a local MYSQL database. We used the openssl speed [16] benchmark to collect icall traces and obtained 851 unique traces in total.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.2">Results for False Negative Analysis.</head><p>We apply two criteria to select the icall traces that are valid for our analysis. First, we discard traces that do not have valid callee names, which indicates that the symbols are not resolved correctly. Second, we exclude traces that have mismatched icall or callee locations with the source code, which implies that the binary code is not consistent with the source code. After applying these filters, we obtain 6,452 unique and valid icall traces from the Linux kernel and 683 traces from the OpenSSL library. FN Results of the Linux Kernel. TFA only misses two callees in our analysis, which come from two icalls in function __apply_relocate_add(). Specifically, these two icalls miss the same callee: __memcpy(). The reason for the two false negatives is that our type analysis phase fails to identify the callee. We examine the whole kernel code and discover that the address of __memcpy() is never assigned to any pointer (i.e., it is not an address-taken function). There are only a few direct calls to this function. Therefore, __memcpy() is not considered as a possible icall target by our type analysis, even when we use signature matching. We also investigate the icall target sets produced by TFA, and we notice a wrapper function of __memcpy(): memcpy(). We suspect that some compiler optimizations replace memcpy() with __memcpy() and cause this false negative. FN Results of the OpenSSL Library. We identified 58 false negatives in OpenSSL in total. We investigate these cases and find all of them are missing since type analysis (signature matching). In particular, the OpenSSL tends to define icalls with general parameter types (e.g., void *). However, the actual parameters could cast to other types (e.g., MD5_CTX *). This is a common challenge for many existing icall analysis works <ref type="bibr">[30,</ref><ref type="bibr">42,</ref><ref type="bibr">58]</ref>, and leads to false negatives even for signature matching. One feasible solution for this problem is conservatively equating void pointers with any other pointer types. We applied this approach to TFA and reevaluated its performance on the OpenSSL library. The average number of icall targets for both signature matching and MLTA increased significantly (63.6%&#8593; and 60.2%&#8593;). However, the impact on TFA was relatively small (12.1%&#8593;). Moreover, this approach eliminated all false negatives for TFA.</p><p>Finding 1: The data-flow analysis of TFA does not introduce more false negatives than existing type-based analysis methods. TFA performs better when we adopt more conservative type analysis methods.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.4">Effectiveness of Type Recovery</head><p>To better understand how broken types impact type-based analysis, in this subsection, we use the traces collected from the Linux kernel to check the effectiveness of our type recovery.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.4.1">Broken Struct Type Recovery</head><p>To evaluate the impact of broken struct types in &#167;2.2.1, we build a Linux kernel with O0 optimization and turn off the type recovery feature in TFA. We then reexamine the icall analysis outcomes. The number of false negatives for the Linux kernel rises to 879 (13.6%). Figure <ref type="figure">9</ref> illustrates a typical false negative case in the Linux kernel, where the callee con_write_room() is omitted for the icall at line 4. The omitted callee is assigned to the write_room field of struct type tty_operations at line 13. Ideally, when the icall is generated from the same field of the same struct type (i.e., the ops-&gt;write_room at line 4), the callee should have been detected as a possible target. However, the representations of the struct type tty_operations are inconsistent in the two LLVM bitcode files, where one of them represents two function pointers' types as {}*. Consequently, type analysis considers them as distinct struct types, which results in false negatives. By  applying type recovery, we can eliminate all false negatives caused by broken struct types.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.4.2">Optimized GEP Instruction Recovery</head><p>To evaluate the impact of optimized GEP instructions, we build a Linux kernel with O2 optimization and disable the GEP instruction recovery in TFA. We repeat the icall analysis results and observe that the number of false negatives rises to 80. The main sources of false negatives are imprecise storage analysis of address-taken functions and nested struct fields, which require accurate identification of nested struct types and field indices from GEP instructions. By applying type recovery, we successfully eliminate 77 false negatives. There is one additional false negative compared to the previous experiment due to a GEP instruction that performs an implicit type casting. Since this scenario is rare, we leave it as future work.</p><p>Finding 2: The broken types significantly influence the soundness of type analysis. The type recovery in TFA effectively prevents false negatives introduced by broken types.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.5">Application I: Fine-Grained CFI</head><p>We implemented a forward-edge CFI scheme that leverages our fine-grained icall analysis results, which is built based on the KCFI sanitizer <ref type="bibr">[10,</ref><ref type="bibr">11]</ref>. It prevents control-flow hijacking attacks by disallowing changes to the pre-defined CFG. Currently, our CFI scheme is implemented on LLVM IRs in Clang compiler to support various backends. We load the valid target sets from the database during compilation and encode them as read-only data. We utilize LLVM prefix data to determine which function will be called before we emit the icall. We insert a verification function before each icall to check if the callee belongs to the corresponding target set. If an invalid function is invoked, the CFI scheme redirects the control flow to a user-defined error-handling function for further debugging. Our scheme is compatible with any forward-edge CFI scheme that allows specifying a target set for each independent icall (e.g., Clang CFI <ref type="bibr">[58]</ref>).</p><p>We measured the performance of our forward-edge CFI scheme on a Ubuntu virtual machine with 4 CPUs (Intel Core i7-8770 CPU, 3.20Ghz). We used the UnixBench 5.1.2 [17] as the benchmark, which offers a diverse range of workloads and is widely used in CFI evaluation <ref type="bibr">[30,</ref><ref type="bibr">37]</ref>. Compared with the original system without CFI protection, the average system performance decreased only by 4.1% on average. The detailed overheads of each task are shown in Figure <ref type="figure">10</ref>. The evaluation result confirms that our fine-grained CFI scheme is feasible in practice. Notably, we observed negative overheads, a phenomenon also reported in other CFI research <ref type="bibr">[30]</ref>. This improvement may be attributed to the CFI instrumentation altering the code layout, resulting in enhanced code alignment or more efficient utilization of the instruction cache, thereby accelerating context switches. The overhead of the CFI could be further optimized through architecture-specific (e.g., x86) implementation, which is mainly an engineering effort. We leave this part as further work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.6">Application II: Bug Detection</head><p>One of the potential applications of icall analysis is static bug detection, which requires a precise CFG for inter-procedural analysis. We leverage our fine-grained icall analysis to analyze the Linux kernel, focusing on a specific type of icalls: the release call-back functions in devices. These functions are responsible for resource cleanup when a device is no longer in use. They are controlled by reference counting and assigned by the device allocators. However, the device allocators may also have their own error handling code for resource cleanup, which could conflict with the release call-back functions. This scenario involves three challenging mechanisms for static analysis: indirect-call, reference counting, and error handling. Existing methods typically overlook this scenario and only address one of these mechanisms <ref type="bibr">[33,</ref><ref type="bibr">34,</ref><ref type="bibr">43,</ref><ref type="bibr">46,</ref><ref type="bibr">56,</ref><ref type="bibr">57,</ref><ref type="bibr">65]</ref>. TFA Assisted Trace Analysis. To understand the triggering mechanism of the release callback, we perform a trace analysis in the first step. We use the icall analysis results generated by TFA and trace backward from a callback function. The trace consists of six function calls originating from put_device(), among which three are indirect calls. TFA accurately determines that the three indirect calls have only one caller each. With TFA's assistance, a non-expert researcher can complete this task in 15 minutes. If we use type analysis instead of TFA's co-analysis for icall target identification, there would be 64 possible callers of the release callback, and we would have to manually examine all these callers and their transitive callers. Bug Detection. From the trace analysis, we identify that the release callback is invoked by put_device() after the device reference count reaches zero. In this step, we use TFA to detect two types of bugs in the release callbacks: (1) redundant cleanup, which may lead to use-after-free or double-free vulnerabilities, and (2) missing resource release, which may cause memory leaks. We compared the release callback code with the error handling code after put_device() in the device allocators to detect these bugs. Out of 243 release callbacks generated by co-analysis, TFA reported nine bugs. We manually verified them and found eight real bugs, including six double-free bugs and two memory-leak bugs, as shown in Table <ref type="table">4</ref>. Three of them have been fixed by other developers in the latest kernel. We submitted patches for the remaining bugs and four of them have been applied. If we use type analysis instead of co-analysis, we would have 671 potential release callbacks, and 63.8% of them would be false positives, which would make the bug detection ineffective. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Discussion</head><p>Assembly Code Analysis. At present, TFA does not have the capability to analyze assembly code. When encountering assembly code, we adopt a conservative approach and terminate the data-flow analysis, falling back to type analysis. Prior works <ref type="bibr">[30,</ref><ref type="bibr">37]</ref> rely on manual analysis to resolve indirect calls and jumps in assembly code. We manually checked some failure cases due to assembly calls in the Linux kernel and found many of them are introduced by a new feature called Linux static call. We plan to design a dedicated analysis pass for this feature in our future work. Type Analysis in LLVM. In addition to the broken types discussed in &#167;2.2, prevalent type analysis methods, such as signature matching, also falter due to type casting. For instance, unexpected type casting can engender substantial false negatives in the analysis of icalls within the OpenSSL library.</p><p>To address this problem, Ge et al. used taint analysis to infer icall targets <ref type="bibr">[30]</ref>. IFCC <ref type="bibr">[58]</ref> used two more conservative approaches to construct icall target set: Single (which collects all functions into a single set) and Arity (which infers icall targets according to the number of arguments). FINE-CFI <ref type="bibr">[37]</ref> presented some IR-based methods to address this problem, but they were ineffective for eliminating the false negatives in our analysis. As a result, LLVM 16 introduces an opaque pointer type to replace all pointer types containing their point-to types, due to various issues with the latter <ref type="bibr">[14]</ref>. Furthermore, typed pointers have been deprecated in LLVM 17+, as the LLVM community discourages the reliance on LLVM's internal type system for conducting type-based analyses. Nevertheless, the latest LLVM releases have introduced 'tbaa' metadata <ref type="bibr">[19]</ref>, designed to represent the type system of higher-level languages and aid in type analysis. This metadata has been employed in the implementation of type-based alias analysis. We posit that this feature holds considerable promise as a solution, and we propose the examination of this new capability as an avenue for future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8">Related Work</head><p>Indirect-Call Analysis for CFI Schemes. CFI schemes usually give priority to the soundness of icall analysis. Many existing CFI schemes adopt Single or Arity to construct icall targets <ref type="bibr">[58,</ref><ref type="bibr">59,</ref><ref type="bibr">66,</ref><ref type="bibr">67]</ref>. These methods do not rely on type information and can be applied to binary executables. A more precise approach is to use the types of function pointers to match icall targets [10, <ref type="bibr">48,</ref><ref type="bibr">50]</ref>, but this approach suffers from soundness degradation due to primitive type casting. Recently, some CFI schemes have leveraged the idea of MLTA to further refine icall targets <ref type="bibr">[30,</ref><ref type="bibr">37]</ref>. However, they do not address the problem of broken types, which can affect the security of CFI.</p><p>In this paper, we systematically study the causes and impacts of this problem and propose a type recovery system to solve it. Indirect-Call Analysis for Bug Detection. Icall analysis is essential for static bug detection, which relies on a global CFG to perform inter-procedural analysis. A common challenge for this task is the high false positive rate caused by inaccurate icall analysis. Therefore, some tools opt to ignore icalls altogether <ref type="bibr">[33,</ref><ref type="bibr">38,</ref><ref type="bibr">45]</ref>. Many static bug detection tools adopt one-layer type matching to resolve icalls (e.g., Deadline <ref type="bibr">[63]</ref>, LRSan <ref type="bibr">[60]</ref>, and K-MELD <ref type="bibr">[28]</ref>). When multi-layer type analysis was introduced, it was quickly applied to bug detection <ref type="bibr">[22,</ref><ref type="bibr">32,</ref><ref type="bibr">39,</ref><ref type="bibr">52,</ref><ref type="bibr">62]</ref>. Recently, directed fuzzing also demands precise icall analysis to calculate the cross-function distance <ref type="bibr">[44]</ref>. We believe TFA could benefit both existing and future static or dynamic analysis.</p><p>Type and Data-Flow Co-Analysis. Ghavamnia et al. proposed two filtering schemes to refine the results of Andersen's points-to analysis and obtain more accurate icall targets <ref type="bibr">[31]</ref>. These schemes exactly perform one-layer type matching, which do not exploit the rich layered type information for optimization. FINE-CFI <ref type="bibr">[37]</ref> introduced the concept of struct location vector to enable two-layer type analysis. However, when an icall originates from a function argument, FINE-CFI falls back to one-layer type matching. The dataflow analysis approach of TFA could effectively resolve this problem. Ge et al. applied taint analysis from address-taken functions for icall analysis <ref type="bibr">[30]</ref>. When a function pointer is assigned to a struct's field, all memory objects of this struct's field are tainted, which is similar to two-layer type matching. This method relies on two strict assumptions on function pointer operations to support its taint analysis. When a violation occurs, users have to interrupt the analysis and manually resolve it. TFA does not make any assumptions or require frequent manual intervention. TFA also improves the icall analysis through mining the hidden layered types, which is missed by existing methods.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9">Conclusion</head><p>In this paper, we present TFA, which performs type and dataflow co-analysis to optimally resolve indirect-call targets. We also propose several techniques to enhance the scalability, soundness, and precision of TFA. We evaluate TFA on five famous C/C++ programs, and show that TFA could further eliminate 24% to 59% of indirect-call targets compared with the state-of-the-art approaches. We also apply TFA to improve the security of forward-edge CFI and the ability of static bug detection. As a generic technique, we believe that the precise indirect-call analysis of TFA can benefit various security research domains.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="10">Acknowledgment</head><p>We sincerely appreciate our shepherd and all the anonymous reviewers for their insightful comments on our work.</p><p>This work was supported by the Key R&amp;D Program of Zhejiang Province (2022C01086). Kangjie Lu was supported in part by the NSF awards CNS2045478, CNS-2106771, CNS-2154989, and CNS-2247434. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of NSF.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0"><p>LLVM API getElementContainingOffset(): given a valid byte offset into the struct type, it returns the struct index that contains it.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1"><p>https://github.com/umnsec/mlta</p></note>
		</body>
		</text>
</TEI>
