<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Towards a Flow- and Path-Sensitive Information Flow Analysis</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>08/01/2017</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10077723</idno>
					<idno type="doi">10.1109/CSF.2017.17</idno>
					<title level='j'>30th IEEE Computer Security Foundations Symposium</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Peixuan Li</author><author>Danfeng Zhang</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[This paper investigates a flow-and path-sensitive static information flow analysis. Compared with security type systems with fixed labels, it has been shown that flow-sensitive type systems accept more secure programs. We show that an information flow analysis with fixed labels can be both flow-and path-sensitive. The novel analysis has two major components: 1) a general-purpose program transformation that removes false dataflow dependencies in a program that confuse a fixed-label type system, and 2) a fixed-label type system that allows security types to depend on path conditions. We formally prove that the proposed analysis enforces a rigorous security property: noninterference. Moreover, we show that the analysis is strictly more precise than a classic flow-sensitive type system, and it allows sound control of information flow in the presence of mutable variables without resorting to run-time mechanisms.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Information-flow security is a promising approach to security enforcement, where the goal is to prevent disclosure of sensitive data by applications. Since Denning and Denning's seminal paper <ref type="bibr">[20]</ref>, static program analysis has been widely adopted for information-flow control <ref type="bibr">[38]</ref>. Among these program analyses, type systems (e.g., <ref type="bibr">[33]</ref>, <ref type="bibr">[36]</ref>, <ref type="bibr">[41]</ref>) have enjoyed a great popularity due to their strong end-to-end security guarantee, and their inherently compositional nature to combine secure components forming a larger secure system as long as the type signatures agree.</p><p>Conventionally, we assume secrets are stored in variables, and security levels (e.g., P for public and S for secret) are associated with variables to describe the intended secrecy of the contents. The security problem is to verify that the final value of the public variables (outputs visible to the public) is not influenced by the initial value of the secret variables.</p><p>Many security type systems (e.g., <ref type="bibr">[33]</ref>, <ref type="bibr">[36]</ref>, <ref type="bibr">[41]</ref>) assume fixed levels. That is, the security level for each variable remain unchanged throughout program execution. Though this fixed-level assumption simplifies the design of those type systems, one consequence is that they tend to be overconservative (i.e., reject secure programs). For example, given that s has a level S (i.e., s holds a secret value) and p has a level P, a fixed-level type system rejects secure programs, such as (p := s; p := 0;), even though the publicly observable final value of p is always zero.</p><p>Previous work (e.g., <ref type="bibr">[26]</ref>) observes that such inaccuracy roots from the flow-insensitive nature (i.e., the order of program execution is ignored) of fixed-level systems. From this perspective, the previous example is mistakenly considered insecure because the (impossible) execution order (p := 0; p := s; ) is insecure.</p><p>Hunt and Sands <ref type="bibr">[26]</ref> propose a classic flow-sensitive type system which allows a variable to have multiple security levels over the course of computation. For example, this floating-level type system correctly accepts the program (p := s; p := 0;) by assigning p with levels S and P after the first and second assignments respectively. However, this floating-level system is still path-insensitive, meaning that the predicates at conditional branches are ignored in the analysis. For example, it incorrectly rejects the following secure program since the (impossible) branch combination (y := s; p := y; ) is insecure.</p><p>if (x = 1) then y := 0 else y := s; if (x = 1) then p := y This paper develops a flow-and path-sensitive information flow analysis that is precise enough to accept the aforementioned secure programs. The novel analysis is built on two key observations. First, flow-sensitivity can be gained via a general-purpose program transformation that eliminates false dataflow dependencies that confuse a flow-insensitive type system. Consider the example (p := s; p := 0;) again. The transformation removes the false dataflow dependency between s and p by introducing an extra copy of the variable p and keeps track of the final copy of each variable at the same time. So, the example is transformed to (p 1 := s; p 2 := 0;), where p 2 is marked as the final copy. Then, a fixed-level system can easily type-check this program by assigning levels S and P to p 1 and p 2 respectively.</p><p>Second, path-sensitivity can be gained via consolidating dependent type theory (e.g., <ref type="bibr">[16]</ref>, <ref type="bibr">[39]</ref>, <ref type="bibr">[43]</ref>) into security labels. That is, a security label is, in general, a function from program states to security levels. Consider the second example above with branches. We can assign y a dependent security label: (x = 1?P : S), meaning that the level of y is P when x = 1, and S otherwise. Hence, the information flow from y to p can be judged as secure since it only occurs when x = 1 (hence, y has level p).</p><p>Based on the key observations, we propose a flow-and path-sensitive information flow analysis that consists of two major components: a general purpose program transformation that removes false dataflow dependencies that otherwise compromise the precision of a fixed-level system, as well as a fixed-label type system with dependent labels. Each component of our analysis targets one insensitive source of previous type systems. The modular design not only enables tunable precision of our analysis, but also sheds light on the design of security type systems: we show that a fixed-level system (e.g., <ref type="bibr">[41]</ref>) plus the program transformation is as precise as 1 the classic flow-sensitive system in <ref type="bibr">[26]</ref>; furthermore, a fixed-label dependent type system can soundly control information flow in the presence of mutable variables without resorting to run-time mechanisms (e.g., <ref type="bibr">[23]</ref>, <ref type="bibr">[45]</ref>).</p><p>This paper makes the following key contributions: 1) We formalize a novel flow-and path-sensitive information flow analysis for a simple WHILE language.</p><p>The analysis consists of a novel program transformation, which eliminates imprecision due to flow-insensitivity (Section IV), and a purely static type system using dependent security labels (Section V). 2) We formally prove the soundness of our analysis (Section VI): the source program satisfies terminationinsensitive noninterference whenever the transformed program type-checks. Novel proof techniques are required due to the extra variables introduced (for added precision) in the transformed program. 3) We show that our analysis is strictly more precise than a classic flow-sensitive type system <ref type="bibr">[26]</ref> (Section VII). One interesting consequence is that the program transformation automatically makes a sound flow-insensitive type system (e.g., <ref type="bibr">[41]</ref>) as precise as the classic flow-sensitive system <ref type="bibr">[26]</ref>. 4) We show that our dependent type system soundly controls information flow in the presence of mutable variables without resorting to dynamic mechanisms, such as the dynamic erasure mechanism in previous work <ref type="bibr">[23]</ref>, <ref type="bibr">[45]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. BACKGROUND AND OVERVIEW</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Information Flow Analysis</head><p>We first review standard information flow terminology used in this paper. We assume all variables are associated with security levels. A security policy is specified as the ordering of the security levels, typically in the form of a security lattice. For data d 1 with security level 1 and data 1 We note that in the information flow literature, different terms (such as "precision" and "permissiveness") have been used to compare the amount of false positives of various mechanisms <ref type="bibr">[15]</ref>. In this paper, we say a static analysis A is as precise as a static analysis B if A accepts every secure program that is accepted by B. Moreover, we say A is (strictly) more precise than B if A is as precise as B, and A accepts at least one secure program that is rejected by B. d 2 with level 2 , the policy allows information flow from d 1 to d 2 if and only if 1 2 . In this paper, we use two distinguished security levels S (Secret) and P (Public) for simplicity, but keep in mind that the proposed theory is general enough to express richer security levels. The security policy on the levels P and S is defined as P S, while S P. That is, information flow from public data to secret variable is allowed, while the other direction is forbidden. Hereafter, we assume variable s is labeled as S, and variable p is labeled as P unless specified otherwise.</p><p>Explicit and Implicit Flows: An information flow analysis prohibits any explicit or implicit information flow that is inconsistent with the given policy. Explicit flows take place when confidential data are passed directly to public variables, such as the command p := s, while implicit flows arise from the control structure of the program. For example, the following program has an implicit flow:</p><p>if (s = 0) then p := 0 else p := 1 Assume the secret variable s is either 0 or 1. This code is insecure since it is functionally equivalent to p := s. That is, the confidential data s is copied to a public variable p.</p><p>An information flow security system rules out all explicit and implicit flows; any violation of a given security policy results in an error. As in most information flow analyses, we do not consider timing, termination and other side channels in this paper; controlling side channel leakage (e.g., <ref type="bibr">[1]</ref>, <ref type="bibr">[28]</ref>, <ref type="bibr">[44]</ref>) is largely an orthogonal issue.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Sources of Imprecision</head><p>Most information flow analyses provide soundness (i.e., if the analysis determines that a program is secure, then the program provably prevents disclosure of sensitive data). However, since the problem of checking information flow security is in general undecidable <ref type="bibr">[38]</ref>, one key challenge of designing an information flow analysis is to maintain soundness, while improving precision (i.e., reject fewer secure programs).</p><p>In this section, we introduce the major sources of imprecision in existing type systems. In the next section (Section II-C), we illustrate how does our novel information flow analysis alleviate those sources of imprecision.</p><p>Flow-Insensitivity: The first source of imprecision is flow-insensitivity, meaning that the order of execution is not taken into account in a program analysis <ref type="bibr">[35]</ref>. In the context of information flow analysis, the intuition is that an analysis is flow-insensitive if a program is analyzed as secure only when every subprogram is analyzed as secure <ref type="bibr">[26]</ref>.</p><p>Many security type systems, including <ref type="bibr">[33]</ref>, <ref type="bibr">[36]</ref>, <ref type="bibr">[41]</ref>, are flow-insensitive. Consider the program in Figure <ref type="figure">1(a)</ref> (for now, ignore the brackets). This program is secure since the public variable p has a final value zero regardless of the secret variable s. However, it is considered insecure by a flow-insensitive analysis because of the insecure subprogram  (x := s; p := x; ). Under the hood, the imprecision arises since the analysis requires fixed levels: the security level of a variable must remain the same throughout the program execution. But in this example, these is no fixed-level for the variable x: when the level is S, p := x is insecure; when the level is P, x := s is insecure.</p><p>Path-Insensitivity: The second source of imprecision is path-insensitivity, meaning that the predicates at conditional branches are ignored in a program analysis <ref type="bibr">[35]</ref>. In the context of information flow analysis, the intuition is that an analysis is path-insensitive if a program is analyzed as secure only when every sequential program generated from one combination of branch outcomes is analyzed as secure.</p><p>For instance, the flow-sensitive type system in <ref type="bibr">[26]</ref> is path-insensitive; consequently, it rejects the secure program shown in Figure <ref type="figure">1</ref>(c) (due to Le Guernic and Jensen <ref type="bibr">[29]</ref>). This example is secure since the value of the secret variable s never flows to the public variable p 2 , since the assignments y := s and x := y never execute together in the same program execution. However, the type system in <ref type="bibr">[26]</ref> rejects this program because it lacks the knowledge that the two if-statements cannot take the "then" branch in the same execution. Hence, it has to conservatively analyze the security of an impossible program execution: x := 0; y := 0; y := s; x := y; p := x, which is insecure due to an explicit flow from s to p.</p><p>Under the hood, we observe that the imprecision arises from the fact that a path-insensitive analysis (e.g., <ref type="bibr">[26]</ref>) requires that the security levels of a variable on two paths to be "merged" (as the least upper bound) after a branch. Consider the first branch in Figure <ref type="figure">1(c)</ref>. The "then" branch requires y to be S due to the flow from s to y. So after that if-statement, the label of y must be S (i.e., which path is taken is unknown to the rest of the program). Similarly, x has label S after the second if-statement. Hence, p 2 := x is rejected due to an explicit flow from S to P.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Overview</head><p>In order to alleviate analysis imprecision due to flowand path-insensitivity, our novel information flow analysis has two major components: a program transformation that enables flow-sensitivity and a type system with dependent security labels, which enables path-sensitivity.</p><p>1) Program Transformation: Consider the example in Figure <ref type="figure">1</ref>(a) (for now, ignore the brackets). A fixed-level type system rejects this program since the levels of x at line 1 and 3 are inconsistent. We observe that there are indeed two copies of x in this program but only the final one (defined at line 2) is released. So without modifying a type system, we can explicitly transform the source program to a semantically equivalent one that explicitly marks different copies.</p><p>The source language of our program analysis (Section III) provides a tunable knob for improved precision: a bracketed assignment in the form of x := e . Such an assignment is semantically identical to x := e but allows a programmer to request improved precision (the source language allows such flexibility since reduced precision might be preferred for reasons such as more efficient analysis on the program). In particular, for a bracketed assignment x := e , the program transformation (Section IV) generates a fresh copy for x and uses that copy in the rest of program until another new copy is generated. For example, given the bracketed assignment at line 2 of Figure <ref type="figure">1</ref>(a), the transformed program is shown in Figure <ref type="figure">1(b)</ref>, where the second definition of x and its use at line 3 are replaced with x 1 . The benefit is that the false dataflow dependency from s to p in the source program is eliminated. Hence, the transformed program can be accepted by a fixed-level type system, by assigning x and x 1 to levels S and P respectively. In general, we prove that (when all assignments are bracketed) the transformation enables a fixed-level system to be at least as precise as a classic flow-sensitive type system (Section VII).</p><p>2) Dependent Labels: Consider the example in Figure <ref type="figure">1(c</ref>). A path-insensitive type system rejects this program since such a type system ignores the path conditions under which assignments occur. Consequently, the security level of y is conservatively estimated as S after line 2, though when p 1 &#8805; 0, variable y only carries public information.</p><p>In our system, path-sensitivity is gained via dependent security labels (i.e., security labels that depend on program states). Compared with a security level drawn directly from a lattice, a dependent security label precisely tracks all possible security levels from different branches; hence, pathsensitivity is gained. Since dependent security labels are orthogonal to bracketed assignments, extra precision can be gained in our system even in the absence of bracketed assignments. For example, while the program in Figure <ref type="figure">1(c)</ref> can not be accepted using any simple security level for y, we can assign to y a dependent label (p 1 &lt; 0?S : P), which specifies an invariant that the level of y is S when p 1 &lt; 0 (i.e., the "then" branch is taken at line 2); the level is P otherwise. Such an invariant can be maintained by the type system described in Section V. For instance, to ensure that the explicit flow from y to x at line 3 is secure, the type system generates a proof obligation (p 1 &gt; 0 &#8658; (p 1 &lt; 0?S : P) P), meaning that the information flow from y to x must be permissible under the path condition p 1 &lt; 0. This proof obligation can easily be discharged by an external solver. The soundness of our type system (Section VI) guarantees that all security violations are detected at compile time.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. LANGUAGE SYNTAX AND SEMANTICS</head><p>In this paper, we consider a simple imperative WHILE language whose syntax and operational semantics are shown in Figures <ref type="figure">2</ref> and<ref type="figure">3</ref> respectively. The syntax and semantics are mostly standard: expressions e consist of variables x, integers n, and composed expressions e op e, where op is a binary arithmetic operation. Commands c consist of standard imperative instructions, including skip, sequential composition c 1 ; c 2 , assignments, conditional if branch and while loop. The semantics of expressions are given in the form of e, m &#8659; n (big-step semantics), where memory m maps variables to their values. The small-step semantics of commands has the form of c, m &#8594; c , m , where c, m is a configuration. We use m{x &#8594; n} to denote the memory that is identical to m except that variable x is updated to the new value n.</p><p>The only interesting case is the bracketed assignment x := e , which is semantically equivalent to normal assignment x := e in the source language. These commands are tunable knobs for improved precision in our information flow analysis, as we show shortly.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. PROGRAM TRANSFORMATION</head><p>To alleviate the imprecision due to flow-insensitivity, one component of our analysis is a novel program transformation that introduces extra variable copies to the source program, so that false dataflow dependencies that otherwise may confuse flow-insensitive analyses are removed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Bracketed Assignments and the Transformed Program</head><p>We propose a general and flexible design for the program transformation. In particular, the program transformation is triggered only for assignments that are marked with brackets. Such a design enables a tunable control of analysis precision for programmers or high-level program analysis built on our meta source language: when there is no bracketed assignment, the transformed program is simply identical to the source program; when all assignments have brackets, the transformation generates a fresh copy of x for each bracketed assignment x := e .</p><p>Due to the nature of the transformation, the transformed program follows the same syntax and semantics as the source language, except that all bracketed assignments are removed.</p><p>To avoid confusion, we use underlined notations for the transformed program: e for expressions, c for commands and m for memories, when both the original and the transformed programs are in the context; otherwise, we simply use e, c and m for the transformed programs as well.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Transformation Rules</head><p>The program transformation maintains one active copy for each variable in the source code. One invariant maintained by the transformation is that for each program point, there is exactly one active copy for each source-program variable. Intuitively, that unique active copy holds the most recent value of the corresponding source-program variable.</p><p>Definition 1 (Active Set): An active set A : Vars &#8594; Vars, is an injective function that maps a source variable to a unique variable in the transformed program.</p><p>For simplicity, we assume that the variables in the transformed program follow the naming convention of x i where x &#8712; Vars and i is an index. Hence, for any variable v in the range of A, we simply use v to denote its corresponding source variable (i.e., a variable without the index). Hence, v = A(v ) always holds by definition. Moreover, since we frequently refer to the range of A, we abuse the notation of A to denote active copies that A may map to (i.e., the range of A). That is, we simply write v &#8712; A instead of v &#8712; Ran(A) in this paper. Moreover, we use A{x &#8594; x i } to denote an active set that is identical to A except that x is mapped to x i .</p><p>The transformation rules are summarized in Figure <ref type="figure">4</ref>. For an expression e, the transformation has the form of e, A e, where e is the transformed expression. The transformation of an expression simply replaces the source variables with their active copies in A.</p><p>For a command c, the transformation has the form of c, A c, A , where c is the source command and c is the transformed one. Since assignments may update the active set, A represents the active set after c.</p><p>Rule (TRSF-Assign) applies to a normal assignment. It transforms the assignment to one with the same assignee and update A accordingly. Rule (TRSF-Assgin-Create) applies to a bracketed assignment x := e . It renames the assignee to a fresh variable. For example, line 1 of the transformed program in Figure <ref type="figure">1(b</ref>) is exactly the same as the original program in 1(a); but the assignee of line 2 is renamed to x 1 . Rule (TRSF-IF) uses a special &#934; function, defined in Figure <ref type="figure">5</ref>, to merge the active sets generated from the branches. In particular, &#934;(A 1 , A 2 )</p><p>A 3 generates an active set </p><p>while (e) c, A A 1 := A; while (e) (c;  A 3 that maps x to a fresh variable iff</p><p>Transformation for the while loop is a little tricky since we need to compute an active set that is active both before and after each iteration. Rule (TRSF-WHILE) shows one feasible approach: the rule transforms the loop in a way that A 1 is a fixed-point: the active set is always A 1 before and after an iteration by the transformation. We note that given an identity function as the initial active set A, a program without any bracketed assignment is transformed to itself with a final active set A. At the other extreme, the transformation generates one fresh active copy for each assignment when all assignments are bracketed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Correctness of the Transformation</head><p>One important property of the proposed transformation is its correctness: a transformed program is semantically equivalent to the source program. To formalize this property, we need to build an equivalence relation on the memory for the source program (m : Vars &#8594; N) and the memory for the transformed program (m : Vars &#8594; N). We note that the projection of m on an active set A defined as follows shares the same domain and range as m. Hence, it naturally specifies an equivalence relation on m and m w.r.t. A: m can be directly compared with m A .</p><p>Definition 2 (Memory Projection on Active Set): We use m A to denote the projection of m on the active set A, defined as follows:</p><p>We formalize the correctness of our transformation as the following theorem. As stated in the theorem, the correctness is not restricted to any particular initial active set A.</p><p>Theorem 1 (Correctness of Transformation): Any transformed program is semantically equivalent to its source:</p><p>Proof sketch. By induction on the transformation rules. The full proof is available in the full version of this paper <ref type="bibr">[30]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Relation to Information Flow Analysis</head><p>Up to this point, it might be unclear why introducing extra variables can improve the precision of information flow analysis. We first note that transformed programs enable more precise reasoning for dataflows. Consider the program in Figure <ref type="figure">1</ref>(a) and Figure <ref type="figure">1(b</ref>). In the transformed program, it is clear that the value stored in x never flows to variable p; but such information is not obvious in the source program. Moreover, Theorem 1 naturally enables a more precise analysis of the transformed program, since it implies that if any property holds on the final active set A for the transformed program, then the property holds on the entire final memory for the original program. That is, in terms of information flow security, the original program leaks no information if the transformed program leaks no information in the subset A of the final memory. Consider the example in Figure <ref type="figure">1</ref>(b) again. Theorem 1 allows a program analysis to accept the (secure) program even though the variable x, which is not in A , may leak the secret value.</p><p>In Section VII, we show that, in general, the program transformation automatically makes a flow-insensitive type system (e.g., the Volpano, Smith and Irvine's system <ref type="bibr">[41]</ref> and the system in Section V) at least as precise as a classic flow-sensitive type system <ref type="bibr">[26]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>E. Relation to Single Static Assignment (SSA)</head><p>SSA <ref type="bibr">[17]</ref> is used in the compilation chain to improve and simplify dataflow analysis. Viewed in this way, it is not surprising that our program transformation shares some similarity with the standard SSA-transformation. However, our transformation is different from the latter in major ways:</p><p>&#8226; Most importantly, our transformation does not involve the distinguishing &#966;-functions of SSA. First of all, removing &#966;-functions simplifies the soundness proof, since the resulting target language syntax and semantics are completely standard. Moreover, it greatly simplifies information flow analysis on the transformed programs.</p><p>Intuitively, the reason is that in the standard SSA from, the &#966;-function is added after a branch (i.e., in the form of (if (e) then c 1 else c 2 ); x := &#966;(x 1 , x 2 )).</p><p>However, without a nontrivial program analysis for the &#966;-function, the path conditions under which x := x 1 and x := x 2 occur (needed for path-sensitivity) is lost in the transformed program. On the other hand, extra assignments are inserted under the corresponding branches in our transformation. The consequence is that the path information is immediately available for the analysis on the transformed program. We defer a more detailed discussion on this topic to Section V-F, after introducing our type system. &#8226; As discussed in Section IV-D, the final active set A generated from the transformation is crucial for enabling a more precise program analysis on the transformed program (intuitively, an information flow analysis may safely ignore variables not in A ); however, such information is lost in the standard SSA form. &#8226; Our general transformation offers a full spectrum of analysis precision: from adding no active copy to adding one copy for each assignment, but the standard SSA transformation only performs the latter.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. TYPE SYSTEM</head><p>The second component of the analysis is a sound type system with expressive dependent labels. The type system analyzes a transformed program along with the final active set; the type system ensures that the final values of the public variables in the final active set are not influenced by the initial values of secret variables.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Overview</head><p>We first introduce the nonstandard features in the type system: dependent security labels and program predicates.</p><p>Return to the example in Figure <ref type="figure">1(c</ref>). We observe that this program is secure because: 1) y holds a secret value only when p 1 &lt; 0, and 2) the information flow from y to x at line 3 only occurs when p 1 &gt; 0. Accordingly, to gain path-sensitivity, two pieces of information are needed in the type system: 1) expressive security labels that may depend on program states, and 2) an estimation of program states that may reach a program point.</p><p>We note that such information can be gained by introducing dependent security labels and program predicates to the type system. For the example in Figure <ref type="figure">1(c)</ref>, the relation between the level of y and the value of x can be described as a concise dependent label (p 1 &lt; 0?S : P), meaning that the security level of x is S when p 1 &lt; 0; the level is P otherwise. Moreover, for precision, explicit and implicit flows should only be checked under program states that may reach the program point. In general, a predicate overestimates such states. For the example in Figure <ref type="figure">1</ref>(c), checking that the explicit flow from y to x is secure under any program state is too conservative, since it only occurs when p 1 &gt; 0. With a program predicate that p 1 &gt; 0 for the assignment x := y, the label of y can be precisely estimated as P. Note that   our analysis agrees with the definition of path-sensitivity: it understands that the two assignments y := s and x := y; never execute together in one execution. The example in Figure <ref type="figure">1(c</ref>) is accepted by our type system.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Challenge: Statically Checking Implicit Declassification</head><p>Though designing a dependent security type system may seem simple at the first glance, handling mutable variables can be challenging. The implicit declassification problem, as defined in <ref type="bibr">[45]</ref>, occurs whenever the level of a variable changes to a less restrictive one, but its value remains the same. Consider the insecure program in Figure <ref type="figure">6</ref>(a), which is identical to the secure program in Figure <ref type="figure">1</ref>(c) except for line 4. This program is obviously insecure since the sequence y := s; p 1 := 1; x := y; p 2 := x; may be executed together. Compared with Figure <ref type="figure">1</ref>(c), the root cause of this program being insecure is that at line 4 (when p 1 is updated), y's new level P (according to the label p 1 &lt; 0?S : P) is no longer consistent with the value it holds.</p><p>The type systems in <ref type="bibr">[23]</ref>, <ref type="bibr">[45]</ref> resort to a run-time mechanism to tackle the implicit declassification problem. However, that also means that the type system might change the semantics of the program being analyzed. In this paper, we aim for a purely static solution.</p><p>Program Transformation and Implicit Declassification: Although the program transformation in Section IV is mainly designed for flow-sensitivity, we observe that it also helps to detect implicit declassification. Consider the example in Figure <ref type="figure">6</ref>(a) again, where the assignment at line 4 has brackets. The corresponding transformed program (Figure <ref type="figure">6(b)</ref>) does not have an implicit declassification problem since updating p 3 at line 4 does not change y's level, which depends on the value of p 1 , rather than p 3 . Moreover, the insecure program cannot be type-checked since both "then" branches might be executed together.</p><p>While adding extra variable copies helps in the previous example, it unfortunately does not eliminate the issue. The intuition is that even for a fully-bracketed program, variables modified in a loop might still be mutable (since the local variables defined in the loop might change in each iteration). Consider the program in 7(a). This program is insecure since it copies s to y in the first iteration, and copies y to p in the next iteration. When fully-bracketed, the loop body becomes  where the labels of y 1 and y 3 depend on x 2 . In this program, implicit declassification happens when x 2 is updated. One naive solution is to disallow mutable variables in a program. However, dependence on mutable variables does not necessarily break security. Consider the program in Figure <ref type="figure">7</ref>(b), which is identical to the previous example except that y is updated at line 8. In this program, y's level depends on the mutable variable x, but it is secure since the value of s never flows to the next iteration.</p><p>Our Solution: Our insight is that changing y's level at line 7 in Figure <ref type="figure">7</ref>(b) is secure since the value of y is not used in the future (in terms of dataflow analysis, y is dead after line 6). This observation motivates us to incorporate a customized liveness analysis (Section V-D) into the type system: an update to a variable x is allowed if no labels of the live variables at that program point depend on x.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Type Syntax and Typing Environment</head><p>In our type system, types are extended with security labels, whose syntax is shown in Figure <ref type="figure">8</ref>. The simplest form of label &#964; is a concrete security level drawn from a security lattice L. Dependent labels, specifying levels that depend on run-time values, have the form of (e?&#964; 1 : &#964; 2 ), where e is an expression. Semantically, if e evaluates to a non-zero value, the dependent label evaluates to &#964; 1 , otherwise, &#964; 2 . A security label can also be the least upper bound, or the greatest lower bound of two labels.</p><p>We use &#915; to denote a typing environment, a function from program variables to security labels. The integration of dependent labels puts constraints on the typing environment &#915; to ensure soundness. In particular, we say &#915; is wellformed, denoted as &#915;, if: 1) no variable depends on a more restrictive variable, preventing leakage from labels; 2) there is no chain of dependency. These restrictions are formalized as follows, where FV(&#964; ) denotes the free variables in &#964; :</p><p>FV(&#915;(v))) </p><p>We note that the definition rules out self-dependence, since if x &#8712; FV(&#915;(x)), we have FV(&#915;(x)) = &#8709;. Contradiction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Predicates and Variable Liveness</head><p>Our type system is parameterized on two static program analyses: a predicate generator and a customized liveness analysis. Instead of embedding these analyses into our type system, we follow the modular design introduced in <ref type="bibr">[45]</ref> to decouple program analyses from the type system. Consequently, the soundness of the type system is only based on the correctness of those analyses, regardless of the efficiency or the precision of those analyses.</p><p>Predicate Generator: We assume a predicate generator that generates a (conservative) program predicate for each assignment &#951; in the transformed program, denoted as P(&#951;). A predicate generator is correct as long as each predicate is always true when the corresponding assignment is executed.</p><p>A variety of techniques, regarding the trade-offs between precision and complexity, can be used to generate predicates that describe the run-time state. For example, weakest preconditions <ref type="bibr">[21]</ref> or the linear propagation <ref type="bibr">[45]</ref> could be used. Our observation is that for path-sensitivity, only shallow knowledge containing branch conditions is good enough for our type system.</p><p>Liveness Analysis: Traditionally, a variable is defined as alive if its value will be read in the future. But in our type system, if a variable x is alive, then any free variable in the label of x should also be considered as alive, because the concrete level of x depends on those variables. Moreover, we assume at the end of a program, only the variables in the final active set are alive, due to Theorem 1.</p><p>The liveness analysis is defined in Figure <ref type="figure">9</ref>, where s denotes a program command, and final refers to the last command of the program being analyzed. Here, final is the initial state for the backward dataflow analysis. succ[s] returns the successors (as a set) of the command s. In the GEN set of an assignment x := e, both FV(e), and v&#8712;FV(e) FV(&#915;(v)), the free variables inside their labels, are included. Since we are analyzing the transformed program, the state of the final active set is crucial for precision. Therefore, the analysis also enforces that, at the end of the program, all active copies in A are alive. Other rules are standard for liveness analysis.</p><p>Interface to the Type System: We assume each assignment in the transformed language is associated with a unique identifier &#951;. We use &#8226;&#951; and &#951;&#8226; to denote the precise program points right before and after the assignment respectively. For example, P(&#8226;&#951;) represents the predicates right before statement &#951;, and &#321; A (&#951;&#8226;) denotes the alive set right after statement &#951; with initialization of A as the final live set.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>E. Typing Rules</head><p>The type system is formalized in Figure <ref type="figure">10</ref> and Figure <ref type="figure">11</ref>. Typing rules for expressions have the form of &#915; e : &#964; , where e is the expression being checked and &#964; is the label of e. The typing judgment of commands has the form of &#915;, pc c. Here, pc is the usual program-counter label <ref type="bibr">[38]</ref>, used to control implicit flows.</p><p>Most rules are standard, thanks to the modular design of our type system. The only interesting one is rule (T-ASSIGN). For an assignment x := &#951; e, this rule checks that both the explicit and implicit flows are allowed in the security lattice: &#964; pc &#915;(x). Note that since &#964; might be a dependent label that involves free program variables, the relation is technically the lifted version of the relation on the security lattice. Hence, the constraint &#964; pc &#915;(x) requires the label of x to be at least as restrictive as the label of current context pc and the label e under any program execution. For precision, the type system validates the partial ordering under the predicate P(&#8226;&#951;), the predicate that must hold for any execution that reaches the assignment.</p><p>Moreover, the assignment rule checks that for any variable in the liveness set after the assignment, its security label must not depend on x; otherwise, its label might be inconsistent with its value. As discussed in Section V-B, this check is required to rule out insecure implicit declassification.</p><p>At the top level, the type system collects proof obligations in the form of |= P &#8658; &#964; 1 &#964; 2 , where &#964; 1 and &#964; 2 are security labels, and P is a predicate. Such proof obligations can easily be discharged by theorem solvers, such as Z3 <ref type="bibr">[19]</ref>.</p><p>As an example, consider again the interesting examples in Figure <ref type="figure">7</ref>. In both programs, we can assign y to the dependent label (x%2 = 0?S : P), and assign x to the label P. From the liveness analysis, we know that the live sets right after line 7 are {x, y, s} and {x, p, s} for Figure <ref type="figure">7</ref>(a) and Figure <ref type="figure">7(b)</ref> respectively. Hence, the type system correctly rejects the insecure program in Figure <ref type="figure">7</ref>(a) since the check at line 7, &#8704;v &#8712; &#321; A (&#951;&#8226;). x &#8712; FV(&#915;(v)), fails. On the other hand, the check at line 7 succeeds for the program in Figure <ref type="figure">7</ref> </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>F. Program Transformation and Information Flow Analysis</head><p>We now discuss the benefits of the program transformation in Section IV for information flow analysis in details.</p><p>1) Simplifying Information Flow Analysis: As discussed in Section IV-E, our transformation does not involve the distinguishing &#966;-functions of SSA. Doing so simplifies information flow analysis on the transformed programs. We illustrate this using the following example, where y is expected to have the label (x = 1?P : S) afterwards.</p><p>if (x = 1) then y := 0 else y := s Our transformation yields the following program, which can be verified with labels y 1 : P, y 2 : S, y 3 : (x = 1?P : S).</p><p>if (x = 1) then (y 1 := 0; y 3 := y 1 ) else (y 2 := s; y 3 := y 2 ); In comparison, the standard SSA form is: (if (x = 1) then y 1 := 0 else y 2 := s; )y 3 := &#966;(y 1 , y 2 );</p><p>To verify this program, a type system would need at least a nontrivial typing rule for &#966;, which somehow "remembers" that y 3 := y 2 occurs only when x = 1. Even with such knowledge, the type of y 2 cannot simply be S, since otherwise, assigning y 2 to y 3 at &#966; is insecure. In fact, the labels required for verification are y 1 , y 2 , y 3 : (x = 1?S : P).</p><p>Similar complexity is also involved for the &#966;-functions inserted for loops: to precisely reason about information flow, the semantics and typing rules of &#966; also need to track the number of iterations.</p><p>2) Improving Analysis Precision: Precision-Wise, bracketed assignments improve analysis precision in two ways. First, as discussed in Section IV-D, they improve flowsensitivity by introducing new variable definitions. Second, they also improve path-sensitivity by enabling more accurate program predicates. Consider the following example.</p><p>x := -1; if (x &gt; 0) then y := S; else y := 1; x := -x ; if (x &gt; 0) then p := y; This program is secure since p becomes 1 regardless of the value of s. However, without the bracket shown, the type system rejects it since no such label &#964; y satisfies the constraints that (x &gt; 0) &#8658; (S &#964; y ) (arising from the first if) and (x &gt; 0) &#8658; (&#964; y P) (arising from the second if).</p><p>However, with the bracket, the last two lines become x 1 := -x; if (x 1 &gt; 0) then p := y; This program can be type-checked with y's label as (x &gt; 0?S : P) and a precise enough predicate generator, which generates x 1 = -x after the assignment x 1 := -x, because constraints (x &gt; 0) &#8658; (S &#964; y ) and (x 1 &gt; 0&#8743;x 1 = -x) &#8658; (&#964; y P) can be solved with y's label mentioned above.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. SOUNDNESS</head><p>Central to our analysis is rigorous enforcement of a strong information security property. We formalize this property in this section and sketch a soundness proof. The complete proof is available in the full version of this paper <ref type="bibr">[30]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Noninterference</head><p>Our formal definition of information flow security is based on noninterference <ref type="bibr">[24]</ref>. Informally, a program satisfies noninterference if an attacker cannot observe any difference between two program executions that only differ in their confidential inputs. This intuition can be naturally expressed by semantics models of program executions.</p><p>Since a security label may contain program variables, its concrete level cannot be determined statically in general. But it can always be evaluated under a concrete memory:</p><p>Definition 4: For a security label &#964; , we evaluate its concrete level under memory m as follows:</p><p>V(&#964;, m) = , where &#964;, m &#8659; Moreover, to simplify notation, we use T &#915; (x, m) to denote the concrete level of x under m and &#915; (i.e., T &#915; (x, m) = V(&#915;(x), m)).</p><p>To formally define noninterference in the presence of dependent labels, we first introduce an equivalence relation on memories. Intuitively, two memories are (&#915;, )-equivalent if all variables with a level below level agree on both their concrete levels and values.</p><p>Definition 5 ((&#915;, )-Equivalence): Given any concrete level and &#915;, we say two memories m 1 and m 2 are equivalent up to under &#915; (denoted by</p><p>It is straightforward to check that &#8776; &#915; is an equivalence relation on memories. Note that we require type of x be bounded by in m 2 whenever T &#915; (x, m 1 )</p><p>. The reason is to avoid label channels, where confidential data is leaked via the security level of a variable <ref type="bibr">[37]</ref>, <ref type="bibr">[45]</ref>.</p><p>Given initial labels &#915; on variables and final labels &#915; on variables, we can formalize noninterference as follows:</p><p>Definition 6 (Noninterference): We say a program c satisfies noninterference w.r.t. &#915;, &#915; if equivalent initial memories produce equivalent final memories:</p><p>The main theorem of this paper is the soundness of our analysis: informally, if the transformed program type-checks, then the original program satisfies noninterference. Since the type system applies to the transformed program, we first need to connect the types in the original and the transformed programs. To do that, we define the projection of types for the transformed program in a way similar to Definition 2:</p><p>Definition 7 (Projection of Types): Given an active set A and &#915;, types of variables in the transformed program, we use &#915; A to denote a mapping from Vars to &#964; as follows:</p><p>Formally, the soundness theorem states that if a program c under active set A (e.g., an identity function) is transformed to c and final active set A , and c is well-typed under the type system (parameterized on A ), then c satisfies noninterference w.r.t. &#915; A and &#915; A :</p><p>Theorem 2 (Soundness):</p><p>To approach a formal proof, we notice that by the cor-</p><p>Figure <ref type="figure">12</ref>: Soundness of original and transformed programs. rectness of the program transformation (Theorem 1), it is sufficient to show that the transformed program leaks no information on the subset A . Such connection is illustrated in Figure <ref type="figure">12</ref>. We formalize the soundness for the transformed program w.r.t. initial and final active sets as follows:</p><p>Theorem 3 (Soundness of Transformed Program):</p><p>Proof sketch. One challenge in the formal proof is that the equivalence relation &#8776; &#915; only holds on the active copies and it may break temporarily during the program execution. Consider the example in Figure <ref type="figure">7</ref>(b). During the first iteration of the loop body, y holds a secret value but its level is P right after line 8. Hence, the relation &#8776; &#915; may break at that point in the small-step evaluation starting from two memories that only differ in secrets. To tolerate such temporary violation of the &#8776; &#915; relation, we prove the soundness with a new semantics which enforces that the relation &#8776; &#915; holds for all variables, and the final values of variables in A agree with those in the standard semantics. The new semantics, called the erasure semantics is shown in Figure <ref type="figure">13</ref>. The semantics is parameterized on the final active set A . The only difference from the standard one is for assignments: the new assignment rule (ST-ERASE) sets variables that are not alive and whose types depend on x to be zero. It is easy to check that the erasure semantics agrees on the final value of the variables in A . Also, it removes the temporary violation of the equivalence relation by forcing value of y to be zero after line 7 of Figure <ref type="figure">7</ref>(b). The complete proof is available in the full version of this paper <ref type="bibr">[30]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VII. ENABLING FLOW-SENSITIVITY WITH PROGRAM TRANSFORMATION</head><p>Recall that the dependent security type system (without program transformation) is flow-insensitive; yet, our program analysis is flow-sensitive with the novel program transformation in Section IV. In this section, we show that this is not a coincidence: the program transformation automatically makes a flow-insensitive type system (e.g., the Volpano, Smith and Irvine's system <ref type="bibr">[41]</ref> and the system in Section V) flow-sensitive.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. The Hunt and Sands System</head><p>In <ref type="bibr">[26]</ref>, Hunt and Sands define a classic flow-sensitive type system where the security level of a program variable may "float" in the program. In particular, Hunt and Sands (HS) judgments have the form of pc HS &#915;{c}&#915; , where &#915; and &#915; are intuitively the typing environments before and after executing c respectively.</p><p>Consider the program in Figure <ref type="figure">1</ref>(a). While a flowinsensitive type system rejects it, the HS system accepts it with the following typing environments: &#915;{x := s; }&#915;{x := 0; }&#915; {p := x; }&#915; where &#915; = {x &#8594; S, p &#8594; P} and &#915; = {x &#8594; P, p &#8594; P}.</p><p>The HS typing rules for commands are summarized in Figure <ref type="figure">14</ref>. We use HS to distinguish those judgments from the ones in our system. The interesting rules are rule (HS-IF) and rule (HS-WHILE): the former computes the type for each variable as the least upper bound of its labels in the two branches; the latter computes the least fixed-point of a monotone function (the while loop) on a finite lattice.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Program Transformation and Flow-Sensitivity</head><p>We show that the program transformation in Section IV along with a type subsumes the HS system: for any program c that can be type-checked in the HS system, the transformed program of c (i.e., a fully-bracketed program) can be type-checked in a flowinsensitive type system. This result has at least two interesting consequences:</p><p>1) The program transformation removes the source of "flow-insensitivity"; a flow-insensitivity type system can be automatically upgraded to a flow-sensitive one.</p><p>2) The flow-and path-sensitive system in this paper strictly subsumes the HS system: any secure program accepted by the latter is accepted by the former, but not vise versa (e.g., the program in Figure <ref type="figure">1</ref>(c)). To construct types in the transformed program, we first introduce a few notations. Given a typing environment &#915; : Vars &#8594; &#964; for the original program and an active set A, we can straightforwardly construct a (minimal) typing environment, written &#915; A , whose projection on A is &#915;:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Easy to check that (&#915;</head><p>Moreover, given a sequence of tying environments for the transformed program, say &#915; 1 , &#915; 2 , . . . , we define a merge function, denoted as &#8746;, that returns the union of &#915; 1 , &#915; 2 , . . . so that conflicts in the environments are resolved in the order of &#915; 1 , &#915; 2 , . . . . For example, &#8746;({x 1 &#8594; S, y 2 &#8594; P}, {x 1 &#8594; P, y 2 &#8594; P}) = {x 1 &#8594; S, y 2 &#8594; P}.</p><p>For a fully bracketed program c , we can inductively define the construction of &#915; as inference rules in the form of (pc, &#915;, A){ c c}(&#915; , A ) &#8594; &#915;</p><p>where pc, &#915;, c, &#915; are consistent with the HS typing rules in the form of pc HS &#915;{c}&#915; ; A, c , A , c are consistent with the program transformation rules in the form of c , A c, A . &#915; is the constructed typing environment that, as we show shortly in Theorem 4, satisfies &#915;, pc c. The construction algorithm is formalized in Figure <ref type="figure">15</ref>.</p><p>Most parts of the rules are straightforward; they are simply constructed to be consistent with the HS typing rules and the transformation rules in Figure <ref type="figure">4</ref>. The following lemma makes such connections explicit.</p><p>Lemma 1:</p><p>Proof: By induction on the structure of c. To construct types for the transformed program: for skip, we use &#915; A (the typing environment before this command); for assignment, since x i must be fresh, we can simply augment &#915; A with {x i &#8594; &#964; }. Other rules simply merge constructed types from subexpressions in a conflict-solving manner, using &#8746;. An eagle-eyed reader may find the construction is intuitively correct if there is no conflict at all in the merge operations.</p><p>We show that there is no conflict during construction by two observations. First, if a variable has the same active copy before and after transforming a fully-bracketed command c , then its type must remain the same (before and after c) in the HS system. This property is formalized as follows:</p><p>Lemma 2:</p><p>Proof sketch. By induction on the structure of c. The most interesting cases are for branch and loop. C-SKIP</p><p>&#915; HS e : &#964; e, A e by the induction hypothesis. Similarly, we can infer that</p><p>&#8226; while (e) c: By rule (TRSF-WHILE), we have c , A c 1 , A 1 , where A is A 1 in this case. Hence, by the assumption, we have A(v) = A 1 (v). By rule (HS-WHILE), there is a sequence of environments</p><p>Second, the constructed environment is minimal, meaning that it just specifies types for the variables in A and the freshly generated variables in c (denoted as FVars(c)).</p><p>For technical reasons, we formalize this property along with the main correctness theorem of the construction, stating that the transformed program c type-checks under the constructed environment &#915;. Note that given any A, a fully bracketed command c always transforms to some c and A . Hence, by Lemma 1, the following theorem is sufficient to show that our program analysis is at least as precise as the HS system: Theorem 4:</p><p>Proof: Complete proof is available in the full version of this paper <ref type="bibr">[30]</ref>.</p><p>An interesting corollary of Theorem 4 is that the transformed program can be type-checked under the classic fixedlevel system in <ref type="bibr">[41]</ref> as well.</p><p>Corollary 1: Theorem 4 also applies to the type system in Figure <ref type="figure">10</ref> and Figure <ref type="figure">11</ref> with the restriction that all labels are security levels (i.e., non-dependent labels), which is identical to the system in <ref type="bibr">[41]</ref>.</p><p>Proof: We note that the construction in Figure <ref type="figure">15</ref> only uses the non-dependent part of our type system. Given all labels are security levels, it is straightforward to check that our type system degenerates to the system in <ref type="bibr">[41]</ref>.</p><p>Theorem 4 has a strong prerequisite that all assignments in the original program are bracketed. We note that the result remains true when such prerequisite is relaxed. Intuitively, a bracket is unnecessary when the old and new definitions have the same security label. Otherwise, a bracket is needed for flow-sensitivity. For example, to gain flow-sensitivity, only the second assignment in Figure <ref type="figure">1</ref>(a) needs a bracket. The strong prerequisite is used in Theorem 4 to make the result general (i.e., type-agnostic).</p><p>According to Corollary 1, the result that any secure program accepted by the HS system is accepted by our analysis is true even if all dependent security labels degenerate to simple security levels. On the other hand, introducing dependent security labels makes our analysis strictly more precise than the HS system. For example, the program in Figure <ref type="figure">1(c</ref>) cannot be verified without dependent security labels, but it can be type-checked with a label y : (p 1 &lt; 0?S : P).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Comparison with the transformation in [26]</head><p>Hunt and Sands show that if a program can be typechecked in the HS system, then there is an equivalent program which can be type-checked by a fixed level system <ref type="bibr">[26]</ref>. However, their construction of the equivalent program is type guided, meaning that the program transformation assumes that security labels have already been obtained in the HS system, while our program transformation (Figure <ref type="figure">4</ref>) is general and syntax-directed. An interesting application of our transformation is to test the typeability of the HS system without obtaining the types needed in the HS system in the first place.</p><p>It is noteworthy that our transformation is arguably simpler than the HS transformation since our rule for loop has no fixed-point construction while the latter has one. The reason is that compared with the HS transformation, the goal of our transformation is easier to achieve: our transformation improves analysis precision, while the HS transformation infers the type for each variable in a program. For example, consider a loop with only one assignment x := x + 1, and x is initially P. In the HS system, the transformed program is x P := x P + 1, where x P is the public version of variable x. On the other hand, our transformation generates x 1 := x 2 ; x 2 := x 1 + 1. From the perspective of inferring labels, introducing x 1 and x 2 might seem unnecessary since they must have the same label according to the type system. However, doing so might improve analysis precision (e.g., the type system can specify the dependencies on x 1 and x 2 separately with two copies of x).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VIII. RELATED WORK</head><p>We refer to <ref type="bibr">[38]</ref> for a comprehensive survey of static information flow analysis. Here, we focus on the most relevant ones.</p><p>Dependent Labels and Information Flow Security: Dependent types have been widely studied and have been applied to practical programming languages (e.g., <ref type="bibr">[7]</ref>, <ref type="bibr">[16]</ref>, <ref type="bibr">[32]</ref>, <ref type="bibr">[33]</ref>, <ref type="bibr">[42]</ref>, <ref type="bibr">[43]</ref>). New challenges emerge for information flow analysis, such as precise, sound handling of information channels arising from label changes.</p><p>For security type systems, the most related works are SecVerilog <ref type="bibr">[23]</ref>, <ref type="bibr">[45]</ref>, Lourenc &#184;o and Caires <ref type="bibr">[31]</ref> and Murray et al. <ref type="bibr">[32]</ref>. SecVerilog is a Verilog-like language with dependent security labels for verifying timing-sensitive noninterference in hardware designs. The type systems in <ref type="bibr">[23]</ref>, <ref type="bibr">[45]</ref> are not purely static: they remove implicit declassification by a run-time enforcement that modifies program semantics. A recent extension to SecVerilog <ref type="bibr">[22]</ref> alleviates such limitation by hardware-specific static reasoning. However, those type systems do not handle loops (absent in hardware description languages), which gives rise to new challenges for soundness. Moreover, they are not flowsensitive. The recent work <ref type="bibr">[31]</ref> also allows the security type to depend on runtime values. However, the system is flow-insensitive, and it does not have a modular design that allows tunable precision. Moreover, the language has limited expressiveness: it has no support for recursion, and it disallows dependence on mutable variables. Exploring dependent labels to their full extent exposes new challenges that we tackle in this work, such as implicit declassification. Murray et al. <ref type="bibr">[32]</ref> present a flow-sensitive dependent security type system for shared-memory programs. The type system enforces a stronger security property: timing-sensitive noninterference for concurrent programs. However, even when the extra complexity due to concurrency and timing sensitivity are factored out, extra precision in their system is achieved via a floating type system that tracks the typing environments and program states throughout the program. In comparison, our analysis achieves flow-sensitivity via a separate program transformation, which results in an arguably simpler type system. Moreover, for dependency on mutable variables, their system only allows a variable's security level to upgrade to a higher one, while our system allows a downgrade to a lower level when doing so is secure.</p><p>Some prior type systems for information flow also support limited forms of dependent labels <ref type="bibr">[25]</ref>, <ref type="bibr">[27]</ref>, <ref type="bibr">[33]</ref>, <ref type="bibr">[39]</ref>, <ref type="bibr">[40]</ref>, <ref type="bibr">[46]</ref>. The dependence on run-time program state, though, is absent in most of these, and most of them are flow-and path-insensitive.</p><p>Flow-Sensitive Information Flow Analysis: Flowsensitive information flow control <ref type="bibr">[8]</ref>, <ref type="bibr">[26]</ref>, <ref type="bibr">[37]</ref> allows security labels to change over the course of computation. Those systems rely on a floating type system or a run-time monitor to track the security labels at each program point. On the other hand, the program transformation in our paper eliminates analysis imprecision due to flow-insensitivity. Moreover, the bracketed assignments in a source program provide tunable control for needed analysis precision. These features offer better flexibility and make it possible to turn a flow-insensitive analysis to be flow-sensitive.</p><p>Semantic-Based Information Flow Analysis: Another direction of information flow security is to verify the semantic definition of noninterference based on program logics. The first work that used a Hoare-style semantics to reason about information flow is by Andrews and Reitman <ref type="bibr">[5]</ref>. Independence analysis based on customized logics <ref type="bibr">[2]</ref>- <ref type="bibr">[4]</ref> was proposed to check whether two variables are independent or not. Self-Composition <ref type="bibr">[9]</ref>, <ref type="bibr">[18]</ref> composes a program with a copy of itself, where all variables are renamed. The insight is that noninterference of a program P can be reduced to a safety property for the self-composition form of P .</p><p>Relational Hoare Logic <ref type="bibr">[13]</ref> was first introduced for a core imperative program to reason about the relation of two program executions. It was later extended to verify security proofs of cryptographic constructions <ref type="bibr">[11]</ref> and differential privacy of randomized algorithms <ref type="bibr">[10]</ref>, <ref type="bibr">[12]</ref>. In the context of information flow security, Relational Hoare Type Theory <ref type="bibr">[34]</ref> extends Hoare Type Theory and has been used to reason about advanced information flow policies.</p><p>Though some semantic-based information flow analyses are flow-and path-sensitive, most mechanisms incur heavy annotation burden and steep learning curve on programmers. We believe our approach shows that it is not necessary to resort to those heavyweight methods to achieve both flowand path-sensitivity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IX. CONCLUSIONS AND FUTURE WORK</head><p>This paper presents a sound yet flow-and path-sensitive information flow analysis. The proposed analysis consists of a novel program transformation as well as a dependent security type system that rigorously controls information flow. We show that our analysis is both flow-and pathsensitive. Compared with existing work, we show that our analysis is strictly more precise than a classic flow-sensitive type system, and it tackles the tricky implicit declassification issue completely at the compile time. Moreover, the novel design of our analysis allows a user to control the analysis precision as desired. We believe our analysis offers a lightweight approach to static information flow analysis along with improved precision.</p><p>The proposed analysis alleviates analysis imprecision due to data-and path-sensitivity, but it still may suffer from other sources of imprecision, such as the presence of insecure dead code and false control-flow dependency. For example, consider the secure program in Figure <ref type="figure">16</ref> (simplified from an example in <ref type="bibr">[14]</ref>) with security labels s : S, p : P. In this example, although x is updated under a confidential branch condition, both branches result in the same state where x = 1; thus, the outcome of p is independent of the value of s. However, our analysis rejects this program since rule (T-ASSIGN) conservatively assumes that any public variable modified in a confidential branch would leak information. Motivated by the type system in <ref type="bibr">[14]</ref>, a promising direction that we plan to investigate is to incorporate sophisticated static program analyses so that the implicit flows can be ignored for the variables whose values are independent of branch outcomes. Additionally, hybrid information flow monitors (e.g., <ref type="bibr">[6]</ref>, <ref type="bibr">[14]</ref>, <ref type="bibr">[37]</ref>) are shown to be more precise than static flow-sensitive type systems. We plan to compare the analysis precision with those systems in our future work.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>&#8226; if (e) then c 1 else c 2 : by the HS typing rule, pc HS &#915;{c1 }&#915; 1 &#8743; pc HS &#915;{c 2 }&#915; 2 &#8743; &#915; = &#915; 1 &#915; 2 . By the transformation rules, c i , A c i , A i , i &#8712; {1, 2}. Suppose A(v) = A 1 (v), A 1 (v)must be a fresh variable generated in c 1 , and hence, cannot be in A 2 . By the definition of &#934;, A 3 (v) must be fresh. This contradicts the assumption A(v) = A (v). Hence, &#915;(v) = &#915; 1 (v)</p></note>
		</body>
		</text>
</TEI>
