skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, July 12 until 2:00 AM ET on Saturday, July 13 due to maintenance. We apologize for the inconvenience.


Title: Monadic and comonadic aspects of dependency analysis
Dependency analysis is vital to several applications in computer science. It lies at the essence of secure information flow analysis, binding-time analysis, etc. Various calculi have been proposed in the literature for analysing individual dependencies. Abadi et. al., by extending Moggi’s monadic metalanguage, unified several of these calculi into the Dependency Core Calculus (DCC). DCC has served as a foundational framework for dependency analysis for the last two decades. However, in spite of its success, DCC has its limitations. First, the monadic bind rule of the calculus is nonstandard and relies upon an auxiliary protection judgement. Second, being of a monadic nature, the calculus cannot capture dependency analyses that possess a comonadic nature, for example, the binding-time calculus, λ ∘ , of Davies. In this paper, we address these limitations by designing an alternative dependency calculus that is inspired by standard ideas from category theory. Our calculus is both monadic and comonadic in nature and subsumes both DCC and λ ∘ . Our construction explains the nonstandard bind rule and the protection judgement of DCC in terms of standard categorical concepts. It also leads to a novel technique for proving correctness of dependency analysis. We use this technique to present alternative proofs of correctness for DCC and λ ∘ .  more » « less
Award ID(s):
2006535 1703835
NSF-PAR ID:
10464096
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the ACM on Programming Languages
Volume:
6
Issue:
OOPSLA2
ISSN:
2475-1421
Page Range / eLocation ID:
1320 to 1348
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Over twenty years ago, Abadi et al. established the Dependency Core Calculus (DCC) as a general purpose framework for analyzing dependency in typed programming languages. Since then, dependency analysis has shown many practical benefits to language design: its results can help users and compilers enforce security constraints, eliminate dead code, among other applications. In this work, we present a Dependent Dependency Calculus (DDC), which extends this general idea to the setting of a dependently-typed language. We use this calculus to track both run-time and compile-time irrelevance, enabling faster typechecking and program execution. 
    more » « less
  2. Abstract The research on gradual typing has led to many variations on the Gradually Typed Lambda Calculus (GTLC) of Siek & Taha (2006) and its underlying cast calculus. For example, Wadler and Findler (2009) added blame tracking, Siek et al . (2009) investigated alternate cast evaluation strategies, and Herman et al . (2010) replaced casts with coercions for space efficiency. The meta-theory for the GTLC has also expanded beyond type safety to include blame safety (Tobin-Hochstadt & Felleisen, 2006), space consumption (Herman et al ., 2010), and the gradual guarantees (Siek et al ., 2015). These results have been proven for some variations of the GTLC but not others. Furthermore, researchers continue to develop variations on the GTLC, but establishing all of the meta-theory for new variations is time-consuming. This article identifies abstractions that capture similarities between many cast calculi in the form of two parameterized cast calculi, one for the purposes of language specification and the other to guide space-efficient implementations. The article then develops reusable meta-theory for these two calculi, proving type safety, blame safety, the gradual guarantees, and space consumption. Finally, the article instantiates this meta-theory for eight cast calculi including five from the literature and three new calculi. All of these definitions and theorems, including the two parameterized calculi, the reusable meta-theory, and the eight instantiations, are mechanized in Agda making extensive use of module parameters and dependent records to define the abstractions. 
    more » « less
  3. Man-at-the-end (MATE) attacks against software programs are difficult to protect. Adversaries have complete access to the binary program and can run it under both static and dynamic analysis to find and break any software protection mechanisms put in place. Even though full-proof protection is not possible practically or theoretically, the goal of software protection should be to make it more difficult for an adversary to find program secrets by increasing either their monetary cost or time. Protection mechanisms must be easy to integrate into the software development lifecycle, or else they are of little to no use. In this paper, we evaluate the practical security of a watermarking technique known as Weaver, which is intended to support software watermarking based on a new transformation technique called executable steganography. Weaver allows hiding of identification marks directly into a program binary in a way that makes it difficult for an adversary to find and remove. We performed instruction frequency analysis on 106 programs from the GNU coreutils package to understand and define Weaver’s limitations and strengths as a watermarking technique. Our evaluation revealed that the initial prototype version of Weaver suffers from limitations in terms of standard benchmarks for steganography evaluation, such as its stealth. We found that this initial prototype of Weaver relied heavily on one type of instruction that does not frequently occur in standard programs, namely the mov instruction with an 8-byte immediate operand. Our instruction frequency analysis revealed a negative impact due to Weaver’s over-reliance on this mov instruction. 
    more » « less
  4. Man-at-the-end (MATE) attacks against software programs are difficult to protect. Adversaries have complete access to the binary program and can run it under both static and dynamic analysis to find and break any software protection mechanisms put in place. Even though full-proof protection is not possible practically or theoretically, the goal of software protection should be to make it more difficult for an adversary to find program secrets by increasing either their monetary cost or time. Protection mechanisms must be easy to integrate into the software development lifecycle, or else they are of little to no use. In this paper, we evaluate the practical security of a watermarking technique known as Weaver, which is intended to support software watermarking based on a new transformation technique called executable steganography. Weaver allows hiding of identification marks directly into a program binary in a way that makes it difficult for an adversary to find and remove. We performed instruction frequency analysis on 106 programs from the GNU coreutils package to understand and define Weaver’s limitations and strengths as a watermarking technique. Our evaluation revealed that the initial prototype version of Weaver suffers from limitations in terms of standard benchmarks for steganography evaluation, such as its stealth. We found that this initial prototype of Weaver relied heavily on one type of instruction that does not frequently occur in standard programs, namely the mov instruction with an 8-byte immediate operand. Our instruction frequency analysis revealed a negative impact due to Weaver’s over-reliance on this mov instruction. 
    more » « less
  5. To date, the most effective approach to compiling strict, higher-order functional languages (such as OCaml, Scheme, and SML) has been to use whole-program techniques to convert the program to a first-order monomorphic representation that can be optimized using traditional compilation techniques. This approach, popularized by MLton, has limitations, however. We are interested in exploring a different approach to compiling such languages, one that preserves the higher-order and polymorphic character of the program throughout optimization. To enable such an approach, we must have effective analyses that both provide precise information about higher-order programs and that scale to larger units of compilation. This paper describes one such analysis for determining the extent of variable bindings. We classify the extent of variables as either register (only one binding instance can be live at any time), stack (the lifetimes of binding instances obey a LIFO order), or heap (binding lifetimes are arbitrary). These extents naturally connect variables to the machine resources required to represent them. We believe that precise information about binding extents will enable efficient management of environments, which is a key problem in the efficient compilation of higher-order programs. At the core of the paper is the 3CPS intermediate representation, which is a factored CPS-based intermediate representation (IR) that statically marks variables to indicate their binding extent. We formally specify the management of this binding structure by means of a small-step operational semantics and define a static analysis that determines the extents of the variables in a program. We evaluate our analysis using a standard suite of SML benchmark programs. Our implementation gets surprisingly high yield and exhibits scalable performance. While this paper uses a CPS-based IR, the algorithm and results are easily transferable to other λ-calculus IRs, such as ANF. 
    more » « less