This article presents a typebased analysis for deriving upper bounds
on the expected execution cost of probabilistic programs.
The analysis is naturally compositional, parametric in the cost
model, and supports higherorder functions and inductive data types.
The derived bounds are multivariate polynomials that are functions of
data structures.
Bound inference is enabled by local type rules that reduce type
inference to linear constraint solving.
The type system is based on the potential method of amortized analysis
and extends automatic amortized resource
analysis (AARA) for deterministic programs.
A main innovation is that bounds can contain symbolic probabilities,
which may appear in data structures and function arguments.
Another contribution is a novel soundness proof that establishes the
correctness of the derived bounds with respect to a distributionbased
operational cost semantics that also includes nontrivial diverging
behavior.
For cost models like time, derived bounds imply termination with
probability one.
To highlight the novel ideas, the presentation focuses on linear
potential and a core language.
However, the analysis is implemented as an extension of Resource Aware
ML and supports polynomial bounds and user defined data structures.
The effectiveness of the technique is evaluated by analyzing the
sample complexity of discrete distributions
and with a novel averagecase estimation for deterministic programs
that combines expected cost analysis with statistical methods.
more »
« less
Amortized Analysis via Coinduction
Amortized analysis is a program cost analysis technique for data structures in which the cost of operations is specified in aggregate, under the assumption of continued sequential use. Typically, amortized analyses are presented inductively, in terms of finite sequences of operations. We give an alternative coinductive formulation and prove that it is equivalent to the standard inductive definition. We describe a classic amortized data structure, the batched queue, and outline a coinductive proof of its amortized efficiency in calf, a dependent type theory for cost analysis.
more »
« less
 Award ID(s):
 1901381
 NSFPAR ID:
 10442702
 Date Published:
 Journal Name:
 10th Conference on Algebra and Coalgebra in Computer Science
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


null (Ed.)This paper presents λamor, a new typetheoretic framework for amortized cost analysis of higherorder functional programs and shows that existing type systems for cost analysis can be embedded in it. λamor introduces a new modal type for representing potentials – costs that have been accounted for, but not yet incurred, which are central to amortized analysis. Additionally, λamor relies on standard typetheoretic concepts like affineness, refinement types and an indexed cost monad. λamor is proved sound using a rather simple logical relation. We embed two existing type systems for cost analysis in λamor showing that, despite its simplicity, λamor can simulate cost analysis for different evaluation strategies (callbyname and callbyvalue), in different styles (effectbased and coeffectbased), and with or without amortization. One of the embeddings also implies that λamor is relatively complete for all terminating PCF programs.more » « less

Bufferandflush is a technique for transforming standard externalmemory search trees into writeoptimized search trees. In exchange for faster amortized insertions, bufferandflush can sometimes significantly increase the latency of operations by causing cascades of flushes. In this paper, we show that flushing cascades are not a fundamental consequence of the bufferflushing technique, and can be removed entirely using randomization techniques. The underlying implementation of buffer flushing relies on a buffereviction strategy at each node in the tree. The ability for the user to select the buffer eviction strategy based on the workload has been shown to be important for performance, both in theory and in practice. In order to support arbitrary buffereviction strategies, we introduce the notion of a universal flush, which uses a universal eviction policy that can simulate any other eviction policy. This abstracts away the underlying eviction strategy, even allowing for workloadspecific strategies that change dynamically. Our deamortization preserves the amortized throughput of the underlying flushing strategy on all workloads. In particular, with our deamortization and a node cache of size polylogarithmic in the number of insertions performed on the tree, the amortized insertion cost matches the lower bound of Brodal and Fagerberg. For typical parameters, the lower bound is less than 1 I/O per insertion. For such parameters, our worstcase insertion cost is O(1) I/Os.more » « less

We present calf , a c ost a ware l ogical f ramework for studying quantitative aspects of functional programs. Taking inspiration from recent work that reconstructs traditional aspects of programming languages in terms of a modal account of phase distinctions , we argue that the cost structure of programs motivates a phase distinction between intension and extension . Armed with this technology, we contribute a synthetic account of cost structure as a computational effect in which costaware programs enjoy an internal noninterference property: input/output behavior cannot depend on cost. As a fullspectrum dependent type theory, calf presents a unified language for programming and specification of both cost and behavior that can be integrated smoothly with existing mathematical libraries available in type theoretic proof assistants. We evaluate calf as a general framework for cost analysis by implementing two fundamental techniques for algorithm analysis: the method of recurrence relations and physicist’s method for amortized analysis . We deploy these techniques on a variety of case studies: we prove a tight, closed bound for Euclid’s algorithm, verify the amortized complexity of batched queues, and derive tight, closed bounds for the sequential and parallel complexity of merge sort, all fully mechanized in the Agda proof assistant. Lastly we substantiate the soundness of quantitative reasoning in calf by means of a model construction.more » « less

There are two approaches to automatically deriving symbolic worstcase resource bounds for programs: static analysis of the source code and datadriven analysis of cost measurements obtained by running the program. Static resource analysis is usually sound but incomplete. Datadriven analysis can always return a result, but its lack of robustness often leads to unsound results. This paper presents the design, implementation, and empirical evaluation of hybrid resource bound analyses that tightly integrate static analysis and datadriven analysis. The static analysis part builds on automatic amortized resource analysis (AARA), a stateoftheart typebased resource analysis method that performs cost bound inference using linear optimization. The datadriven part is rooted in novel Bayesian modeling and inference techniques that improve upon previous datadriven analysis methods by reporting an entire probability distribution over likely resource cost bounds. A key innovation is a new type inference system calledmore » « less
Hybrid AARA that coherently integrates Bayesian inference into conventional AARA, combining the strengths of both approaches. Hybrid AARA is proven to be statistically sound under standard assumptions on the runtime cost data. An experimental evaluation on a challenging set of benchmarks shows that Hybrid AARA (i) effectively mitigates the incompleteness of purely static resource analysis; and (ii) is more accurate and robust than purely datadriven resource analysis.