This article presents a type-based analysis for deriving upper bounds on the expected execution cost of probabilistic programs. The analysis is naturally compositional, parametric in the cost model, and supports higher-order functions and inductive data types. The derived bounds are multivariate polynomials that are functions of data structures. Bound inference is enabled by local type rules that reduce type inference to linear constraint solving. The type system is based on the potential method of amortized analysis and extends automatic amortized resource analysis (AARA) for deterministic programs. A main innovation is that bounds can contain symbolic probabilities, which may appear in data structures and function arguments. Another contribution is a novel soundness proof that establishes the correctness of the derived bounds with respect to a distribution-based operational cost semantics that also includes nontrivial diverging behavior. For cost models like time, derived bounds imply termination with probability one. To highlight the novel ideas, the presentation focuses on linear potential and a core language. However, the analysis is implemented as an extension of Resource Aware ML and supports polynomial bounds and user defined data structures. The effectiveness of the technique is evaluated by analyzing the sample complexity of discrete distributions and with a novel average-case estimation for deterministic programs that combines expected cost analysis with statistical methods.
more »
« less
Amortized Analysis via Coinduction
Amortized analysis is a program cost analysis technique for data structures in which the cost of operations is specified in aggregate, under the assumption of continued sequential use. Typically, amortized analyses are presented inductively, in terms of finite sequences of operations. We give an alternative coinductive formulation and prove that it is equivalent to the standard inductive definition. We describe a classic amortized data structure, the batched queue, and outline a coinductive proof of its amortized efficiency in calf, a dependent type theory for cost analysis.
more »
« less
- Award ID(s):
- 1901381
- PAR ID:
- 10442702
- Date Published:
- Journal Name:
- 10th Conference on Algebra and Coalgebra in Computer Science
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
This paper presents λ-amor, a new type-theoretic framework for amortized cost analysis of higher-order functional programs and shows that existing type systems for cost analysis can be embedded in it. λ-amor introduces a new modal type for representing potentials – costs that have been accounted for, but not yet incurred, which are central to amortized analysis. Additionally, λ-amor relies on standard type-theoretic concepts like affineness, refinement types and an indexed cost monad. λ-amor is proved sound using a rather simple logical relation. We embed two existing type systems for cost analysis in λ-amor showing that, despite its simplicity, λ-amor can simulate cost analysis for different evaluation strategies (call-by-name and call-by-value), in different styles (effect-based and coeffect-based), and with or without amortization. One of the embeddings also implies that λ-amor is relatively complete for all terminating PCF programs.more » « less
-
Buffer-and-flush is a technique for transforming standard external-memory search trees into write-optimized search trees. In exchange for faster amortized insertions, buffer-and-flush can sometimes significantly increase the latency of operations by causing cascades of flushes. In this paper, we show that flushing cascades are not a fundamental consequence of the buffer-flushing technique, and can be removed entirely using randomization techniques. The underlying implementation of buffer flushing relies on a buffer-eviction strategy at each node in the tree. The ability for the user to select the buffer eviction strategy based on the workload has been shown to be important for performance, both in theory and in practice. In order to support arbitrary buffer-eviction strategies, we introduce the notion of a universal flush, which uses a universal eviction policy that can simulate any other eviction policy. This abstracts away the underlying eviction strategy, even allowing for workload-specific strategies that change dynamically. Our deamortization preserves the amortized throughput of the underlying flushing strategy on all workloads. In particular, with our deamortization and a node cache of size poly-logarithmic in the number of insertions performed on the tree, the amortized insertion cost matches the lower bound of Brodal and Fagerberg. For typical parameters, the lower bound is less than 1 I/O per insertion. For such parameters, our worst-case insertion cost is O(1) I/Os.more » « less
-
There are two approaches to automatically deriving symbolic worst-case resource bounds for programs: static analysis of the source code and data-driven analysis of cost measurements obtained by running the program. Static resource analysis is usually sound but incomplete. Data-driven analysis can always return a result, but its lack of robustness often leads to unsound results. This paper presents the design, implementation, and empirical evaluation of hybrid resource bound analyses that tightly integrate static analysis and data-driven analysis. The static analysis part builds on automatic amortized resource analysis (AARA), a state-of-the-art type-based resource analysis method that performs cost bound inference using linear optimization. The data-driven part is rooted in novel Bayesian modeling and inference techniques that improve upon previous data-driven analysis methods by reporting an entire probability distribution over likely resource cost bounds. A key innovation is a new type inference system calledHybrid AARAthat coherently integrates Bayesian inference into conventional AARA, combining the strengths of both approaches. Hybrid AARA is proven to be statistically sound under standard assumptions on the runtime cost data. An experimental evaluation on a challenging set of benchmarks shows that Hybrid AARA (i) effectively mitigates the incompleteness of purely static resource analysis; and (ii) is more accurate and robust than purely data-driven resource analysis.more » « less
-
We presentcalf, acost-awarelogicalframework for studying quantitative aspects of functional programs. Taking inspiration from recent work that reconstructs traditional aspects of programming languages in terms of a modal account ofphase distinctions, we argue that the cost structure of programs motivates a phase distinction betweenintensionandextension. Armed with this technology, we contribute a synthetic account of cost structure as a computational effect in which cost-aware programs enjoy an internal noninterference property: input/output behavior cannot depend on cost. As a full-spectrum dependent type theory,calfpresents a unified language for programming and specification of both cost and behavior that can be integrated smoothly with existing mathematical libraries available in type theoretic proof assistants. We evaluatecalfas a general framework for cost analysis by implementing two fundamental techniques for algorithm analysis: themethod of recurrence relationsandphysicist’s method for amortized analysis. We deploy these techniques on a variety of case studies: we prove a tight, closed bound for Euclid’s algorithm, verify the amortized complexity of batched queues, and derive tight, closed bounds for the sequential andparallelcomplexity of merge sort, all fully mechanized in the Agda proof assistant. Lastly we substantiate the soundness of quantitative reasoning incalfby means of a model construction.more » « less
An official website of the United States government

