Lazy evaluation is a powerful tool for functional programmers. It enables the concise expression of on-demand computation and a form of compositionality not available under other evaluation strategies. However, the stateful nature of lazy evaluation makes it hard to analyze a program's computational cost, either informally or formally. In this work, we present a novel and simple framework for formally reasoning about lazy computation costs based on a recent model of lazy evaluation: clairvoyant call-by-value. The key feature of our framework is its simplicity, as expressed by our definition of the clairvoyance monad. This monad is both simple to define (around 20 lines of Coq) and simple to reason about. We show that this monad can be effectively used to mechanically reason about the computational cost of lazy functional programs written in Coq.
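The monad itself is defined in Coq in the paper; purely as an illustration of the idea, and not the authors' definition, the following Haskell sketch models a computation as a set of possible (result, cost) runs, with an explicit `tick` for charging cost and a `thunk` combinator that clairvoyantly either pays for a computation up front or skips it in the hope that its value is never needed.

```haskell
import Control.Monad (ap)

-- A computation nondeterministically yields any of several (result, cost) runs.
newtype Clair a = Clair { runClair :: [(a, Int)] }

instance Functor Clair where
  fmap f (Clair xs) = Clair [ (f a, c) | (a, c) <- xs ]

instance Applicative Clair where
  pure a = Clair [(a, 0)]
  (<*>)  = ap

instance Monad Clair where
  Clair xs >>= k =
    Clair [ (b, c1 + c2) | (a, c1) <- xs, (b, c2) <- runClair (k a) ]

-- Charge one unit of cost.
tick :: Clair ()
tick = Clair [((), 1)]

-- A thunk either holds an evaluated value or was skipped entirely.
data T a = Thunk a | Skipped

-- Clairvoyant choice: pay for the computation now, or skip it.
thunk :: Clair a -> Clair (T a)
thunk m = Clair ([ (Thunk a, c) | (a, c) <- runClair m ] ++ [(Skipped, 0)])

-- Forcing a skipped thunk aborts that run; it contributes no results.
force :: T a -> Clair a
force (Thunk a) = pure a
force Skipped   = Clair []

-- The cost of a program is that of its cheapest successful run.
cost :: Clair a -> Maybe Int
cost (Clair xs) = case map snd xs of { [] -> Nothing; cs -> Just (minimum cs) }
```

Runs in which a needed thunk was skipped die off at `force`, so taking the minimum over the surviving runs recovers the usual clairvoyant call-by-value reading of cost.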
Faster, Simpler Red-Black Trees
For more than two decades, functional programmers have refined the persistent red-black tree—a data structure of unrivaled elegance. This paper presents another step in its evolution. Using a monad to communicate balancing information yields a fast insertion procedure, without sacrificing clarity. Employing the same monad plus a new decomposition simplifies the deletion procedure, without sacrificing efficiency.
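The paper's monadic balancing and its new deletion decomposition are not reproduced here; as a baseline only, the sketch below is the classic Okasaki-style persistent red-black insertion that such refinements start from.

```haskell
data Color = R | B
data Tree a = E | T Color (Tree a) a (Tree a)

-- Classic persistent insertion: insert as in a BST, repair red-red
-- violations on the way back up, and blacken the root.
insert :: Ord a => a -> Tree a -> Tree a
insert x t = blacken (go t)
  where
    go E = T R E x E
    go s@(T c l y r)
      | x < y     = balance c (go l) y r
      | x > y     = balance c l y (go r)
      | otherwise = s
    blacken (T _ l y r) = T B l y r
    blacken E           = E

-- Rotate away any red-red violation created one level below.
balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance B (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance B (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance B a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance B a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance c l x r                     = T c l x r
```

In this classic version every call to `balance` re-inspects the freshly built subtree; threading balancing information through a monad instead, as the abstract describes, is the paper's route to a faster insertion without giving up this clarity.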
- Award ID(s): 2116372
- PAR ID: 10538235
- Editor(s): Chang, Stephen
- Publisher / Repository: https://link.springer.com/chapter/10.1007/978-3-031-38938-2_3
- Date Published:
- ISBN: 978-3-031-38937-5
- Format(s): Medium: X
- Location: Springer Verlag
- Sponsoring Org: National Science Foundation
More Like this
Semidefinite programming (SDP) is a powerful tool for tackling a wide range of computationally hard problems such as clustering. Despite their high accuracy, semidefinite programs are often too slow in practice, with poor scalability on large (or even moderate) datasets. In this paper, we introduce a linear-time algorithm for approximating an SDP-relaxed K-means clustering. The proposed sketch-and-lift (SL) approach solves an SDP on a subsampled dataset and then propagates the solution to all data points by a nearest-centroid rounding procedure. It is shown that the SL approach enjoys a similar exact recovery threshold as the K-means SDP on the full dataset, which is known to be information-theoretically tight under the Gaussian mixture model. The SL method can be made adaptive, with enhanced theoretical properties, when the cluster sizes are unbalanced. Our simulation experiments demonstrate that the statistical accuracy of the proposed method outperforms state-of-the-art fast clustering algorithms without sacrificing too much computational efficiency, and is comparable to that of the original K-means SDP at a substantially reduced runtime.
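To make the "lift" half of the approach concrete, here is an illustrative Haskell sketch of the nearest-centroid rounding step, assuming the SDP solve on the subsample has already produced cluster labels; the names and data representation are invented for this sketch and are not taken from the paper.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)
import qualified Data.Map.Strict as M

type Point = [Double]

-- Centroid of each subsample cluster, given (point, label) pairs
-- obtained from the (assumed) SDP solve on the subsample.
centroids :: [(Point, Int)] -> [Point]
centroids labeled = [ mean ps | (_, ps) <- M.toAscList groups ]
  where
    groups  = M.fromListWith (++) [ (k, [p]) | (p, k) <- labeled ]
    mean ps = map (/ fromIntegral (length ps)) (foldr1 (zipWith (+)) ps)

-- Lift: assign every point of the full dataset to its nearest centroid.
liftLabels :: [Point] -> [Point] -> [Int]
liftLabels cs = map nearest
  where
    nearest p = fst (minimumBy (comparing snd)
                       [ (i, dist2 p c) | (i, c) <- zip [0 ..] cs ])
    dist2 p q = sum [ (a - b) ^ (2 :: Int) | (a, b) <- zip p q ]
```

Here `liftLabels (centroids labeledSubsample) fullData` would play the role of the propagation step, at a cost linear in the size of the full dataset.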
An adjunction is a pair of functors related by a pair of natural transformations, relating a pair of categories. It displays how a structure, or a concept, projects from each category to the other, and back. Adjunctions are the common denominator of Galois connections, representation theories, spectra, and generalized quantifiers. We call an adjunction nuclear when its categories determine each other. We show that every adjunction can be resolved into a nuclear adjunction. This resolution is idempotent in a strong sense. The nucleus of an adjunction displays its conceptual core, just as the singular value decomposition of an adjoint pair of linear operators displays their canonical bases. The two composites of an adjoint pair of functors induce a monad and a comonad. Monads and comonads generalize the closure and interior operators from topology, or the modalities from logic, while providing a saturated view of algebraic structures and compositions on one side, and of coalgebraic dynamics and decompositions on the other. They are resolved back into adjunctions over the induced categories of algebras and of coalgebras. The nucleus of an adjunction is an adjunction between the induced categories of algebras and coalgebras. It provides new presentations for both, revealing the meaning of constructing algebras for a comonad and coalgebras for a monad. In his seminal early work, Ross Street described an adjunction between monads and comonads in 2-categories. Lifting the nucleus construction, we show that the resulting Street monad on monads is strongly idempotent and extracts the nucleus of a monad. A dual treatment achieves the same for comonads. Applying a notable fragment of pure 2-category theory to an acute practical problem of data analysis thus led to a new theoretical result.
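As a small, concrete instance of this setup (an illustration only, not drawn from the paper): in Haskell, the "pair with s" functor is left adjoint to the "functions from s" functor, the bijection between hom-sets is currying, and the two composites are the familiar State monad and Store comonad.

```haskell
-- Unit and counit of the adjunction (- × s) ⊣ (s -> -).
unit :: a -> (s -> (a, s))
unit a = \s -> (a, s)

counit :: (s -> b, s) -> b
counit (f, s) = f s

-- The natural isomorphism Hom(a × s, b) ≅ Hom(a, s -> b) is currying.
leftAdjunct :: ((a, s) -> b) -> (a -> (s -> b))
leftAdjunct = curry

rightAdjunct :: (a -> (s -> b)) -> ((a, s) -> b)
rightAdjunct = uncurry

-- The two composites of the adjoint pair: right-after-left is the State
-- monad, left-after-right is the Store comonad, matching "the two
-- composites of an adjoint pair of functors induce a monad and a comonad".
newtype State s a = State (s -> (a, s))
newtype Store s a = Store (s -> a, s)
```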
This paper presents λ-amor, a new type-theoretic framework for amortized cost analysis of higher-order functional programs, and shows that existing type systems for cost analysis can be embedded in it. λ-amor introduces a new modal type for representing potentials: costs that have been accounted for but not yet incurred, which are central to amortized analysis. Additionally, λ-amor relies on standard type-theoretic concepts like affineness, refinement types, and an indexed cost monad. λ-amor is proved sound using a rather simple logical relation. We embed two existing type systems for cost analysis in λ-amor, showing that, despite its simplicity, λ-amor can simulate cost analysis for different evaluation strategies (call-by-name and call-by-value), in different styles (effect-based and coeffect-based), and with or without amortization. One of the embeddings also implies that λ-amor is relatively complete for all terminating PCF programs.
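λ-amor is a type theory rather than a library, so its rules cannot be quoted as running code; purely as an illustrative companion, the Haskell two-list queue below is the textbook amortized example that potential-based systems of this kind are built to verify, with the informal potential argument sketched in comments (none of this is part of λ-amor itself).

```haskell
-- Batched (two-list) queue: the standard target of amortized cost analysis.
-- Potential = length of the back list.
data Queue a = Queue [a] [a]   -- front list, reversed back list

empty :: Queue a
empty = Queue [] []

-- O(1) worst case; deposits one unit of potential on the back list.
enqueue :: a -> Queue a -> Queue a
enqueue x (Queue f b) = Queue f (x : b)

-- Usually O(1); occasionally reverses the back list, and that reversal is
-- paid for by the potential deposited by earlier enqueues, so the
-- amortized cost per operation is O(1).
dequeue :: Queue a -> Maybe (a, Queue a)
dequeue (Queue (x : f) b) = Just (x, Queue f b)
dequeue (Queue []      b) = case reverse b of
  []      -> Nothing
  (x : f) -> Just (x, Queue f [])
```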
Image datasets in specialized fields of science, such as biomedicine, are typically smaller than traditional machine learning datasets. As such, they present a problem for training many models. To address this challenge, researchers often attempt to incorporate priors, i.e., external knowledge, to help the learning procedure. Geometric priors, for example, restrict the learning process to the manifold to which the data belong. However, learning on manifolds is sometimes computationally intensive to the point of being prohibitive. Here, we ask a provocative question: is machine learning on manifolds really so much more accurate than its linear counterpart that it is worth sacrificing a significant computational speedup? We answer this question through an extensive theoretical and experimental study of one of the most common learning methods for manifold-valued data: geodesic regression.