skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Fast Constraint Synthesis for C++ Function Templates
C++ templates are a powerful feature for generic programming and compile-time computations, but C++ compilers often emit overly verbose template error messages. Even short error messages often involve unnecessary and confusing implementation details, which are difficult for developers to read and understand. To address this problem, C++20 introduced constraints and concepts, which impose requirements on template parameters. The new features can define clearer interfaces for templates and can improve compiler diagnostics. However, manually specifying template constraints can still be non-trivial, which becomes even more challenging when working with legacy C++ projects or with frequent code changes. This paper bridges the gap and proposes an automatic approach to synthesizing constraints for C++ function templates. We utilize a lightweight static analysis to analyze the usage patterns within the template body and summarize them into constraints for each type parameter of the template. The analysis is inter-procedural and uses disjunctions of constraints to model function overloading. We have implemented our approach based on the Clang frontend and evaluated it on two C++ libraries chosen separately from two popular library sets: algorithm from the Standard Template Library (STL) and special functions from the Boost library, both of which extensively use templates. Our tool can process over 110k lines of C++ code in less than 1.5 seconds and synthesize non-trivial constraints for 30%-40% of the function templates. The constraints synthesized for algorithm align well with the standard documentation, and on average, the synthesized constraints can reduce error message lengths by 56.6% for algorithm and 63.8% for special functions.  more » « less
Award ID(s):
2114627 2237440
PAR ID:
10642077
Author(s) / Creator(s):
;
Publisher / Repository:
Association for Computing Machinery (ACM)
Date Published:
Journal Name:
Proceedings of the ACM on Programming Languages
Volume:
9
Issue:
OOPSLA1
ISSN:
2475-1421
Format(s):
Medium: X Size: p. 225-252
Size(s):
p. 225-252
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    A bstract There is a rich connection between classical error-correcting codes, Euclidean lattices, and chiral conformal field theories. Here we show that quantum error-correcting codes, those of the stabilizer type, are related to Lorentzian lattices and non-chiral CFTs. More specifically, real self-dual stabilizer codes can be associated with even self-dual Lorentzian lattices, and thus define Narain CFTs. We dub the resulting theories code CFTs and study their properties. T-duality transformations of a code CFT, at the level of the underlying code, reduce to code equivalences. By means of such equivalences, any stabilizer code can be reduced to a graph code. We can therefore represent code CFTs by graphs. We study code CFTs with small central charge c = n ≤ 12, and find many interesting examples. Among them is a non-chiral E 8 theory, which is based on the root lattice of E 8 understood as an even self-dual Lorentzian lattice. By analyzing all graphs with n ≤ 8 nodes we find many pairs and triples of physically distinct isospectral theories. We also construct numerous modular invariant functions satisfying all the basic properties expected of the CFT partition function, yet which are not partition functions of any known CFTs. We consider the ensemble average over all code theories, calculate the corresponding partition function, and discuss its possible holographic interpretation. The paper is written in a self-contained manner, and includes an extensive pedagogical introduction and many explicit examples. 
    more » « less
  2. null (Ed.)
    Differential privacy has become a de facto standard for releasing data in a privacy-preserving way. Creating a differentially private algorithm is a process that often starts with a noise-free (nonprivate) algorithm. The designer then decides where to add noise, and how much of it to add. This can be a non-trivial process – if not done carefully, the algorithm might either violate differential privacy or have low utility. In this paper, we present DPGen, a program synthesizer that takes in non-private code (without any noise) and automatically synthesizes its differentially private version (with carefully calibrated noise). Under the hood, DPGen uses novel algorithms to automatically generate a sketch program with candidate locations for noise, and then optimize privacy proof and noise scales simultaneously on the sketch program. Moreover, DPGen can synthesize sophisticated mechanisms that adaptively process queries until a specified privacy budget is exhausted. When evaluated on standard benchmarks, DPGen is able to generate differentially private mechanisms that optimize simple utility functions within 120 seconds. It is also powerful enough to synthesize adaptive privacy mechanisms. 
    more » « less
  3. Data movement between main memory and the CPU is a major bottleneck in parallel data-intensive applications. In response, researchers have proposed using compilers and intermediate representations (IRs) that apply optimizations such as loop fusion under existing high-level APIs such as NumPy and TensorFlow. Even though these techniques generally do not require changes to user applications, they require intrusive changes to the library itself: often, library developers must rewrite each function using a new IR. In this paper, we propose a new technique called split annotations (SAs) that enables key data movement optimizations over unmodified library functions. SAs only require developers to annotate functions and implement an API that specifies how to partition data in the library. The annotation and API describe how to enable cross-function data pipelining and parallelization, while respecting each function's correctness constraints. We implement a parallel runtime for SAs in a system called Mozart. We show that Mozart can accelerate workloads in libraries such as Intel MKL and Pandas by up to 15x, with no library modifications. Mozart also provides performance gains competitive with solutions that require rewriting libraries, and can sometimes outperform these systems by up to 2x by leveraging existing hand-optimized code. 
    more » « less
  4. Building on recent work in subregular syntax, we argue that syntactic constraints are best understood as operating not over trees, but rather strings that track structural relations such as dominance and c-command. Even constraints that seem intrinsically tied to trees (e.g. constraints on tree tiers) can be reduced to such strings. We define serial constraints as an abstraction that decomposes string constraints into a context function (which associates nodes with strings) and a requirement function (which enforces constraints on these strings). We provide a general procedure for implementing serial constraints as deterministic tree automata. The construction reveals that the many types of constraints found in subregular syntax are variants of the same computational template. Our findings open up a string-based perspective on syntactic constraints and provide a new, very general approach to the automata-theoretic study of subregular complexity. 
    more » « less
  5. It is well-known that GADTs do not admit standard map functions of the kind supported by ADTs and nested types. In addition, standard map functions are insufficient to distribute their data-changing argument functions over all of the structure present in elements of deep GADTs, even just deep ADTs or nested types. This paper develops an algorithm that characterizes those functions on a (deep) GADT’s type arguments that are mappable over its elements. The algorithm takes as input a term t whose type is an instance of a (deep) GADT G, and returns a set of constraints a function must satisfy to be mappable over t. This algorithm, and thus this paper, can in some sense be read as defining what it means for a function to be mappable over t: f is mappable over an element t of G precisely when it satisfies the constraints returned when our algorithm is run on t and G. This is significant: to our knowledge, there is no existing definition or other characterization of the intuitive notion of mappability for functions over GADTs. 
    more » « less