Interactive proofs of theorems often require auxiliary helper lemmas to prove the desired theorem. Existing approaches for automatically synthesizing helper lemmas fall into two broad categories. Some approaches are goaldirected, producing lemmas specifically to help a user make progress from a given proof state, but they have limited expressiveness in terms of the lemmas that can be produced. Other approaches are highly expressive, able to generate arbitrary lemmas from a given grammar, but they are completely undirected and hence not amenable to interactive usage. In this paper, we develop an approach to lemma synthesis that is both goaldirected and expressive. The key novelty is a technique for reducing lemma synthesis to a datadriven program synthesis problem, whereby examples for synthesis are generated from the current proof state. We also describe a technique to systematically introduce new variables for lemma synthesis, as well as techniques for filtering and ranking candidate lemmas for presentation to the user. We implement these ideas in a tool called lfind, which can be run as a Coq tactic. In an evaluation on four benchmark suites, lfind produces useful lemmas in 68% of the cases where a human prover used a lemma to make progress. In these cases lfind synthesizes a lemma that either enables a fully automated proof of the original goal or that matches the humanprovided lemma.
more »
« less
Datadriven lemma synthesis for interactive proofs
Interactive proofs of theorems often require auxiliary helper lemmas to prove the desired theorem. Existing approaches for automatically synthesizing helper lemmas fall into two broad categories. Some approaches are goaldirected, producing lemmas specifically to help a user make progress from a given proof state, but they have limited expressiveness in terms of the lemmas that can be produced. Other approaches are highly expressive, able to generate arbitrary lemmas from a given grammar, but they are completely undirected and hence not amenable to interactive usage. In this paper, we develop an approach to lemma synthesis that is both goaldirected and expressive. The key novelty is a technique for reducing lemma synthesis to a datadriven program synthesis problem, whereby examples for synthesis are generated from the current proof state. We also describe a technique to systematically introduce new variables for lemma synthesis, as well as techniques for filtering and ranking candidate lemmas for presentation to the user. We implement these ideas in a tool called lfind, which can be run as a Coq tactic. In an evaluation on four benchmark suites, lfind produces useful lemmas in 68% of the cases where a human prover used a lemma to make progress. In these cases lfind synthesizes a lemma that either enables a fully automated proof of the original goal or that matches the humanprovided lemma.
more »
« less
 NSFPAR ID:
 10441179
 Date Published:
 Journal Name:
 Proceedings of the ACM on programming languages
 Volume:
 6
 Issue:
 OOPSLA2
 ISSN:
 24751421
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Querytocommunication lifting theorems translate lower bounds on query complexity to lower bounds for the corresponding communication model. In this paper, we give a simplified proof of deterministic lifting (in both the treelike and daglike settings). Our proof uses elementary counting together with a novel connection to the sunflower lemma. In addition to a simplified proof, our approach opens up a new avenue of attack towards proving lifting theorems with improved gadget size  one of the main challenges in the area. Focusing on one of the most widely used gadgets  the index gadget  existing lifting techniques are known to require at least a quadratic gadget size. Our new approach combined with robust sunflower lemmas allows us to reduce the gadget size to near linear. We conjecture that it can be further improved to polylogarithmic, similar to the known bounds for the corresponding robust sunflower lemmas.more » « less

We propose a new conflictdriven program synthesis technique that is capable of learning from past mistakes. Given a spurious program that violates the desired specification, our synthesis algorithm identifies the root cause of the conflict and learns new lemmas that can prevent similar mistakes in the future. Specifically, we introduce the notion of equivalence modulo conflict and show how this idea can be used to learn useful lemmas that allow the synthesizer to prune large parts of the search space. We have implemented a general purpose CDCLstyle program synthesizer called Neo and evaluate it in two different application domains, namely data wrangling in R and functional programming over lists. Our experiments demonstrate the substantial benefits of conflict driven learning and show that Neo outperforms two stateoftheart synthesis tools, Morpheus and DeepCoder, that target these respective domainsmore » « less

Automated deductive program synthesis promises to generate executable programs from concise specifications, along with proofs of correctness that can be independently verified using thirdparty tools. However, an attempt to exercise this promise using existing proofcertification frameworks reveals significant discrepancies in how proof derivations are structured for two different purposes: program synthesis and program verification. These discrepancies make it difficult to use certified verifiers to validate synthesis results, forcing one to write an adhoc translation procedure from synthesis proofs to correctness proofs for each verification backend. In this work, we address this challenge in the context of the synthesis and verification of heapmanipulating programs. We present a technique for principled translation of deductive synthesis derivations (a.k.a. source proofs) into deductive target proofs about the synthesised programs in the logics of interactive program verifiers. We showcase our technique by implementing three different certifiers for programs generated via SuSLik, a Separation Logicbased tool for automated synthesis of programs with pointers, in foundational verification frameworks embedded in Coq: Hoare Type Theory (HTT), Iris, and Verified Software Toolchain (VST), producing concise and efficient machinecheckable proofs for characteristic synthesis benchmarks.more » « less

One vision for program synthesis, and specifically for programming by example (PBE), is an interactive programmer's assistant, integrated into the development environment. To make program synthesis practical for interactive use, prior work on SmallStep Live PBE has proposed to limit the scope of synthesis to small code snippets, and enable the users to provide local specifications for those snippets. This paradigm, however, does not work well in the presence of loops. We present LooPy, a synthesizer integrated into a live programming environment, which extends SmallStep Live PBE to work inside loops and scales it up to synthesize larger code snippets, while remaining fast enough for interactive use. To allow users to effectively provide examples at various loop iterations, even when the loop body is incomplete, LooPy makes use of live execution , a technique that leverages the programmer as an oracle to step over incomplete parts of the loop. To enable synthesis of loop bodies at interactive speeds, LooPy introduces Intermediate State Graph , a new data structure, which compactly represents a large space of code snippets composed of multiple assignment statements and conditionals. We evaluate LooPy empirically using benchmarks from competitive programming and previous synthesizers, and show that it can solve a wide variety of synthesis tasks at interactive speeds. We also perform a small qualitative user study which shows that LooPy's blocklevel specifications are easy for programmers to provide.more » « less