Title: Trustworthy Formal Natural Language Specifications
Interactive proof assistants are computer programs carefully constructed to check a human-designed proof of a mathematical claim with high confidence in the implementation. However, this only validates the truth of a formal claim, which may have been mistranslated from a claim made in natural language. This is especially problematic when using proof assistants to formally verify the correctness of software with respect to a natural language specification. The translation from informal to formal remains a challenging, time-consuming process that is difficult to audit for correctness. This paper shows that it is possible to build support for specifications written in expressive subsets of natural language, within existing proof assistants, consistent with the principles used to establish trust and auditability in proof assistants themselves. We implement a means to provide specifications in a modularly extensible formal subset of English, and have them automatically translated into formal claims, entirely within the Lean proof assistant. Our approach is extensible (placing no permanent restrictions on grammatical structure), modular (allowing information about new words to be distributed alongside libraries), and produces proof certificates explaining how each word was interpreted and how the sentence's structure was used to compute the meaning. We apply our prototype to the translation of various English descriptions of formal specifications from a popular textbook into Lean formalizations; all can be translated correctly using a modest lexicon, with only minor modifications related to lexicon size.
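To make the target of this translation concrete, the following minimal Lean 4 sketch pairs an English specification with a formal claim of the kind it could be translated to. The sentence, theorem name, and proof are illustrative only; the prototype's actual surface syntax, lexicon, and certificate output are not shown in this abstract.

    -- Informal specification (English):
    --   "every natural number is less than or equal to its successor"
    -- One formal claim such a sentence could be translated to:
    theorem every_nat_le_succ : ∀ n : Nat, n ≤ n + 1 := by
      intro n
      exact Nat.le_succ n

In the approach described above, the translation would additionally yield a proof certificate recording how each word was interpreted and how the sentence's structure determined the resulting proposition.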
Award ID(s):
2220991
NSF-PAR ID:
10469926
Author(s) / Creator(s):
Publisher / Repository:
ACM
Date Published:
Page Range / eLocation ID:
50 to 70
Subject(s) / Keyword(s):
["Formal specification","natural language programming","natural language specification","categorial grammars","proof assistants"]
Format(s):
Medium: X
Location:
Cascais Portugal
Sponsoring Org:
National Science Foundation
More Like this
  1. Piotr Faliszewski; Viviana Mascardi (Eds.)
    Recent success in reinforcement learning (RL) has brought renewed attention to the design of reward functions by which agent behavior is reinforced or deterred. Manually designing reward functions is tedious and error-prone. An alternative approach is to specify a formal, unambiguous logic requirement, which is automatically translated into a reward function to be learned from. Omega-regular languages, of which Linear Temporal Logic (LTL) is a subset, are a natural choice for specifying such requirements due to their use in verification and synthesis. However, current techniques based on omega-regular languages learn in an episodic manner whereby the environment is periodically reset to an initial state during learning. In some settings, this assumption is challenging or impossible to satisfy. Instead, in the continuing setting the agent explores the environment without resets over a single lifetime. This is a more natural setting for reasoning about omega-regular specifications defined over infinite traces of agent behavior. Optimizing the average reward instead of the usual discounted reward is more natural in this case due to the infinite-horizon objective that poses challenges to the convergence of discounted RL solutions. We restrict our attention to the omega-regular languages which correspond to absolute liveness specifications. These specifications cannot be invalidated by any finite prefix of agent behavior, in accordance with the spirit of a continuing problem. We propose a translation from absolute liveness omega-regular languages to an average reward objective for RL. Our reduction can be done on-the-fly, without full knowledge of the environment, thereby enabling the use of model-free RL algorithms. Additionally, we propose a reward structure that enables RL without episodic resetting in communicating MDPs, unlike previous approaches. We demonstrate empirically with various benchmarks that our proposed method of using average reward RL for continuing tasks defined by omega-regular specifications is more effective than competing approaches that leverage discounted RL. 
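    As a worked illustration of the absolute-liveness restriction (our example, in standard LTL notation; it is not taken from the paper):

        F goal  ("eventually reach the goal") is absolutely live: F goal ≡ F (F goal),
                so prepending any finite behavior to a satisfying run still satisfies it,
                and no finite prefix can falsify it.
        G safe  ("always remain safe") is not absolutely live: a single unsafe step
                falsifies it, so such safety properties lie outside this fragment.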
  2. A growing body of work studies how to answer a question or verify a claim by generating a natural language “proof”: a chain of deductive inferences yielding the answer based on a set of premises. However, these methods can only make sound deductions when they follow from evidence that is given. We propose a new system that can handle the underspecified setting where not all premises are stated at the outset; that is, additional assumptions need to be materialized to prove a claim. By using a natural language generation model to abductively infer a premise given another premise and a conclusion, we can impute missing pieces of evidence needed for the conclusion to be true. Our system searches over two fringes in a bidirectional fashion, interleaving deductive (forward-chaining) and abductive (backward-chaining) generation steps. We sample multiple possible outputs for each step to achieve coverage of the search space, at the same time ensuring correctness by filtering low-quality generations with a round-trip validation procedure. Results on a modified version of the EntailmentBank dataset and a new dataset called “Everyday Norms: Why Not?” show that abductive generation with validation can recover premises across in- and out-of-domain settings.
  3. Template-Coq is a plugin for Coq, originally implemented by Malecha, which provides a reifier for Coq terms and global declarations, as represented in the Coq kernel, as well as a denotation command. Initially, it was developed for the purpose of writing functions on Coq’s AST in Gallina. Recently, it was used in the CertiCoq certified compiler project, as its front-end language, to derive parametricity properties, and to extract Coq terms to a CBV λ-calculus. However, the syntax lacked semantics, be it typing semantics or operational semantics, which should reflect, as formal specifications in Coq, the semantics of Coq’s type theory itself. The tool was also rather bare bones, providing only rudimentary quoting and unquoting commands. We generalize it to handle the entire Calculus of Inductive Constructions (CIC), as implemented by Coq, including the kernel’s declaration structures for definitions and inductives, and implement a monad for general manipulation of Coq’s logical environment. We demonstrate how this setup allows Coq users to define many kinds of general purpose plugins, whose correctness can be readily proved in the system itself, and that can be run efficiently after extraction. We give a few examples of implemented plugins, including a parametricity translation. We also advocate the use of Template-Coq as a foundation for higher-level tools.
  4. Embedding computation in biochemical environments incompatible with traditional electronics is expected to have a wide-ranging impact in synthetic biology, medicine, nanofabrication, and other fields. Natural biochemical systems are typically modeled by chemical reaction networks (CRNs) which can also be used as a specification language for synthetic chemical computation. In this paper, we identify a syntactically checkable class of CRNs called noncompetitive (NC) whose equilibria are absolutely robust to reaction rates and kinetic rate law, because their behavior is captured solely by their stoichiometric structure. In spite of the inherently parallel nature of chemistry, the robustness property allows for programming as if each reaction applies sequentially. We also present a technique to program NC-CRNs using well-founded deep learning methods, showing a translation procedure from rectified linear unit (ReLU) neural networks to NC-CRNs. In the case of binary weight ReLU networks, our translation procedure is surprisingly tight in the sense that a single bimolecular reaction corresponds to a single ReLU node and vice versa. This compactness argues that neural networks may be a fitting paradigm for programming rate-independent chemical computation. As proof of principle, we demonstrate our scheme with numerical simulations of CRNs translated from neural networks trained on traditional machine learning datasets, as well as tasks better aligned with potential biological applications including virus detection and spatial pattern formation. 
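    To make the translation concrete, here is a standard rate-independent construction for a single ReLU unit (our illustration; the paper's exact encoding may differ). To compute y = max(0, a − b) from initial counts a of species A and b of species B, with Y initially absent, use the reactions

        A → Y         (convert every A into an output Y)
        B + Y → ∅     (a single bimolecular reaction cancels one Y against each B)

    Every A is eventually converted to Y, and exactly min(a, b) cancellations occur, so the final count of Y is max(0, a − b) regardless of reaction rates or kinetic rate law, which is consistent with the correspondence the abstract describes for binary-weight networks.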
  5. As technical computing software, such as MATLAB and SciPy, has gained popularity, ecosystems of interdependent software solutions and communities have formed around these technologies. The development and maintenance of these technical computing ecosystems requires expertise in both software engineering and the underlying technical domain. The inherently interdisciplinary nature of these ecosystems presents unique challenges and opportunities that shape software development practices.

    Proof assistants, a type of technical computing software, aid users in the creation of formal proofs. In order to examine the influence of the underlying technical domain --- mathematics --- on the development of proof assistant ecosystems, we mined participant activity data from the code repositories and social channels of three popular proof assistants: Lean, Coq, and Isabelle. Despite having a shared technical domain, we found little cross-pollination between contributors to the proof assistants. Additionally, we found that most long-term developers focused solely on technical work and did not participate in official social channels. We also found that proof assistant developers specialized into technical subfields. However, the proportion of specialists varied between ecosystems. We did not find evidence that these specialties contributed to fractures within the ecosystems. We discuss the implications of these results on the long-term health and sustainability of proof assistant ecosystems.

    This artifact contains the scripts and dataset that support an MSR 2024 article.

     