skip to main content


Title: LooPy: interactive program synthesis with control structures
One vision for program synthesis, and specifically for programming by example (PBE), is an interactive programmer's assistant, integrated into the development environment. To make program synthesis practical for interactive use, prior work on Small-Step Live PBE has proposed to limit the scope of synthesis to small code snippets, and enable the users to provide local specifications for those snippets. This paradigm, however, does not work well in the presence of loops. We present LooPy, a synthesizer integrated into a live programming environment, which extends Small-Step Live PBE to work inside loops and scales it up to synthesize larger code snippets, while remaining fast enough for interactive use. To allow users to effectively provide examples at various loop iterations, even when the loop body is incomplete, LooPy makes use of live execution , a technique that leverages the programmer as an oracle to step over incomplete parts of the loop. To enable synthesis of loop bodies at interactive speeds, LooPy introduces Intermediate State Graph , a new data structure, which compactly represents a large space of code snippets composed of multiple assignment statements and conditionals. We evaluate LooPy empirically using benchmarks from competitive programming and previous synthesizers, and show that it can solve a wide variety of synthesis tasks at interactive speeds. We also perform a small qualitative user study which shows that LooPy's block-level specifications are easy for programmers to provide.  more » « less
Award ID(s):
2107397
NSF-PAR ID:
10357425
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Proceedings of the ACM on Programming Languages
Volume:
5
Issue:
OOPSLA
ISSN:
2475-1421
Page Range / eLocation ID:
1 to 29
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Programming-by-example (PBE) is a synthesis paradigm that allows users to generate functions by simply providing input-output examples. While a promising interaction paradigm, synthesis is still too slow for realtime interaction and more widespread adoption. Existing approaches to PBE synthesis have used automated reasoning tools, such as SMT solvers, as well as works applying machine learning techniques. At its core, the automated reasoning approach relies on highly domain specific knowledge of programming languages. On the other hand, the machine learning approaches utilize the fact that when working with program code, it is possible to generate arbitrarily large training datasets. In this work, we propose a system for using machine learning in tandem with automated reasoning techniques to solve Syntax Guided Synthesis (SyGuS) style PBE problems. By preprocessing SyGuS PBE problems with a neural network, we can use a data driven approach to reduce the size of the search space, then allow automated reasoning-based solvers to more quickly find a solution analytically. Our system is able to run atop existing SyGuS PBE synthesis tools, decreasing the runtime of the winner of the 2019 SyGuS Competition for the PBE Strings track by 47.65% to outperform all of the competing tools. 
    more » « less
  2. Facility with foundational practices in computer science (CS) is increasingly recognized as critical for the 21st century workforce. Developing this capacity and broadening participation in CS disciplines will require learning experiences that can engage a larger and more diverse student population (Margolis et al., 2008). One promising approach involves including CS concepts and practices in required subjects like science. Yet, research on the scalability of educational innovations consistently demonstrates that their successful uptake in formal classrooms depends on teachers’ perceived alignment of the innovations with their goals and expectations for student learning, as well as with the specific needs of their school context and culture (Blumenfeld et al., 2000; Penuel et al., 2007; Bernstein et al., 2016). Research is nascent, however, about how exactly to achieve this alignment and thereby position integrated instructional models for uptake at scale. To contribute to this understanding, we are developing and studying two units for core middle school science classrooms, known as Coding Science Internships. The units are designed to support broader participation in CS, with a particular emphasis on females, by expanding students’ perception of the nature and value of coding. CS and science learning are integrated through a simulated internship model, in which students, as interns, apply science knowledge and use computer programming as a tool to address real-world problems. In one unit, students gain first-hand experience with sequences, loops, and conditionals as they program and debug an interactive scientific model of a coral reef ecosystem under threat. The second unit engages students in learning concepts related to data analysis and visualization, abstraction, and modularity as they code data visualizations using real EPA air quality data. A core goal for both units is to provide students experience with some of the increasingly prevalent ways that computer science is integrated into the work of scientists. 
    more » « less
  3. Programming by example allows users to create programs without coding, by simply specifying input and output pairs.We introduce the problem of digital signal processing programming by example (DSP-PBE), where users specify input and output wave files, and a tool automatically synthesizes a program that transforms the input to the output. This program can then be applied to new wave files, giving users a new way to interact with music and program code. We formally define the problem of DSP-PBE, and provide a first implementation of a solution that can handle synthesis over commutative filters. 
    more » « less
  4. Hirschfeld, Robert ; Pape, Tobias (Ed.)
    Program synthesis promises to help software developers with everyday tasks by generating code snippets automatically from input-output examples and other high-level specifications. The conventional wisdom is that a synthesizer must always satisfy the specification exactly. We conjecture that this all-or-nothing paradigm stands in the way of adopting program synthesis as a developer tool: in practice, the user-written specification often contains errors or is simply too hard for the synthesizer to solve within a reasonable time; in these cases, the user is left with a single over-fitted result or, more often than not, no result at all. In this paper we propose a new program synthesis paradigm we call best-effort program synthesis, where the synthesizer returns a ranked list of partially-valid results, i.e. programs that satisfy some part of the specification. To support this paradigm, we develop best-effort enumeration, a new synthesis algorithm that extends a popular program enumeration technique with the ability to accumulate and return multiple partially-valid results with minimal overhead. We implement this algorithm in a tool called BESTER, and evaluate it on 79 synthesis benchmarks from the literature. Contrary to the conventional wisdom, our evaluation shows that BESTER returns useful results even when the specification is flawed or too hard: i) for all benchmarks with an error in the specification, the top three BESTER results contain the correct solution, and ii) for most hard benchmarks, the top three results contain non-trivial fragments of the correct solution. We also performed an exploratory user study, which confirms our intuition that partially-valid results are useful: the study shows that programmers use the output of the synthesizer for comprehension and often incorporate it into their solutions. 
    more » « less
  5. Platforms like Stack Overflow and GitHub's gist system promote the sharing of ideas and programming techniques via the distribution of code snippets designed to illustrate particular tasks. Python, a popular and fast-growing programming language, sees heavy use on both sites, with nearly one million questions asked on Stack Overflow and 400 thousand public gists on GitHub. Unfortunately, around 75% of the Python example code shared through these sites cannot be directly executed. When run in a clean environment, over 50% of public Python gists fail due to an import error for a missing library. We present DockerizeMe, a technique for inferring the dependencies needed to execute a Python code snippet without import error. DockerizeMe starts with offline knowledge acquisition of the resources and dependencies for popular Python packages from the Python Package Index (PyPI). It then builds Docker specifications using a graph-based inference procedure. Our inference procedure resolves import errors in 892 out of nearly 3,000 gists from the Gistable dataset for which Gistable's baseline approach could not find and install all dependencies. 
    more » « less