skip to main content


Title: RbSyn: type- and effect-guided program synthesis
In recent years, researchers have explored component-based synthesis, which aims to automatically construct programs that operate by composing calls to existing APIs. However, prior work has not considered efficient synthesis of methods with side effects, e.g., web app methods that update a database. In this paper, we introduce RbSyn, a novel type- and effect-guided synthesis tool for Ruby. An RbSyn synthesis goal is specified as the type for the target method and a series of test cases it must pass. RbSyn works by recursively generating well-typed candidate method bodies whose write effects match the read effects of the test case assertions. After finding a set of candidates that separately satisfy each test, RbSyn synthesizes a solution that branches to execute the correct candidate code under the appropriate conditions. We formalize RbSyn on a core, object-oriented language λsyn and describe how the key ideas of the model are scaled-up in our implementation for Ruby. We evaluated RbSyn on 19 benchmarks, 12 of which come from popular, open-source Ruby apps. We found that RbSyn synthesizes correct solutions for all benchmarks, with 15 benchmarks synthesizing in under 9 seconds, while the slowest benchmark takes 83 seconds. Using observed reads to guide synthesize is effective: using type-guidance alone times out on 10 of 12 app benchmarks. We also found that using less precise effect annotations leads to worse synthesis performance. In summary, we believe type- and effect-guided synthesis is an important step forward in synthesis of effectful methods from test cases.  more » « less
Award ID(s):
1900563 1846350
NSF-PAR ID:
10315944
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We consider the problem of type-directed component-based synthesis where, given a set of (typed) components and a query type , the goal is to synthesize a term that inhabits the query. Classical approaches based on proof search in intuitionistic logics do not scale up to the standard libraries of modern languages, which span hundreds or thousands of components. Recent graph reachability based methods proposed for Java do scale, but only apply to monomorphic data and components: polymorphic data and components infinitely explode the size of the graph that must be searched, rendering synthesis intractable. We introduce type-guided abstraction refinement (TYGAR), a new approach for scalable type-directed synthesis over polymorphic datatypes and components. Our key insight is that we can overcome the explosion by building a graph over abstract types which represent a potentially unbounded set of concrete types. We show how to use graph reachability to search for candidate terms over abstract types, and introduce a new algorithm that uses proofs of untypeability of ill-typed candidates to iteratively refine the abstraction until a well-typed result is found. We have implemented TYGAR in H+, a tool that takes as input a set of Haskell libraries and a query type, and returns a Haskell term that uses functions from the provided libraries to implement the query type. Our support for polymorphism allows H+ to work with higher-order functions and type classes, and enables more precise queries due to parametricity. We have evaluated H+ on 44 queries using a set of popular Haskell libraries with a total of 291 components. H+ returns an interesting solution within the first five results for 32 out of 44 queries. Our results show that TYGAR allows H+ to rapidly return well-typed terms, with the median time to first solution of just 1.4 seconds. Moreover, we observe that gains from iterative refinement over exhaustive enumeration are more pronounced on harder queries. 
    more » « less
  2. While mixed integer linear programming (MILP) solvers are routinely used to solve a wide range of important science and engineering problems, it remains a challenging task for end users to write correct and efficient MILP constraints, especially for problems specified using the inherently non-linear Boolean logic operations. To overcome this challenge, we propose a syntax guided synthesis (SyGuS) method capable of generating high-quality MILP constraints from the specifications expressed using arbitrary combinations of Boolean logic operations. At the center of our method is an extensible domain specification language (DSL) whose expressiveness may be improved by adding new integer variables as decision variables, together with an iterative procedure for synthesizing linear constraints from non-linear Boolean logic operations using these integer variables. To make the synthesis method efficient, we also propose an over-approximation technique for soundly proving the correctness of the synthesized linear constraints, and an under-approximation technique for safely pruning away the incorrect constraints. We have implemented and evaluated the method on a wide range of benchmark specifications from statistics, machine learning, and data science applications. The experimental results show that the method is efficient in handling these benchmarks, and the quality of the synthesized MILP constraints is close to, or higher than, that of manually-written constraints in terms of both compactness and solving time.

     
    more » « less
  3. The test suites of an Android app should take advantage of different types of tests including end-to-end tests, which validate user flows, and unit tests, which provide focused executions for debugging. App developers have two main options when creating unit tests: create unit tests that run on a device (either physical or emulated) or create unit tests that run on a development machine’s Java Virtual Machine (JVM). Unit tests that run on a device are not really focused, as they use the full implementation of the Android framework. Moreover, they are fairly slow to execute, requiring the Android system as the runtime. Unit tests that run on the JVM, instead, are more focused and run more efficiently but require developers to suitably handle the coupling between the app under test and the Android framework. To help developers in creating focused unit tests that run on the JVM, we propose a novel technique called ARTISAN based on the idea of test carving. The technique (i) traces the app execution during end-to-end testing on Android devices, (ii) identifies focal methods to test, (iii) carves the necessary preconditions for testing those methods, (iv) creates suitable test doubles for the Android framework, and (v) synthesizes executable unit tests that can run on the JVM. We evaluated ARTISAN using 152 end-to-end tests from five apps and observed that ARTISAN can generate unit tests that cover a significant portion of the code exercised by the end-to-end tests (i.e., 45% of the starting statement coverage on average) and does so in a few minutes. 
    more » « less
  4. INTRODUCTION: In practice, the use of a whip stitch versus a locking stitch in anterior cruciate ligament (ACL) graft preparation is based on surgeon preference. Those who prefer efficiency and shorter stitch time typically choose a whip stitch, while those who require improved biomechanical properties select a locking stitch, the gold standard of which is the Krackow method. The purpose of this study was to evaluate a novel suture needle design that can be used to perform two commonly used stitch methods, a whip stitch, and a locking stitch, by comparing the speed of graft preparation and biomechanical properties. It was hypothesized that adding a locking mechanism to the whip stitch would improve biomechanical performance but would also require more time to complete due to additional steps required for the locking technique. METHODS: Graft preparation was performed by four orthopaedic surgeons of different training levels where User 1 and User 2 were both attendings and User’s 3 and 4 were both fellows. A total of 24 matched pair cadaveric knees were dissected and a total of 48 semitendinosus tendons were harvested. All grafts were standardized to the same size. Tendons were randomly divided into 4 groups (12 tendons per group) such that each User performed analogous stitch on matched pair: Group 1, User 1 and User 3 performed whip stitches; Group 2, User 1 and User 3 performed locking stitches; Group 3, User 2 and User 4 performed whip stitches; Group 4, User 2 and User 4 performed locking stitches. For instrumentation, the two ends of tendon grafts were clamped to a preparation stand. A skin marker was used to mark five evenly spaced points, 0.5 cm apart, as a guide to create a 5-stitch series. The stitches were performed with EasyWhip, a novel two-part suture needle which allows one to do both a traditional whip stitch and a locking whip stitch, referred to as WhipLock (Figure 1). The speed for graft preparation was timed for each User. Biomechanical testing was performed using a servohydraulic testing machine (MTS Bionix) equipped with a 5kN load cell (Figure 2). A standardized length of tendon, 10 cm, was coupled to the MTS actuator by passing it through a cryoclamp cooled by dry ice to a temperature of -5°C. All testing samples were pre-conditioned to normalize viscoelastic effects and testing variability through application of cyclical loading to 25-100 N for three cycles. The samples were then held at 89 N for 15 minutes. Thereafter, the samples were loaded to 50-200 N for 500 cycles at 1 Hz. If samples survived, they were ramped to failure at 20 mm/min. Displacement and force data was collected throughout testing. Metrics of interest were peak-to-peak displacement (mm), stiffness (N/mm), ultimate failure load (N) and failure mode. Data are presented as averages and standard deviations. A Wilcoxon signed-rank test was used to evaluate the groups for time to complete stitch and biomechanical performance. Statistical significance was set at P = .05. RESULTS SECTION: In Group 1, the time to complete the whip stitch was not significantly different between User 1 and User 3, where the average completion time was 1 min 13 sec. Similarly, there were no differences between Users when performing the WhipLock (Group 2) with an average time of 1 min 49 sec. In Group 3 (whip stitch), User 2 took 1 min 48 sec to complete the whip stitch, whereas User 4 took 1 min 25 sec (p=.033). The time to complete the WhipLock stitch (Group 4) was significantly different, where User 2 took 3 min and 44 sec, while User 4 only took 2 min 3 sec (p=.002). Overall, the whip stitch took on average 1 min 25 sec whereas the WhipLock took 2 min 20 sec (p=.001). For whip stitch constructs, no differences were found between Users and all stitches were biomechanically equivalent. Correspondingly, for WhipLock stitches, no differences were found between Users and all suture constructs were likewise biomechanically equivalent. Averages for peak-to-peak displacement (mm), stiffness (N/mm), and ultimate failure load (N) are presented in Table 1. Moreover, when the two stitch methods were compared, the WhipLock constructs significantly increased stiffness by 25% (p <.001), increased ultimate failure load by 35% (p<.001) and reduced peak-to-peak displacement by 55% (p=.001). The common mode of failure for grafts with whip stitch failed by suture pullout from tendon (18/24), where a few instances occurred by suture breakage (6/24). Tendon grafts with WhipLock stitch commonly failed by suture breakage (22/24), where two instances of combined tendon tear and suture breakage were noted. DISCUSSION: The WhipLock stitch significantly increased average construct stiffness and ultimate failure load, while significantly reducing the peak-to- peak displacement compared to the whip stitch. These added strength benefits of the WhipLock stitch took 55 seconds more to complete than the whip stitch. No statistically significant difference in biomechanical performance was found between the Users. Data suggests equivalent stitch performance regardless of the time to complete stitch and surgeon training level. SIGNIFICANCE/CLINICAL RELEVANCE: Clinically, having a suture needle device available which can be used to easily perform different constructs including one with significant strength advantages regardless of level of experience is of benefit. 
    more » « less
  5. This article presents resource-guided synthesis, a technique for synthesizing recursive programs that satisfy both a functional specification and a symbolic resource bound. The technique is type-directed and rests upon a novel type system that combines polymorphic refinement types with potential annotations of automatic amortized resource analysis. The type system enables efficient constraint-based type checking and can express precise refinement-based resource bounds. The proof of type soundness shows that synthesized programs are correct by construction. By tightly integrating program exploration and type checking, the synthesizer can leverage the user-provided resource bound to guide the search, eagerly rejecting incomplete programs that consume too many resources. An implementation in the resource-guided synthesizer ReSyn is used to evaluate the technique on a range of recursive data structure manipulations. The experiments show that ReSyn synthesizes programs that are asymptotically more efficient than those generated by a resource-agnostic synthesizer. Moreover, synthesis with ReSyn is faster than a naive combination of synthesis and resource analysis. ReSyn is also able to generate implementations that have a constant resource consumption for fixed input sizes, which can be used to mitigate side-channel attacks. 
    more » « less