skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.
Attention:The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 7:00 AM ET to 7:30 AM ET on Friday, April 24 due to maintenance. We apologize for the inconvenience.


Title: Grove: A Bidirectionally Typed Collaborative Structure Editor Calculus
Version control systems typically rely on apatch language, heuristicpatch synthesis algorithmslike diff, andthree-way merge algorithms. Standard patch languages and merge algorithms often fail to identify conflicts correctly when there are multiple edits to one line of code or code is relocated. This paper introduces Grove, a collaborative structure editor calculus that eliminates patch synthesis and three-way merge algorithms entirely. Instead, patches are derived directly from the log of the developer’s edit actions and all edits commute, i.e. the repository state forms a commutative replicated data type (CmRDT). To handle conflicts that can arise due to code relocation, the core datatype in Grove is a labeled directed multi-graph with uniquely identified vertices and edges. All edits amount to edge insertion and deletion, with deletion being permanent. To support tree-based editing, we define a decomposition from graphs intogroves, which are a set of syntax trees with conflicts—including local, relocation, and unicyclic relocation conflicts—represented explicitly using holes and references between trees. Finally, we define a type error localization system for groves that enjoys atotalityproperty, i.e. all editor states in Grove are statically meaningful, so developers can use standard editor services while working to resolve these explicitly represented conflicts. The static semantics is defined as a bidirectional marking system in line with recent work, with gradual typing employed to handle situations where errors and conflicts prevent type determination. We then layer on a unification-based type inference system to opportunistically fill type holes and fail gracefully when no solution exists. We mechanize the metatheory of Grove using the Agda theorem prover. We implement these ideas as theGrove Workbench, which generates the necessary data structures and algorithms in OCaml given a syntax tree specification.  more » « less
Award ID(s):
2238744
PAR ID:
10593045
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
Proceedings of the ACM on Programming Languages
Date Published:
Journal Name:
Proceedings of the ACM on Programming Languages
Volume:
9
Issue:
POPL
ISSN:
2475-1421
Page Range / eLocation ID:
2176 to 2204
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In collaborative software development, programmers create software branches to add features and fix bugs tentatively, and then merge branches to integrate edits. When edits from different branches textually overlap (i.e.,textual conflicts) or lead to compilation and runtime errors (i.e.,build and test conflicts), it is challenging for developers to remove such conflicts. Prior work proposed tools to detect and solve conflicts. They investigate how conflicts relate to code smells and the software development process. However, many questions are still not fully investigated, such as what types of conflicts exist in real-world applications and how developers or tools handle them. For this article, we used automated textual merge, compilation, and testing to reveal three types of conflicts in 208 open-source repositories: textual conflicts, build conflicts (i.e., conflicts causing build errors), and test conflicts (i.e., conflicts triggering test failures). We manually inspected 538 conflicts and their resolutions to characterize merge conflicts from different angles. Our analysis revealed three interesting phenomena. First, higher-order conflicts (i.e., build and test conflicts) are harder to detect and resolve, while existing tools mainly focus on textual conflicts. Second, developers manually resolved most higher-order conflicts by applying similar edits to multiple program locations; their conflict resolutions share common editing patterns implying great opportunities for future tool design. Third, developers resolved 64% of true textual conflicts by keeping complete edits from either a left or right branch. Unlike prior studies, our research for the first time thoroughly characterizes three types of conflicts, with a special focus on higher-order conflicts and limitations of existing tool design. Our work will shed light on future research of software merge. 
    more » « less
  2. In software merge, the edits from different branches can textually overlap (i.e., textual conflicts) or cause build and test errors (i.e., build and test conflicts), jeopardizing programmer productivity and software quality. Existing tools primarily focus on textual conflicts; few tools detect higher-order conflicts (i.e., build and test conflicts). However, existing detectors of build conflicts are limited. Due to their heavy usage of automatic build, current detectors (e.g., Crystal) only report build errors instead of identifying the root causes; developers have to manually locate conflicting edits. These detectors only help when the branches-to-merge have no textual conflict. We present a new static analysis-based approach Bucond (“build conflict detector”). Given three code versions in a merging scenario: base b, left l, and right r, Bucond models each version as a graph, and compares graphs to extract entity-related edits (e.g., class renaming) in l and r. We believe that build conflicts occur when certain edits are co-applied to related entities between branches. Bucond realizes this insight via pattern matching to identify any cross-branch edit combination that can trigger build conflicts (e.g., one branch adds a reference to field F while the other branch removes F). We systematically explored and devised 57 patterns, covering 97% of the build conflicts in our experiments. Our evaluation shows Bucond to complement build-based detectors, as it (1) detects conflicts with 100% precision and 88%–100% recall, (2) locates conflicting edits, and (3) works well when those detectors do not. 
    more » « less
  3. Abstract We study the ring of regular functions on the space of planar electrical networks, which we coin thegrove algebra. This algebra is an electrical analog of the Plücker ring studied classically in invariant theory. We develop the combinatorics of double groves to study the grove algebra, and find a quadratic Gröbner basis for the grove ideal. 
    more » « less
  4. Automatically transforming programs is hard, yet critical for automated program refactoring, rewriting, and repair. Multi-language syntax transformation is especially hard due to heterogeneous representations in syntax, parse trees, and abstract syntax trees (ASTs). Our insight is that the problem can be decomposed such that (1) a common grammar expresses the central context-free language (CFL) properties shared by many contemporary languages and (2) open extension points in the grammar allow customizing syntax (e.g., for balanced delimiters) and hooks in smaller parsers to handle language-specific syntax (e.g., for comments). Our key contribution operationalizes this decomposition using a Parser Parser combinator (PPC), a mechanism that generates parsers for matching syntactic fragments in source code by parsing declarative user-supplied templates. This allows our approach to detach from translating input programs to any particular abstract syntax tree representation, and lifts syntax rewriting to a modularly-defined parsing problem. A notable effect is that we skirt the complexity and burden of defining additional translation layers between concrete user input templates and an underlying abstract syntax representation. We demonstrate that these ideas admit efficient and declarative rewrite templates across 12 languages, and validate effectiveness of our approach by producing correct and desirable lightweight transformations on popular real-world projects (over 50 syntactic changes produced by our approach have been merged into 40+). Our declarative rewrite patterns require an order of magnitude less code compared to analog implementations in existing, language-specific tools. 
    more » « less
  5. Extending prior work in Graf (2018, 2020, 2022c), I show that movement is tier-based strictly local (TSL) even if one analyzes it as a transformation, i.e. a tree transduction from derivation trees to output trees. I define input strictly local (ISL) tree-to-tree transductions with (lexical) TSL tests as a tier-based extension of ISL tree-to-tree transductions. TSL tests allow us to attach each mover to all its landing sites. In general, this class of transductions fails to attach each mover to its final landing site to the exclusion of all its intermediate landing sites, which is crucial for producing output trees with the correct string yield. The problem is avoided, though, if syntax enforces a variant of the Ban on Improper Movement. Subregular complexity thus provides a novel motivation for core restrictions on movement while also shedding new light on the choice between copies and traces in syntax. 
    more » « less