Title: Operation-Based Refactoring-Aware Merging: An Empirical Evaluation
Dealing with merge conflicts in version control systems is a challenging task for software developers. Resolving merge conflicts is a time-consuming and error-prone process, which distracts developers from important tasks. Recent work shows that refactorings are often involved in merge conflicts and that refactoring-related conflicts tend to be larger, making them harder to resolve. In the literature, there are two refactoring-aware merging techniques that claim to automatically resolve refactoring-related conflicts; however, these two techniques have never been empirically compared. In this paper, we present RefMerge, a rejuvenated Java-based design and implementation of the first technique, which is an operation-based refactoring-aware merging algorithm. We compare RefMerge to Git and the state-of-the-art graph-based refactoring-aware merging tool, IntelliMerge, on 2,001 merge scenarios with refactoring-related conflicts from 20 open-source projects. We find that RefMerge resolves or reduces conflicts in 497 (25%) merge scenarios while increasing conflicting LOC in only 214 (11%) scenarios. On the other hand, we find that IntelliMerge resolves or reduces conflicts in 478 (24%) merge scenarios but increases conflicting LOC in 597 (30%) merge scenarios. We additionally conduct a qualitative analysis of the differences between the three merging algorithms and provide insights into the strengths and weaknesses of each tool. We find that while IntelliMerge does well with ordering and formatting conflicts, it struggles with class-level refactorings and scenarios with several refactorings. On the other hand, RefMerge is resilient to the number of refactorings in a merge scenario, but we find that RefMerge introduces conflicts when inverting move-related refactorings.
Award ID(s):
2213767
PAR ID:
10471900
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE Transactions on Software Engineering
Volume:
49
Issue:
4
ISSN:
0098-5589
Page Range / eLocation ID:
2698 to 2721
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In software merge, the edits from different branches can textually overlap (i.e., textual conflicts) or cause build and test errors (i.e., build and test conflicts), jeopardizing programmer productivity and software quality. Existing tools primarily focus on textual conflicts; few tools detect higher-order conflicts (i.e., build and test conflicts). However, existing detectors of build conflicts are limited. Due to their heavy usage of automatic build, current detectors (e.g., Crystal) only report build errors instead of identifying the root causes; developers have to manually locate conflicting edits. These detectors only help when the branches-to-merge have no textual conflict. We present a new static analysis-based approach Bucond (“build conflict detector”). Given three code versions in a merging scenario: base b, left l, and right r, Bucond models each version as a graph, and compares graphs to extract entity-related edits (e.g., class renaming) in l and r. We believe that build conflicts occur when certain edits are co-applied to related entities between branches. Bucond realizes this insight via pattern matching to identify any cross-branch edit combination that can trigger build conflicts (e.g., one branch adds a reference to field F while the other branch removes F). We systematically explored and devised 57 patterns, covering 97% of the build conflicts in our experiments. Our evaluation shows Bucond to complement build-based detectors, as it (1) detects conflicts with 100% precision and 88%–100% recall, (2) locates conflicting edits, and (3) works well when those detectors do not. 
  2. Background: Code refactoring is widely recognized as an essential software engineering practice to improve the understandability and maintainability of the source code. The Extract Method refactoring is considered the “Swiss army knife” of refactorings, as developers often apply it to improve their code quality, e.g., decompose long code fragments, reduce code complexity, eliminate duplicated code, etc. In recent years, several studies attempted to recommend Extract Method refactorings allowing the collection, analysis, and revelation of actionable data-driven insights about refactoring practices within software projects. Aim: In this paper, we aim at reviewing the current body of knowledge on existing Extract Method refactoring research and explore their limitations and potential improvement opportunities for future research efforts. That is, Extract Method is considered one of the most widely-used refactorings, but difficult to apply in practice as it involves low-level code changes such as statements, variables, parameters, return types, etc. Hence, researchers and practitioners begin to be aware of the state-of-the-art and identify new research opportunities in this context. Method: We review the body of knowledge related to Extract Method refactoring in the form of a systematic literature review (SLR). After compiling an initial pool of 1,367 papers, we conducted a systematic selection and our final pool included 83 primary studies. We define three sets of research questions and systematically develop and refine a classification schema based on several criteria including their methodology, applicability, and degree of automation. Results: The results construct a catalog of 83 Extract Method approaches indicating that several techniques have been proposed in the literature. Our results show that: (i) 38.6% of Extract Method refactoring studies primarily focus on addressing code clones; (ii) several of the Extract Method tools incorporate the developer's involvement in the decision-making process when applying the method extraction; and (iii) the existing benchmarks are heterogeneous and do not contain the same type of information, making standardizing them for the purpose of benchmarking difficult. Conclusions: Our study serves as an “index” to the body of knowledge in this area for researchers and practitioners in determining the Extract Method refactoring approach that is most appropriate for their needs. Our findings also empower the community with information to guide the future development of refactoring tools.
  3. In collaborative software development, programmers create software branches to add features and fix bugs tentatively, and then merge branches to integrate edits. When edits from different branches textually overlap (i.e., textual conflicts) or lead to compilation and runtime errors (i.e., build and test conflicts), it is challenging for developers to remove such conflicts. Prior work proposed tools to detect and solve conflicts. They investigate how conflicts relate to code smells and the software development process. However, many questions are still not fully investigated, such as what types of conflicts exist in real-world applications and how developers or tools handle them. For this article, we used automated textual merge, compilation, and testing to reveal three types of conflicts in 208 open-source repositories: textual conflicts, build conflicts (i.e., conflicts causing build errors), and test conflicts (i.e., conflicts triggering test failures). We manually inspected 538 conflicts and their resolutions to characterize merge conflicts from different angles. Our analysis revealed three interesting phenomena. First, higher-order conflicts (i.e., build and test conflicts) are harder to detect and resolve, while existing tools mainly focus on textual conflicts. Second, developers manually resolved most higher-order conflicts by applying similar edits to multiple program locations; their conflict resolutions share common editing patterns implying great opportunities for future tool design. Third, developers resolved 64% of true textual conflicts by keeping complete edits from either a left or right branch. Unlike prior studies, our research for the first time thoroughly characterizes three types of conflicts, with a special focus on higher-order conflicts and limitations of existing tool design. Our work will shed light on future research of software merge.
  4. Block-based programming has been overwhelmingly successful in revitalizing introductory computing education and in facilitating end-user development. However, poor code quality makes block-based programs hard to understand, modify, and reuse, thus hurting the educational and productivity effectiveness of blocks. There is great potential benefit in empowering programmers in this domain to systematically improve the code quality of their projects. Refactoring--improving code quality while preserving its semantics--has been widely adopted in traditional software development. In this work, we introduce refactoring to Scratch. We define four new Scratch refactorings: Extract Custom Block, Extract Parent Sprite, Extract Constant, and Reduce Variable Scope. To automate the application of these refactorings, we enhance the Scratch programming environment with powerful program analysis and transformation routines. To evaluate the utility of these refactorings, we apply them to remove the code smells detected in a representative dataset of 448 Scratch projects. We also conduct a between-subjects user study with 24 participants to assess how our refactoring tools impact programmers. Our results show that refactoring improves the subjects' code quality metrics, while our refactoring tools help motivate programmers to improve code quality. 
  5. Conflict-free replicated data types (CRDTs) are a promising tool for designing scalable, coordination-free distributed systems. However, constructing correct CRDTs is difficult, posing a challenge for even seasoned developers. As a result, CRDT development is still largely the domain of academics, with new designs often awaiting peer review and a manual proof of correctness. In this paper, we present Katara, a program synthesis-based system that takes sequential data type implementations and automatically synthesizes verified CRDT designs from them. Key to this process is a new formal definition of CRDT correctness that combines a reference sequential type with a lightweight ordering constraint that resolves conflicts between non-commutative operations. Our process follows the tradition of work in verified lifting, including an encoding of correctness into SMT logic using synthesized inductive invariants and hand-crafted grammars for the CRDT state and runtime. Katara is able to automatically synthesize CRDTs for a wide variety of scenarios, from reproducing classic CRDTs to synthesizing novel designs based on specifications in existing literature. Crucially, our synthesized CRDTs are fully, automatically verified, eliminating entire classes of common errors and reducing the process of producing a new CRDT from a painstaking paper proof of correctness to a lightweight specification. 