Title: BatFix: Repairing language model-based transpilation
To keep up with changes in requirements, frameworks, and coding practices, software organizations might need to migrate code from one language to another. Source-to-source migration, or transpilation, is often a complex, manual process. Transpilation requires expertise in both the source and target languages, making it highly laborious and costly. Language models for code generation and transpilation are becoming increasingly popular. However, despite capturing code structure well, code generated by language models is often spurious and contains subtle problems. We propose BatFix, a novel approach that augments language models for transpilation by leveraging program repair and synthesis to fix the code generated by these models. BatFix takes as input the original program, the target program generated by the machine translation model, and a set of test cases, and outputs a repaired program that passes all test cases. Experimental results show that our approach is agnostic to language models and programming languages. BatFix can locate bugs spanning multiple lines and synthesize patches for syntax and semantic bugs in programs migrated from Java to C++ and from Python to C++ by multiple language models, including OpenAI's Codex.
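The abstract does not spell out the repair loop itself, so the sketch below only illustrates, under stated assumptions, the test-driven generate-and-validate workflow it describes: run the test suite against the model's translation, localize the failing region (which may span multiple lines), and search candidate patches until one passes every test. All helper names here (compile_fn, localize, candidate_patches) are hypothetical placeholders, not BatFix's actual interface.

```python
# Minimal sketch of a test-driven generate-and-validate repair loop in the
# spirit of the approach described above. Helper names are hypothetical.
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class TestCase:
    input: str
    expected: str


def run_tests(program: Callable[[str], str], tests: List[TestCase]) -> List[TestCase]:
    """Return the subset of tests that the candidate program fails."""
    failing = []
    for t in tests:
        try:
            if program(t.input) != t.expected:
                failing.append(t)
        except Exception:
            failing.append(t)
    return failing


def repair(source_lines: List[str],
           compile_fn: Callable[[List[str]], Callable[[str], str]],
           localize: Callable[[List[str], List[TestCase]], List[int]],
           candidate_patches: Callable[[List[str], List[int]], Iterable[List[str]]],
           tests: List[TestCase]) -> Optional[List[str]]:
    """Return a patched program that passes all tests, or None if none is found."""
    failing = run_tests(compile_fn(source_lines), tests)
    if not failing:
        return source_lines  # the model's translation already passes everything
    suspicious = localize(source_lines, failing)  # suspected region, possibly several lines
    for patched in candidate_patches(source_lines, suspicious):
        if not run_tests(compile_fn(patched), tests):
            return patched  # first candidate that passes every test
    return None  # search space exhausted without a fix
```

In practice the candidate-patch generator is where a real system's synthesis machinery would plug in; the loop above only accepts a patch once the full test suite passes.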
Award ID(s): 1750116
PAR ID: 10568538
Author(s) / Creator(s):
Publisher / Repository: ACM
Date Published:
Journal Name: ACM Transactions on Software Engineering and Methodology
Volume: 33
Issue: 6
ISSN: 1049-331X
Page Range / eLocation ID: 1 to 29
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as building blocks for research in programming languages and software engineering. However, the quality of code produced by a Code LLM varies significantly by programming language. Code LLMs produce impressive results on high-resource programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript), but struggle with low-resource languages that have limited training data available (e.g., OCaml, Racket, and several others). This paper presents an effective approach for boosting the performance of Code LLMs on low-resource languages using semi-synthetic data. Our approach, called MultiPL-T, generates high-quality datasets for low-resource languages, which can then be used to fine-tune any pretrained Code LLM. MultiPL-T translates training data from high-resource languages into training data for low-resource languages in the following way. 1) We use a Code LLM to synthesize unit tests for commented code from a high-resource source language, filtering out faulty tests and code with low test coverage. 2) We use a Code LLM to translate the code from the high-resource source language to a target low-resource language. This gives us a corpus of candidate training data in the target language, but many of these translations are wrong. 3) We use a lightweight compiler to compile the test cases generated in (1) from the source language to the target language, which allows us to filter out obviously wrong translations. The result is a training corpus in the target low-resource language where all items have been validated with test cases. We apply this approach to generate tens of thousands of new, validated training items for five low-resource languages: Julia, Lua, OCaml, R, and Racket, using Python as the source high-resource language. Furthermore, we use an open Code LLM (StarCoderBase) with open training data (The Stack), which allows us to decontaminate benchmarks, train models without violating licenses, and run experiments that could not otherwise be done. Using datasets generated with MultiPL-T, we present fine-tuned versions of StarCoderBase and Code Llama for Julia, Lua, OCaml, R, and Racket that outperform other fine-tunes of these base models on the natural-language-to-code task. We also present Racket fine-tunes for two very recent models, DeepSeek Coder and StarCoder2, to show that MultiPL-T continues to outperform other fine-tuning approaches for low-resource languages. The MultiPL-T approach is easy to apply to new languages, and is significantly more efficient and effective than alternatives such as training longer.
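As a rough illustration of the three-step pipeline above, the following sketch strings the steps together as a filtering loop. The callables it accepts (synthesize_tests, coverage, translate, compile_tests, passes) and the 0.9 coverage threshold are assumptions for illustration, not the paper's actual components.

```python
# Illustrative sketch of the MultiPL-T data-generation pipeline described above.
# All injected callables are hypothetical placeholders; the threshold is illustrative.
from typing import Callable, List, Tuple


def build_corpus(python_fns: List[str],
                 target_lang: str,
                 synthesize_tests: Callable[[str], List[str]],
                 coverage: Callable[[str, List[str]], float],
                 translate: Callable[[str, str], str],
                 compile_tests: Callable[[List[str], str], List[str]],
                 passes: Callable[[str, List[str]], bool]) -> List[Tuple[str, List[str]]]:
    corpus = []
    for fn in python_fns:
        # Step 1: synthesize unit tests; drop faulty or poorly covering ones.
        tests = synthesize_tests(fn)
        if not tests or coverage(fn, tests) < 0.9:
            continue
        # Step 2: translate the function into the low-resource target language.
        candidate = translate(fn, target_lang)
        # Step 3: compile the tests into the target language and keep only
        # translations that pass them.
        target_tests = compile_tests(tests, target_lang)
        if passes(candidate, target_tests):
            corpus.append((candidate, target_tests))
    return corpus
```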
  2. We present an enumerative program synthesis framework called component-based refactoring that can refactor "direct" style code that does not use library components into equivalent "combinator" style code that does use library components. This framework introduces a sound but incomplete technique to check the equivalence of direct code and combinator code, called equivalence by canonicalization, that does not rely on input-output examples or logical specifications. Moreover, our approach can repurpose existing compiler optimizations, leveraging decades of research from the programming languages community. We instantiated our new synthesis framework in two contexts: (i) higher-order functional combinators such as map and filter in the statically typed functional programming language Elm and (ii) high-performance numerical computing combinators provided by the NumPy library for Python. We implemented both instantiations in a tool called Cobbler and evaluated it on thousands of real programs to test the performance of the component-based refactoring framework in terms of execution time and output quality. Our work offers evidence that synthesis-backed refactoring can apply across a range of domains without specification beyond the input program.
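To make the "direct" versus "combinator" distinction concrete, here is a small illustrative pair in the NumPy setting; the example is ours, not one from the paper, and the final assertion is only a spot check, whereas the framework itself establishes equivalence by canonicalization rather than by testing.

```python
import numpy as np


def direct_style(xs):
    # "Direct" style: explicit loop and accumulator, no library components.
    total = 0.0
    for x in xs:
        if x > 0:
            total += x * x
    return total


def combinator_style(xs):
    # "Combinator" style: the same computation built from NumPy components.
    a = np.asarray(xs, dtype=float)
    return float(np.sum(np.square(a[a > 0])))


# Spot check on one input; the framework itself checks equivalence by
# canonicalization rather than by running tests.
sample = [3.0, -1.0, 2.0, -5.0]
assert abs(direct_style(sample) - combinator_style(sample)) < 1e-9
```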
  3. Compiler fuzzing tools such as Csmith have uncovered many bugs in compilers by randomly sampling programs from a generative model. The success of these tools is often attributed to their ability to generate unexpected corner-case inputs that developers tend to overlook during manual testing. At the same time, their chaotic nature makes fuzzer-generated test cases notoriously hard to interpret, which has led to the creation of input simplification tools such as C-Reduce (for C compiler bugs). In hitherto unrelated work, researchers have also shown that human-written software tends to be rather repetitive and predictable to language models. Studies show that developers deliberately write more predictable code, whereas code with bugs is relatively unpredictable. In this study, we ask the natural question of whether this high predictability property of code also, and perhaps counter-intuitively, applies to fuzzer-generated code. That is, we investigate whether fuzzer-generated compiler inputs are deemed unpredictable by a language model built on human-written code and surprisingly conclude that they are not. On the contrary, Csmith fuzzer-generated programs are more predictable on a per-token basis than human-written C programs. Furthermore, bug-triggering inputs tended to be more predictable still than random inputs, and the C-Reduce minimization tool did not substantially increase this predictability. Rather, we find that bug-triggering inputs are unpredictable relative to Csmith's own generative model. This is encouraging; our results suggest promising research directions on incorporating predictability metrics in the fuzzing and reduction tools themselves.
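The per-token predictability measurement behind this comparison is essentially a mean negative log-likelihood under a code language model. The sketch below makes that computation concrete with a toy unigram model standing in for the real model used in the study; the two snippets and the smoothing scheme are illustrative assumptions only.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Toy unigram model with add-one smoothing (stand-in for a real code LM)."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # extra bucket so unseen tokens get nonzero probability
    return lambda tok: (counts.get(tok, 0) + 1) / (total + vocab)


def mean_nll_per_token(prob_fn, tokens):
    """Mean negative log2-likelihood per token; lower means more predictable."""
    return sum(-math.log2(prob_fn(t)) for t in tokens) / len(tokens)


# Two tiny tokenized snippets standing in for human-written and fuzzer-generated C.
human_code = "int main ( void ) { int i ; for ( i = 0 ; i < 10 ; i ++ ) ; return 0 ; }".split()
fuzzed_code = "int main ( void ) { return ( 0x7f ^ 3 ) % ( 5 | 1 ) ; }".split()

model = train_unigram(human_code)
print("human :", mean_nll_per_token(model, human_code))
print("fuzzed:", mean_nll_per_token(model, fuzzed_code))
```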
  4. Static analysis tools have demonstrated effectiveness at finding bugs in real world code. Such tools are increasingly widely adopted to improve software quality in practice. Automated Program Repair (APR) has the potential to further cut down on the cost of improving software quality. However, there is a disconnect between these effective bug-finding tools and APR. Recent advances in APR rely on test cases, making them inapplicable to newly discovered bugs or bugs difficult to test for deterministically (like memory leaks). Additionally, the quality of patches generated to satisfy a test suite is a key challenge. We address these challenges by adapting advances in practical static analysis and verification techniques to enable a new technique that finds and then accurately fixes real bugs without test cases. We present a new automated program repair technique using Separation Logic. At a high-level, our technique reasons over semantic effects of existing program fragments to fix faults related to general pointer safety properties: resource leaks, memory leaks, and null dereferences. The procedure automatically translates identified fragments into source-level patches, and verifies patch correctness with respect to reported faults. In this work we conduct the largest study of automatically fixing undiscovered bugs in real-world code to date. We demonstrate our approach by correctly fixing 55 bugs, including 11 previously undiscovered bugs, in 11 real-world projects. 
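For a sense of the fault classes involved, the illustrative snippet below shows a resource leak and a None dereference together with the kind of localized, source-level fix the technique aims to produce. It is written in Python for uniformity with the other sketches here; the paper itself targets pointer-safety faults in C-family code and verifies its patches with Separation Logic, which this toy example does not attempt.

```python
def count_lines_leaky(path):
    f = open(path)              # resource leak: the handle is never closed
    return len(f.readlines())


def count_lines_fixed(path):
    with open(path) as f:       # patched: the handle is released on every path
        return len(f.readlines())


def first_upper_unsafe(words):
    head = words[0] if words else None
    return head.upper()         # None dereference when words is empty


def first_upper_fixed(words):
    head = words[0] if words else None
    if head is None:            # patched: guard inserted before the dereference
        return ""
    return head.upper()
```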
  5. Michael Pradel (Ed.)
    Large language models have demonstrated the ability to generate both natural language and programming language text. Although contemporary code generation models are trained on corpora with several programming languages, they are tested using benchmarks that are typically monolingual. The most widely used code generation benchmarks only target Python, so there is little quantitative evidence of how code generation models perform on other programming languages. We propose MultiPL-E, a system for translating unit test-driven code generation benchmarks to new languages. We create the first massively multilingual code generation benchmark by using MultiPL-E to translate two popular Python code generation benchmarks to 18 additional programming languages. We use MultiPL-E to extend the HumanEval benchmark and MBPP benchmark to 18 languages that encompass a range of programming paradigms and popularity. Using these new parallel benchmarks, we evaluate the multi-language performance of three state-of-the-art code generation models: Codex, CodeGen, and InCoder. We find that Codex matches or even exceeds its Python performance on several other languages. The range of programming languages represented in MultiPL-E allows us to explore the impact of language frequency and language features on model performance. Finally, the MultiPL-E approach of compiling code generation benchmarks to new programming languages is both scalable and extensible, making it straightforward to evaluate new models, benchmarks, and languages.
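The core mechanism here is compiling a Python benchmark's prompt and unit tests into another language's syntax. The sketch below shows that idea for assertions only, emitting a Lua-style test harness; the function and its value-rendering rules are illustrative assumptions, not MultiPL-E's actual translator.

```python
def to_lua_tests(func_name, cases):
    """Render (args, expected) pairs as a Lua-style assertion harness."""
    def lua_value(v):
        if isinstance(v, bool):
            return "true" if v else "false"
        if isinstance(v, str):
            return '"%s"' % v
        if isinstance(v, (list, tuple)):
            return "{" + ", ".join(lua_value(x) for x in v) + "}"
        return repr(v)  # ints and floats print the same way in Lua source

    lines = ["local function test()"]
    for args, expected in cases:
        call = "%s(%s)" % (func_name, ", ".join(lua_value(a) for a in args))
        lines.append("  assert(%s == %s)" % (call, lua_value(expected)))
    lines.append("end")
    lines.append("test()")
    return "\n".join(lines)


# Example: two assertions for a hypothetical two-argument function `add`.
print(to_lua_tests("add", [((1, 2), 3), ((0, 0), 0)]))
```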