PYEVOLVE: Automating Frequent Code Changes in Python ML Systems

Dilhara, Malinda; Dig, Danny; Ketkar, Ameya

doi:10.1109/ICSE48619.2023.00091

Because of the naturalness of software and the rapid evolution of Machine Learning (ML) techniques, repeated code change patterns (CPATs) occur frequently. They range from simple API migrations to changes involving several complex control structures such as for loops. While manually performing CPATs is tedious, current state-of-the-art techniques for inferring transformation rules are not advanced enough to handle unseen variants of complex CPATs, resulting in low recall. In this paper, we present a novel, automated workflow that mines CPATs, infers the transformation rules, and then transplants them automatically to new target sites. We designed, implemented, evaluated, and released this workflow as a tool, PYEVOLVE. At its core is a novel data-flow- and control-flow-aware transformation rule inference engine. Our technique advances the state of the art for transformation-by-example tools; without it, 70% of the code changes that PYEVOLVE transforms could not be automated. Our thorough empirical evaluation of over 40,000 transformations shows 97% precision and 94% recall. By accepting 90% of the CPATs generated by PYEVOLVE in famous open-source projects, developers confirmed that its changes are useful.
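To make the notion of a CPAT involving a for loop concrete, the following is a minimal sketch of one such change being transplanted to a target site. It is not PYEVOLVE's inference engine: the rule here is written by hand rather than inferred from examples, it performs no data-flow or control-flow analysis, and it uses the built-in `sum` so that the example needs only the standard library. The class name `LoopSumToBuiltin`, the helper `apply_rule`, and the sample function `total_loss` are hypothetical names introduced for illustration.

```python
import ast


class LoopSumToBuiltin(ast.NodeTransformer):
    """Toy hand-written rule (an assumption, not an inferred rule):

        acc = 0
        for x in xs:        ->      acc = sum(xs)
            acc += x
    """

    @staticmethod
    def _matches(init, loop):
        # The first statement must be `acc = 0`.
        if not (isinstance(init, ast.Assign) and len(init.targets) == 1
                and isinstance(init.targets[0], ast.Name)
                and isinstance(init.value, ast.Constant)
                and init.value.value == 0):
            return False
        # The second must be a for loop whose only statement is `acc += x`.
        if not (isinstance(loop, ast.For) and isinstance(loop.target, ast.Name)
                and len(loop.body) == 1 and not loop.orelse):
            return False
        step = loop.body[0]
        return (isinstance(step, ast.AugAssign) and isinstance(step.op, ast.Add)
                and isinstance(step.target, ast.Name)
                and step.target.id == init.targets[0].id
                and isinstance(step.value, ast.Name)
                and step.value.id == loop.target.id)

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        new_body, i = [], 0
        while i < len(node.body):
            stmt = node.body[i]
            nxt = node.body[i + 1] if i + 1 < len(node.body) else None
            if nxt is not None and self._matches(stmt, nxt):
                acc = stmt.targets[0].id
                # Build `acc = sum(...)` and plug in the loop's iterable.
                new = ast.parse(f"{acc} = sum(PLACEHOLDER)").body[0]
                new.value.args = [nxt.iter]
                new_body.append(new)
                i += 2  # consume both the initialisation and the loop
            else:
                new_body.append(stmt)
                i += 1
        node.body = new_body
        return node


def apply_rule(source):
    """Apply the toy rule to every function in `source` (Python 3.9+)."""
    tree = LoopSumToBuiltin().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


SOURCE = '''
def total_loss(losses):
    total = 0
    for loss in losses:
        total += loss
    return total
'''

if __name__ == "__main__":
    print(apply_rule(SOURCE))
```

A purely syntactic rule like this one only matches the exact shape shown and ignores details such as whether the loop variable is used after the loop. Handling unseen variants of the same change is the gap the abstract attributes to existing rule-inference techniques.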

Award ID(s): 2213767

PAR ID: 10471901

Author(s) / Creator(s): Dilhara, Malinda; Dig, Danny; Ketkar, Ameya

Publisher / Repository: IEEE

Date Published: 2023-05-01

ISBN: 978-1-6654-5701-9

Page Range / eLocation ID: 995 to 1007

Format(s): Medium: X

Location: Melbourne, Australia

Sponsoring Org: National Science Foundation

Conference Paper: https://doi.org/10.1109/ICSE48619.2023.00091