Identifying Functionally Similar Code in Complex Codebases

Su, Fang-Hsiang; Bell, Jonathan; Kaiser, Gail; Sethumadhavan, Simha

Citation Details

Identifying similar code in software systems can assist many software engineering tasks, such as program understanding and software refactoring. While most approaches focus on identifying code that looks alike, some techniques aim to detect code that functions alike, known as functional clones. Detecting functional clones in object-oriented languages remains an open question because of the difficulty of exposing and comparing programs' functionality effectively. We propose a novel technique, In-Vivo Clone Detection, that detects functional clones in arbitrary programs by identifying and mining their inputs and outputs. The key insight is to use existing workloads to execute programs and then measure functional similarity between programs based on their inputs and outputs, which mitigates the problems in object-oriented languages reported by prior work. We implement this technique in our system, HitoshiIO, which is open source and freely available. Our experimental results show that HitoshiIO detects more than 800 functional clones across a corpus of 118 projects. In a random sample of the detected clones, HitoshiIO achieves a 68+% true positive rate with only a 15% false positive rate.
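The key insight in the abstract (run programs on an existing workload, then compare them by their observed inputs and outputs) can be illustrated with a small Python sketch. This is a toy illustration, not HitoshiIO's actual algorithm: the helper names (`io_profile`, `jaccard`), the sample functions, the workload, and the choice of exact-match Jaccard similarity are all assumptions made here for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple

# One observed record: (input values, output values). Hypothetical representation.
IORecord = Tuple[Tuple[int, ...], Tuple[int, ...]]


def io_profile(func: Callable[[List[int]], List[int]],
               workload: Iterable[Tuple[int, ...]]) -> Set[IORecord]:
    """Execute func on every workload input and record its (input, output) pairs."""
    profile: Set[IORecord] = set()
    for args in workload:
        result = func(list(args))  # pass a copy so func cannot mutate the workload
        profile.add((tuple(args), tuple(result)))
    return profile


def jaccard(a: Set[IORecord], b: Set[IORecord]) -> float:
    """Assumed similarity measure: shared I/O records over all I/O records."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def bubble_sort(xs: List[int]) -> List[int]:
    """Looks nothing like builtin_sort, but computes the same function."""
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs


def builtin_sort(xs: List[int]) -> List[int]:
    return sorted(xs)


def reverse(xs: List[int]) -> List[int]:
    """A functionally different program that agrees with sorting only by accident."""
    return xs[::-1]


# Stand-in for an "existing workload": inputs the programs already see in use.
workload = [(3, 1, 2), (5, 4), (1, 2, 3), (2, 2, 1)]

sort_a = io_profile(bubble_sort, workload)
sort_b = io_profile(builtin_sort, workload)
other = io_profile(reverse, workload)

clone_score = jaccard(sort_a, sort_b)      # identical I/O behavior on this workload
non_clone_score = jaccard(sort_a, other)   # only partial agreement

print(clone_score, non_clone_score)
```

In this sketch the two sorting routines share no source-level structure yet produce identical I/O profiles, while `reverse` agrees with them only on inputs that happen to be in descending order. A real system would need a threshold on the score and a far richer notion of inputs and outputs than integer tuples.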

Award ID(s): 1302269 1161079 0905246

PAR ID: 10112156

Author(s) / Creator(s): Su, Fang-Hsiang; Bell, Jonathan; Kaiser, Gail; Sethumadhavan, Simha

Date Published: 2016-05-16

Journal Name: 24th IEEE International Conference on Program Comprehension (ICPC)

Page Range / eLocation ID: 1 - 10

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper: The DOI is not currently available.
