Toward Optimal Selection of Information Retrieval Models for Software Engineering Tasks

Rahman, Md Masudur; Chakraborty, Saikat; Kaiser, Gail; Ray, Baishakhi


Information Retrieval (IR) plays a pivotal role in diverse Software Engineering (SE) tasks, e.g., bug localization and triaging, bug report routing, code retrieval, and requirements analysis. SE tasks operate on diverse types of documents, including code, text, stack traces, and structured, semi-structured, and unstructured metadata that often contain specialized vocabularies. As the performance of any IR-based tool critically depends on the underlying document types, and given the diversity of SE corpora, it is essential to understand which models work best for which types of SE documents and tasks. We empirically investigate the interaction between IR models and document types for two representative SE tasks (bug localization and relevant project search), carefully chosen as they require a diverse set of SE artifacts (mixtures of code and text), and confirm that the models’ performance varies significantly with the mix of document types. Leveraging this insight, we propose a generalized framework, SRCH, to automatically select the most favorable IR model(s) for a given SE task. We evaluate SRCH on these two tasks and confirm its effectiveness. Our preliminary user study shows that SRCH’s intelligent adaptation of the IR model(s) to the task at hand not only improves precision and recall for SE tasks but may also improve users’ satisfaction.

Award ID(s): 1815494; 1842456; 1563555

PAR ID: 10113741

Author(s) / Creator(s): Rahman, Md Masudur; Chakraborty, Saikat; Kaiser, Gail; Ray, Baishakhi

Date Published: 2019-09-30

Journal Name: 19th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM)

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
