NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

On Combining IR Methods to Improve Bug Localization

https://doi.org/10.1145/3387904.3389280

Khatiwada, Saket; Tushev, Miroslav; Mahmoud, Anas (July 2020, The 28th International Conference on Program Comprehension)
null (Ed.)
Information Retrieval (IR) methods have been recently employed to provide automatic support for bug localization tasks. However, for an IR-based bug localization tool to be useful, it has to achieve adequate retrieval accuracy. Lower precision and recall can leave developers with large amounts of incorrect information to wade through. To address this issue, in this paper, we systematically investigate the impact of combining various IR methods on the retrieval accuracy of bug localization engines. The main assumption is that different IR methods, targeting different dimensions of similarity between artifacts, can be used to enhance the confidence in each others' results. Five benchmark systems from different application domains are used to conduct our analysis. The results show that a) near-optimal global configurations can be determined for different combinations of IR methods, b) optimized IR-hybrids can significantly outperform individual methods as well as other unoptimized methods, and c) hybrid methods achieve their best performance when utilizing information-theoretic IR methods. Our findings can be used to enhance the practicality of IR-based bug localization tools and minimize the cognitive overload developers often face when locating bugs.
more » « less
Full Text Available
Linguistic Documentation of Software History

https://doi.org/10.1145/3387904.3389288

Tushev, M; Mahmoud, A. (July 2020, the 28th International Conference on Program Comprehension)
null (Ed.)
Open Source Software (OSS) projects start with an initial vocabulary, often determined by the first generation of developers. This vocabulary, embedded in code identifier names and internal code comments, goes through multiple rounds of change, influenced by the interrelated patterns of human (e.g., developers joining and departing) and system (e.g., maintenance activities) interactions. Capturing the dynamics of this change is crucial for understanding and synthesizing code changes over time. However, existing code evolution analysis tools, available in modern version control systems such as GitHub and SourceForge, often overlook the linguistic aspects of code evolution. To bridge this gap, in this paper, we propose to study code evolution in OSS projects through the lens of developers' language, also known as code lexicon. Our analysis is conducted using 32 OSS projects sampled from a broad range of application domains. Our results show that different maintenance activities impact code lexicon differently. These insights lay out a preliminary foundation for modeling the linguistic history of OSS projects. In the long run, this foundation will be utilized to provide support for basic program comprehension tasks and help researchers gain new insights into the complex interplay between linguistic change and various system and human aspects of OSS development.
more » « less
Full Text Available
Using GitHub in large software engineering classes. An exploratory case study

https://doi.org/10.1080/08993408.2019.1696168

Tushev, Miroslav; Williams, Grant; Mahmoud, Anas (December 2019, Computer Science Education)

Background and Context: GitHub has been recently used in Software Engineering (SE) classes to facilitate collaboration in student team projects as well as help teachers to evaluate the contributions of their students more objectively. Objective: We explore the benefits and drawbacks of using GitHub as a means for team collaboration and performance evaluation in large SE classes. Method: Our research method takes the form of a case study conducted in a senior level SE class with 91 students. Our study also includes entry and exit surveys, an exit interview, and a qualitative analysis of students’ commit behavior. Findings: Different teams adapt GitHub to their workflow differently. Furthermore, despite the steep learning curve, using GitHub should not affect the quality of students’ submissions. However, using GitHub metrics as a proxy for evaluating team performance can be risky. Implications: We provide several recommendations for integrating Web-based configuration management tools in SE classes.
more » « less
Full Text Available
Linguistic Change in Open Source Software

https://doi.org/10.1109/ICSME.2019.00045

Tushev, Miroslav; Khatiwada, Saket; Mahmoud, Anas (September 2019, 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME))

In this paper, we seek to advance the state-of-the-art in code evolution analysis research and practice by statistically analyzing, interpreting, and formally describing the evolution of code lexicon in Open Source Software (OSS). The underlying hypothesis is that, similar to natural language, code lexicon falls under the remit of evolutionary principles. Therefore, adapting theories and statistical models of natural language evolution to code is expected to provide unique insights into software evolution. Our analysis in this paper is conducted using 2,000 OSS systems sampled from a broad range of application domains. Our results show that a) OSS projects exhibit a significant shift in their linguistic identity over time, b) different syntactic structures of code lexicon evolve differently, c) different factors of OSS development and different maintenance activities impact code lexicon differently. These insights lay out a preliminary foundation for modeling the linguistic history of OSS projects. In the long run, this foundation will be utilized to provide support for basic software maintenance and program comprehension activities, and gain new theoretical insights into the complex interplay between linguistic change and various system and human aspects of OSS development.
more » « less
Full Text Available

Search for: All records