NSF PAR Search | NSF Public Access Repository

An LLM-Based Agent-Oriented Approach for Automated Code Design Issue Localization

https://doi.org/10.1109/ICSE55347.2025.00100

Batole, Fraol; OBrien, David; Nguyen, Tien N; Dyer, Robert; Rajan, Hridesh (April 2025, IEEE)

Free, publicly-accessible full text available April 26, 2026
Data-Driven Evidence-Based Syntactic Sugar Design

OBrien, David; Dyer, Robert; Nguyen, Tien; Rajan, Hridesh (April 2024, Association for Computing Machinery)

Programming languages are essential tools for developers, and their evolution plays a crucial role in supporting the activities of developers. One instance of programming language evolution is the introduction of syntactic sugars, which are additional syntax elements that provide alternative, more readable code constructs. However, the process of designing and evolving a programming language has traditionally been guided by anecdotal experiences and intuition. Recent advances in tools and methodologies for mining open-source repositories have enabled developers to make data-driven software engineering decisions. In light of this, this paper proposes an approach for motivating data-driven programming evolution by applying frequent subgraph mining techniques to a large dataset of 166,827,154 open-source Java methods. The dataset is mined by generalizing Java control-flow graphs to capture broad programming language usages and instances of duplication. Frequent subgraphs are then extracted to identify potentially impactful opportunities for new syntactic sugars. Our diverse results demonstrate the benefits of the proposed technique by identifying new syntactic sugars involving a variety of programming constructs that could be implemented in Java, thus simplifying frequent code idioms. This approach can potentially provide valuable insights for Java language designers, and serve as a proof-of-concept for data-driven programming language design and evolution.
Full Text Available
Are Prompt Engineering and TODO Comments Friends or Foes? An Evaluation on GitHub Copilot

OBrien, David; Biswas, Sumon; Imtiaz, Sayem; Abdalkareem, Rabe; Shihab, Emad; Rajan, Hridesh (April 2024, Association for Computing Machinery)

Code intelligence tools such as GitHub Copilot have begun to bridge the gap between natural language and programming language. A frequent software development task is the management of technical debts, which are suboptimal solutions or unaddressed issues which hinder future software development. Developers have been found to "self-admit" technical debts (SATD) in software artifacts such as source code comments. Thus, is it possible that the information present in these comments can enhance code generative prompts to repay the described SATD? Or, does the inclusion of such comments instead cause code generative tools to reproduce the harmful symptoms of described technical debt? Does the modification of SATD impact this reaction? Despite the heavy maintenance costs caused by technical debt and the recent improvements of code intelligence tools, no prior works have sought to incorporate SATD towards prompt engineering. Inspired by this, this paper contributes and analyzes a dataset consisting of 36,381 TODO comments in the latest available revisions of their respective 102,424 repositories, from which we sample and manually generate 1,140 code bodies using GitHub Copilot. Our experiments show that GitHub Copilot can generate code with the symptoms of SATD, both prompted and unprompted. Moreover, we demonstrate the tool's ability to automatically repay SATD under different circumstances and qualitatively investigate the characteristics of successful and unsuccessful comments. Finally, we discuss gaps in which GitHub Copilot's successors and future researchers can improve upon code intelligence tasks to facilitate AI-assisted software maintenance.
Full Text Available
Are Prompt Engineering and TODO Comments Friends or Foes? An Evaluation on GitHub Copilot

https://doi.org/10.1145/3597503.3639176

OBrien, David; Biswas, Sumon; Imtiaz, Sayem Mohammad; Abdalkareem, Rabe; Shihab, Emad; Rajan, Hridesh (April 2024, ACM)
