NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Supporting Software Maintenance with Dynamically Generated Document Hierarchies

https://doi.org/10.1109/ICSME58944.2024.00046

Dearstyne, Katherine R; Rodriguez, Alberto D; Cleland-Huang, Jane (October 2024, IEEE)

Software documentation supports a broad set of software maintenance tasks; however, creating and maintaining high-quality, multi-level software documentation can be incredibly time-consuming and therefore many code bases suffer from a lack of adequate documentation. We address this problem through presenting HGEN, a fully automated pipeline that leverages LLMs to transform source code through a series of six stages into a well-organized hierarchy of formatted documents. We evaluate HGEN both quantitatively and qualitatively. First, we use it to generate documentation for three diverse projects, and engage key developers in comparing the quality of the generated documentation against their own previously produced manually-crafted documentation. We then pilot HGEN in nine different industrial projects using diverse datasets provided by each project. We collect feedback from project stakeholders, and analyze it using an inductive approach to identify recurring themes. Results show that HGEN produces artifact hierarchies similar in quality to manually constructed documentation, with much higher coverage of the core concepts than the baseline approach. Stakeholder feedback highlights HGEN's commercial impact potential as a tool for accelerating code comprehension and maintenance tasks.
more » « less
Full Text Available
ROOT: Requirements Organization and Optimization Tool

https://doi.org/10.1109/ICSME58944.2024.00096

Dearstyne, Katherine R; Rodriguez, Alberto D; Cleland-Huang, Jane (October 2024, IEEE)

Software engineering practices such as constructing requirements and establishing traceability help ensure systems are safe, reliable, and maintainable. However, they can be resource-intensive and are frequently underutilized. To alleviate the burden of these essential processes, we developed the Requirements Organization and Optimization Tool (ROOT). ROOT centralizes project information and offers project visualizations and AI-based tools designed to streamline engineering processes. With ROOT's assistance, engineers benefit from improved oversight and early error detection, leading to the successful development of software systems.
more » « less
Full Text Available
ROOT: Requirements Organization and Optimization Tool

Dearstyne, Katherine R; Rodriguez, Alberto D; Cleland-Huang, Jane (October 2024, IEEE)

Software engineering practices such as constructing requirements and establishing traceability help ensure systems are safe, reliable, and maintainable. However, they can be resource-intensive and are frequently underutilized. To alleviate the burden of these essential processes, we developed the Requirements Organization and Optimization Tool (ROOT). ROOT centralizes project information and offers project visualizations and AI-based tools designed to streamline engineering processes. With ROOT's assistance, engineers benefit from improved oversight and early error detection, leading to the successful development of software systems. A link to a screen cast can be found at: https://youtu.be/3rtMYRnsu24
more » « less
Full Text Available
Supporting Software Maintenance with Dynamically Generated Document Hierarchies

Dearstyne, Katherine R; Rodriguez, Alberto D; Cleland-Huang, Jane (October 2024, IEEE)

Software documentation supports a broad set of software maintenance tasks; however, creating and maintaining high-quality, multi-level software documentation can be incredibly time-consuming and therefore many code bases suffer from a lack of adequate documentation. We address this problem through presenting HGEN, a fully automated pipeline that leverages LLMs to transform source code through a series of six stages into a well-organized hierarchy of formatted documents. We evaluate HGEN both quantitatively and qualitatively. First, we use it to generate documentation for three diverse projects, and engage key developers in comparing the quality of the generated documentation against their own previously produced manually-crafted documentation. We then pilot HGEN in nine different industrial projects using diverse datasets provided by each project. We collect feedback from project stakeholders, and analyze it using an inductive approach to identify recurring themes. Results show that HGEN produces artifact hierarchies similar in quality to manually constructed documentation, with much higher coverage of the core concepts than the baseline approach. Stakeholder feedback highlights HGEN's commercial impact potential as a tool for accelerating code comprehension and maintenance tasks. Results and associated supplemental materials can be found at https://zenodo.org/records/11403244.
more » « less
Full Text Available
Prompts Matter: Insights and Strategies for Prompt Engineering in Automated Software Traceability

https://doi.org/10.1109/REW57809.2023.00087

Rodriguez, Alberto D.; Dearstyne, Katherine R.; Cleland-Huang, Jane (September 2023, IEEE)
IEEE Requirements Engineering Conference (Ed.)
Large Language Models (LLMs) have the potential to revolutionize automated traceability by overcoming the challenges faced by previous methods and introducing new possibilities. However, the optimal utilization of LLMs for automated traceability remains unclear. This paper explores the process of prompt engineering to extract link predictions from an LLM. We provide detailed insights into our approach for constructing effective prompts, offering our lessons learned. Additionally, we propose multiple strategies for leveraging LLMs to generate traceability links, improving upon previous zero-shot methods on the ranking of candidate links after prompt refinement. The primary objective of this paper is to inspire and assist future researchers and engineers by highlighting the process of constructing traceability prompts to effectively harness LLMs for advancing automatic traceability.
more » « less
Full Text Available
SAFA: A Tool for Supporting Safety Analysis in Evolving Software Systems

https://doi.org/10.1145/3551349.3559535

Rodriguez, Alberto D.; Newman, Timothy; Dearstyne, Katherine R.; Cleland-Huang, Jane (October 2022, 37th {IEEE/ACM} International Conference on Automated Software Engineering, {ASE} 2022, Rochester, MI, USA, October 10-14, 2022)

Many organizations seek to increase their agility in order to deliver more timely and competitive products. However, in safety-critical systems such as medical devices, autonomous vehicles, or factory floor robots, the release of new features has the potential to introduce hazards that potentially lead to run-time failures that impact software safety. As a result, many projects suffer from a phenomenon referred to as the big freeze. SAFA is designed to address this challenge. Through the use of cutting-edge deep-learning solutions, it generates trees of requirements, designs, code, tests, and other artifacts that visually depict how hazards are mitigated in the system, and it automatically warns the user when key artifacts are missing. It also uses a combination of colors, annotations, and recommendations to dynamically visualize change across software versions and augments safety cases with visual annotations to aid users in detecting and analyzing potentially adverse impacts of change upon system safety. A link to our tool demo can be found at https://www.youtube.com/watch?v=r-CwxerbSVA.
more » « less
Leveraging Intermediate Artifacts to Improve Automated Trace Link Retrieval

https://doi.org/10.1109/ICSME52107.2021.00014

Rodriguez, Alberto D.; Cleland-Huang, Jane; Falessi, Davide (September 2021, IEEE International Conference on Software Maintenance and Evolution)

Software traceability establishes a network of connections between diverse artifacts such as requirements, design, and code. However, given the cost and effort of creating and maintaining trace links manually, researchers have proposed automated approaches using information retrieval techniques. Current approaches focus almost entirely upon generating links between pairs of artifacts and have not leveraged the broader network of interconnected artifacts. In this paper we investigate the use of intermediate artifacts to enhance the accuracy of the generated trace links - focusing on paths consisting of source, target, and intermediate artifacts. We propose and evaluate combinations of techniques for computing semantic similarity, scaling scores across multiple paths, and aggregating results from multiple paths. We report results from five projects, including one large industrial project. We find that leveraging intermediate artifacts improves the accuracy of end-to-end trace retrieval across all datasets and accuracy metrics. After further analysis, we discover that leveraging intermediate artifacts is only helpful when a project's artifacts share a common vocabulary, which tends to occur in refinement and decomposition hierarchies of artifacts. Given our hybrid approach that integrates both direct and transitive links, we observed little to no loss of accuracy when intermediate artifacts lacked a shared vocabulary with source or target artifacts.
more » « less
Full Text Available

Search for: All records