Creators/Authors contains: "Rodriguez, Alberto D"


  1. Software documentation supports a broad set of software maintenance tasks; however, creating and maintaining high-quality, multi-level software documentation can be very time-consuming, so many code bases lack adequate documentation. We address this problem by presenting HGEN, a fully automated pipeline that leverages LLMs to transform source code, through a series of six stages, into a well-organized hierarchy of formatted documents. We evaluate HGEN both quantitatively and qualitatively. First, we use it to generate documentation for three diverse projects and engage key developers in comparing the quality of the generated documentation against their own manually crafted documentation. We then pilot HGEN in nine different industrial projects using diverse datasets provided by each project. We collect feedback from project stakeholders and analyze it using an inductive approach to identify recurring themes. Results show that HGEN produces artifact hierarchies similar in quality to manually constructed documentation, with much higher coverage of the core concepts than the baseline approach. Stakeholder feedback highlights HGEN's commercial impact potential as a tool for accelerating code comprehension and maintenance tasks. Results and associated supplemental materials can be found at https://zenodo.org/records/11403244.
    Free, publicly-accessible full text available October 6, 2025
  2. Software engineering practices such as constructing requirements and establishing traceability help ensure systems are safe, reliable, and maintainable. However, they can be resource-intensive and are frequently underutilized. To alleviate the burden of these essential processes, we developed the Requirements Organization and Optimization Tool (ROOT). ROOT centralizes project information and offers project visualizations and AI-based tools designed to streamline engineering processes. With ROOT's assistance, engineers benefit from improved oversight and early error detection, which support the successful development of software systems. A link to a screencast can be found at: https://youtu.be/3rtMYRnsu24
    Free, publicly-accessible full text available October 6, 2025
  3. IEEE Requirements Engineering Conference (Ed.)
    Large Language Models (LLMs) have the potential to revolutionize automated traceability by overcoming the challenges faced by previous methods and introducing new possibilities. However, the optimal utilization of LLMs for automated traceability remains unclear. This paper explores the process of prompt engineering to extract link predictions from an LLM. We provide detailed insights into our approach for constructing effective prompts and share our lessons learned. Additionally, we propose multiple strategies for leveraging LLMs to generate traceability links; after prompt refinement, these strategies improve upon previous zero-shot methods at ranking candidate links. The primary objective of this paper is to inspire and assist future researchers and engineers by highlighting the process of constructing traceability prompts to effectively harness LLMs for advancing automatic traceability.
  4. Many organizations seek to increase their agility in order to deliver more timely and competitive products. However, in safety-critical systems such as medical devices, autonomous vehicles, or factory floor robots, the release of new features can introduce hazards that may lead to run-time failures and compromise software safety. As a result, many projects suffer from a phenomenon referred to as the big freeze. SAFA is designed to address this challenge. Using deep-learning solutions, it generates trees of requirements, designs, code, tests, and other artifacts that visually depict how hazards are mitigated in the system, and it automatically warns the user when key artifacts are missing. It also uses a combination of colors, annotations, and recommendations to dynamically visualize change across software versions, and it augments safety cases with visual annotations to help users detect and analyze potentially adverse impacts of change upon system safety. A link to our tool demo can be found at https://www.youtube.com/watch?v=r-CwxerbSVA.
  5. Software traceability establishes a network of connections between diverse artifacts such as requirements, design, and code. However, given the cost and effort of creating and maintaining trace links manually, researchers have proposed automated approaches using information retrieval techniques. Current approaches focus almost entirely upon generating links between pairs of artifacts and have not leveraged the broader network of interconnected artifacts. In this paper we investigate the use of intermediate artifacts to enhance the accuracy of the generated trace links, focusing on paths consisting of source, target, and intermediate artifacts. We propose and evaluate combinations of techniques for computing semantic similarity, scaling scores across multiple paths, and aggregating results from multiple paths. We report results from five projects, including one large industrial project. We find that leveraging intermediate artifacts improves the accuracy of end-to-end trace retrieval across all datasets and accuracy metrics. After further analysis, we discover that leveraging intermediate artifacts is only helpful when a project's artifacts share a common vocabulary, which tends to occur in refinement and decomposition hierarchies of artifacts. Because our hybrid approach integrates both direct and transitive links, we observed little to no loss of accuracy when intermediate artifacts lacked a shared vocabulary with source or target artifacts.
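
The path-based trace retrieval described in item 5 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: bag-of-words cosine similarity stands in for the semantic similarity measure, a path score is the product of its two hop similarities, multiple paths are aggregated with `max`, and the hybrid score is the larger of the direct and transitive scores. The function names `cosine`, `transitive_score`, and `hybrid_score` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (an assumed stand-in for semantic similarity)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def transitive_score(source: str, target: str, intermediates: list[str]) -> float:
    """Score each source -> intermediate -> target path, then aggregate.

    Assumed choices: a path score is the product of its two hop
    similarities, and paths are aggregated by taking the maximum.
    """
    path_scores = [cosine(source, mid) * cosine(mid, target) for mid in intermediates]
    return max(path_scores, default=0.0)


def hybrid_score(source: str, target: str, intermediates: list[str]) -> float:
    """Combine direct and transitive evidence (assumed here: take the larger)."""
    return max(cosine(source, target), transitive_score(source, target, intermediates))


# Hypothetical artifacts: a requirement, a design element, and a code summary.
requirement = "stop pump on high pressure"
design = "pressure monitor halts pump motor"
code_summary = "halt motor controller"

direct = cosine(requirement, code_summary)
hybrid = hybrid_score(requirement, code_summary, [design])
print(f"direct={direct:.3f} hybrid={hybrid:.3f}")
```

In this toy example the requirement and the code summary share no terms, so the direct score is zero, while the design element shares vocabulary with both and yields a nonzero transitive score. Because the hybrid score never falls below the direct score, an intermediate artifact with no shared vocabulary cannot lower the result in this sketch.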