Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Software documentation supports a broad set of software maintenance tasks; however, creating and maintaining high-quality, multi-level software documentation can be incredibly time-consuming and therefore many code bases suffer from a lack of adequate documentation. We address this problem through presenting HGEN, a fully automated pipeline that leverages LLMs to transform source code through a series of six stages into a well-organized hierarchy of formatted documents. We evaluate HGEN both quantitatively and qualitatively. First, we use it to generate documentation for three diverse projects, and engage key developers in comparing the quality of the generated documentation against their own previously produced manually-crafted documentation. We then pilot HGEN in nine different industrial projects using diverse datasets provided by each project. We collect feedback from project stakeholders, and analyze it using an inductive approach to identify recurring themes. Results show that HGEN produces artifact hierarchies similar in quality to manually constructed documentation, with much higher coverage of the core concepts than the baseline approach. Stakeholder feedback highlights HGEN's commercial impact potential as a tool for accelerating code comprehension and maintenance tasks. Results and associated supplemental materials can be found at https://zenodo.org/records/11403244.more » « lessFree, publicly-accessible full text available October 6, 2025
-
IEEE Requirements Engineering Conference (Ed.)Large Language Models (LLMs) have the potential to revolutionize automated traceability by overcoming the challenges faced by previous methods and introducing new possibilities. However, the optimal utilization of LLMs for automated traceability remains unclear. This paper explores the process of prompt engineering to extract link predictions from an LLM. We provide detailed insights into our approach for constructing effective prompts, offering our lessons learned. Additionally, we propose multiple strategies for leveraging LLMs to generate traceability links, improving upon previous zero-shot methods on the ranking of candidate links after prompt refinement. The primary objective of this paper is to inspire and assist future researchers and engineers by highlighting the process of constructing traceability prompts to effectively harness LLMs for advancing automatic traceability.more » « less
-
Many organizations seek to increase their agility in order to deliver more timely and competitive products. However, in safety-critical systems such as medical devices, autonomous vehicles, or factory floor robots, the release of new features has the potential to introduce hazards that potentially lead to run-time failures that impact software safety. As a result, many projects suffer from a phenomenon referred to as the big freeze. SAFA is designed to address this challenge. Through the use of cutting-edge deep-learning solutions, it generates trees of requirements, designs, code, tests, and other artifacts that visually depict how hazards are mitigated in the system, and it automatically warns the user when key artifacts are missing. It also uses a combination of colors, annotations, and recommendations to dynamically visualize change across software versions and augments safety cases with visual annotations to aid users in detecting and analyzing potentially adverse impacts of change upon system safety. A link to our tool demo can be found at https://www.youtube.com/watch?v=r-CwxerbSVA.more » « less
-
Recent breakthroughs in deep-learning (DL) approaches have resulted in the dynamic generation of trace links that are far more accurate than was previously possible. However, DL-generated links lack clear explanations, and therefore non-experts in the domain can find it difficult to understand the underlying semantics of the link, making it hard for them to evaluate the link's correctness or suitability for a specific software engineering task. In this paper we present a novel NLP pipeline for generating and visualizing trace link explanations. Our approach identifies domain-specific concepts, retrieves a corpus of concept-related sentences, mines concept definitions and usage examples, and identifies relations between cross-artifact concepts in order to explain the links. It applies a post-processing step to prioritize the most likely acronyms and definitions and to eliminate non-relevant ones. We evaluate our approach using project artifacts from three different domains of interstellar telescopes, positive train control, and electronic healthcare systems, and then report coverage, correctness, and potential utility of the generated definitions. We design and utilize an explanation interface which leverages concept definitions and relations to visualize and explain trace link rationales, and we report results from a user study that was conducted to evaluate the effectiveness of the explanation interface. Results show that the explanations presented in the interface helped non-experts to understand the underlying semantics of a trace link and improved their ability to vet the correctness of the link.more » « less
-
null (Ed.)With the rise of new AI technologies, autonomous systems are moving towards a paradigm in which increasing levels of responsibility are shifted from the human to the system, creating a transition from human-in-the-loop systems to human-on-the-loop (HoTL) systems. This has a significant impact on the safety analysis of such systems, as new types of errors occurring at the boundaries of human-machine interactions need to be taken into consideration. Traditional safety analysis typically focuses on system-level hazards with little focus on user-related or user-induced hazards that can cause critical system failures. To address this issue, we construct domain-level safety analysis assets for sUAS (small unmanned aerial systems) applications and describe the process we followed to explicitly, and systematically identify Human Interaction Points (HiPs), Hazard Factors and Mitigations from system hazards. We evaluate our approach by first investigating the extent to which recent sUAS incidents are covered by our hazard trees, and second by performing a study with six domain experts using our hazard trees to identify and document hazards for sUAS usage scenarios. Our study showed that our hazard trees provided effective coverage for a wide variety of sUAS application scenarios and were useful for stimulating safety thinking and helping users to identify and potentially mitigate human-interaction hazards.more » « less
-
null (Ed.)Regulations, standards, and guidelines for safety-critical systems stipulate stringent traceability but do not prescribe the corresponding, detailed software engineering process. Given the industrial practice of using only semi-formal notations to describe engineering processes, processes are rarely ``executable'' and developers have to spend significant manual effort in ensuring that they follow the steps mandated by quality assurance. The size and complexity of systems and regulations makes manual, timely feedback from Quality Assurance (QA) engineers infeasible. In this paper we propose a novel framework for tracking processes in the background, automatically checking QA constraints depending on process progress, and informing the developer of unfulfilled QA constraints. We evaluate our approach by applying it to two different case studies; one open source community system and a safety-critical system in the air-traffic control domain. Results from the analysis show that trace links are often corrected or completed after the fact and thus timely and automated constraint checking support has significant potential on reducing rework.more » « less