Debugging is a critical but challenging task for programmers. This paper proposes ChatDBG, an AI-powered debugging assistant. ChatDBG integrates large language models (LLMs) to significantly enhance the capabilities and user-friendliness of conventional debuggers. ChatDBG lets programmers engage in a collaborative dialogue with the debugger, allowing them to pose complex questions about program state, perform root cause analysis for crashes or assertion failures, and explore open-ended queries like why is x null?. To handle these queries, ChatDBG grants the LLM autonomy to take the wheel: it can act as an independent agent capable of querying and controlling the debugger to navigate through stacks and inspect program state. It then reports its findings and yields back control to the programmer. By leveraging the real-world knowledge embedded in LLMs, ChatDBG can diagnose issues identifiable only through the use of domain-specific reasoning. Our ChatDBG prototype integrates with standard debuggers including LLDB and GDB for native code and Pdb for Python. Our evaluation across a diverse set of code, including C/C++ code with known bugs and a suite of Python code including standalone scripts and Jupyter notebooks, demonstrates that ChatDBG can successfully analyze root causes, explain bugs, and generate accurate fixes for a wide range of real-world errors. For the Python programs, a single query led to an actionable bug fix 67% of the time; one additional follow-up query increased the success rate to 85%. ChatDBG has seen rapid uptake; it has already been downloaded more than 75,000 times.
more »
« less
Understanding and Supporting Debugging Workflows in Multiverse Analysis
Multiverse analysis—a paradigm for statistical analysis that considers all combinations of reasonable analysis choices in parallel—promises to improve transparency and reproducibility. Although recent tools help analysts specify multiverse analyses, they remain difficult to use in practice. In this work, we identify debugging as a key barrier due to the latency from running analyses to detecting bugs and the scale of metadata processing needed to diagnose a bug. To address these challenges, we prototype a command-line interface tool, Multiverse Debugger, which helps diagnose bugs in the multiverse and propagate fixes. In a qualitative lab study (n=13), we use Multiverse Debugger as a probe to develop a model of debugging workflows and identify specific challenges, including difficulty in understanding the multiverse’s composition. We conclude with design implications for future multiverse analysis authoring systems.
more »
« less
- Award ID(s):
- 1901386
- PAR ID:
- 10435880
- Date Published:
- Journal Name:
- CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
- Page Range / eLocation ID:
- 1 to 19
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Multiverse analyses involve conducting all combinations of reasonable choices in a data analysis process. A reader of a study containing a multiverse analysis might question—are all the choices included in the multiverse reasonable and equally justifiable? How much do results vary if we make different choices in the analysis process? In this work, we identify principles for validating the composition of, and interpreting the uncertainty in, the results of a multiverse analysis. We present Milliways, a novel interactive visualisation system to support principled evaluation of multiverse analyses. Milliways provides interlinked panels presenting result distributions, individual analysis composition, multiverse code specification, and data summaries. Milliways supports interactions to sort, filter and aggregate results based on the analysis specification to identify decisions in the analysis process to which the results are sensitive. To represent the two qualitatively different types of uncertainty that arise in multiverse analyses—probabilistic uncertainty from estimating unknown quantities of interest such as regression coefficients, and possibilistic uncertainty from choices in the data analysis—Milliways uses consonance curves and probability boxes. Through an evaluative study with five users familiar with multiverse analysis, we demonstrate how Milliways can support multiverse analysis tasks, including a principled assessment of the results of a multiverse analysis.more » « less
-
Today’s STEM classrooms have expanded the domain of computer science education from a basic two-toned terminal screen to now include helpful Integrated Development Environments(IDE) (BlueJ, Eclipse), block-based programming (MIT Scratch, Greenfoot), and even physical computing with embedded systems (Arduino, LEGO Mindstorm). But no matter which environment a student starts programming in, all students will eventually need help in finding and fixing bugs in their code. While the helpful IDE’s have debugger tools built in (breakpoints for pausing your program, ways to view/modify variable values, and "stepping" through code execution), in many of the other programming environments, students are limited to using print statements to try and "see" what is happening inside their program. Most students who learn to write code for Arduino microcontrollers will start within the Arduino IDE, but the official Arduino IDE does not currently provide any debugging tools. Instead, a student would have to move on to a professional IDE such as Atmel Studio or acquire a hardware debugger in order to add breakpoints or view their program’s variables. But each of these options has a steep learning curve, additional costs, and can require complex configurations. Based on research of student debugging practices[3, 7] and our own classroom observations, we have developed an Arduino software library, called Arduino Debugger, which provides some of these debugging tools (ex. breakpoints) while staying within the official Arduino IDE. This work continues a previous library, (redacted), which focused on features specific to e-textiles development boards. The Arduino Debugger library has been modified to support not only e-textile boards (Lilypad, Adafruit Circuit Playground) but most AVR and ARM based Arduino boards.We are also in the process of testing a set of Debugging Code Templates to see how they might increase student adoption of debugging tools.more » « less
-
Deep neural networks (DNNs) are increasingly used in critical applications like autonomous vehicles and medical diagnosis, where accuracy and reliability are crucial. However, debugging DNNs is challenging and expensive, often leading to unpredictable behavior and performance issues. Identifying and diagnosing bugs in DNNs is difficult due to complex and obscure failure symptoms, which are data-driven and compute-intensive. To address this, we propose TransBug a framework that combines transformer models for feature extraction with deep learning models for classification to detect and diagnose bugs in DNNs. We employ a pre-trained transformer model, which has been trained in programming languages, to extract semantic features from both faulty and correct DNN models. We then use these extracted features in a separate deep-learning model to determine whether the code contains bugs. If a bug is detected, the model further classifies the type of bug. By leveraging the powerful feature extraction capabilities of transformers, we capture relevant characteristics from the code, which are then used by a deep learning model to identify and classify various types of bugs. This combination of transformer-based feature extraction and deep learning classification allows our method to accurately link bug symptoms to their causes, enabling developers to take targeted corrective actions. Empirical results show that the TransBug shows an accuracy of 81% for binary classification and 91% for classifying bug types.more » « less
-
Many big data systems are written in languages such as C, C++, Java, and Scala to process large amounts of data efficiently, while data analysts often use Python to conduct data wrangling, statistical analysis, and machine learning. User-defined functions (UDFs) are commonly used in these systems to bridge the gap between the two ecosystems. In this paper, we propose Udon, a novel debugger to support fine-grained debugging of UDFs. Udon encapsulates the modern line-by-line debugging primitives, such as the ability to set breakpoints, perform code inspections, and make code modifications while executing a UDF on a single tuple. It includes a novel debug-aware UDF execution model to ensure the responsiveness of the operator during debugging. It utilizes advanced state-transfer techniques to satisfy breakpoint conditions that span across multiple UDFs. It incorporates various optimization techniques to reduce the runtime overhead. We conduct experiments with multiple UDF workloads on various datasets and show its high efficiency and scalability.more » « less
An official website of the United States government

