Bug report reproduction is an important, but time-consuming task carried out during mobile app maintenance. To accelerate this task, current research has proposed automated reproduction techniques that rely on a guided dynamic exploration of the app to match bug report steps with UI events in a mobile app. However, these techniques struggle to find the correct match when the bug reports have missing or inaccurately described steps. To address these limitations, we propose a new bug report reproduction technique that uses an app’s UI model to perform a global search across all possible matches between steps and UI actions and identify the most likely match while accounting for the possibility of missing or inaccurate steps. To do this, our approach redefines the bug report reproduction process as a Markov model and finds the best paths through the model using a dynamic programming based technique. We conducted an empirical evaluation on 72 real-world bug reports. Our approach achieved a 94% reproduction rate on the total bug reports and a 93% reproduction rate on bug reports with missing steps, significantly outperforming the state-of-the-art approaches. Our approach was also more effective in finding the matches from the steps to UI events than the state-of-the-art approaches.
more »
« less
This content will become publicly available on June 19, 2026
Automated Recognition of Buggy Behaviors from Mobile Bug Reports
Bug report reproduction is a crucial but time-consuming task to be carried out during mobile app maintenance. To accelerate this process, researchers have developed automated techniques for reproducing mobile app bug reports. However, due to the lack of an effective mechanism to recognize different buggy behaviors described in the report, existing work is limited to reproducing crash bug reports, or requires developers to manually analyze execution traces to determine if a bug was successfully reproduced. To address this limitation, we introduce a novel technique to automatically identify and extract the buggy behavior from the bug report and detect it during the automated reproduction process. To accommodate various buggy behaviors of mobile app bugs, we conducted an empirical study and created a standardized representation for expressing the bug behavior identified from our study. Given a report, our approach first transforms the documented buggy behavior into this standardized representation, then matches it against real-time device and UI information during the reproduction to recognize the bug. Our empirical evaluation demonstrated that our approach achieved over 90% precision and recall in generating the standardized representation of buggy behaviors. It correctly identified bugs in 83% of the bug reports and enhanced existing reproduction techniques, allowing them to reproduce four times more bug reports.
more »
« less
- Award ID(s):
- 2211454
- PAR ID:
- 10624111
- Publisher / Repository:
- ACM Digital Library
- Date Published:
- Journal Name:
- Proceedings of the ACM on Software Engineering
- Volume:
- 2
- Issue:
- FSE
- ISSN:
- 2994-970X
- Page Range / eLocation ID:
- 2240 to 2263
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
One of the most important tasks related to managing bug reports is localizing the fault so that a fix can be applied. As such, prior work has aimed to automate this task of bug localization by formulating it as an information retrieval problem, where potentially buggy files are retrieved and ranked according to their textual similarity with a given bug report. However, there is often a notable semantic gap between the information contained in bug reports and identifiers or natural language contained within source code files. For user-facing software, there is currently a key source of information that could aid in bug localization, but has not been thoroughly investigated - information from the GUI. We investigate the hypothesis that, for end user-facing applications, connecting information in a bug report with information from the GUI, and using this to aid in retrieving potentially buggy files, can improve upon existing techniques for bug localization. To examine this phenomenon, we conduct a comprehensive empirical study that augments four baseline techniques for bug localization with GUI interaction information from a reproduction scenario to (i) filter out potentially irrelevant files, (ii) boost potentially relevant files, and (iii) reformulate text-retrieval queries. To carry out our study, we source the current largest dataset of fully-localized and reproducible real bugs for Android apps, with corresponding bug reports, consisting of 80 bug reports from 39 popular open-source apps. Our results illustrate that augmenting traditional techniques with GUI information leads to a marked increase in effectiveness across multiple metrics, including a relative increase in Hits@10 of 13-18%. Additionally, through further analysis, we find that our studied augmentations largely complement existing techniques.more » « less
-
Enterprise software updates depend on the interaction between user and developer organizations. This interaction becomes especially complex when a single developer organization writes software that services hundreds of different user organizations. Miscommunication during patching and deployment efforts lead to insecure or malfunctioning software installations. While developers oversee the code, the update process starts and ends outside their control. Since developer test suites may fail to capture buggy behavior finding and fixing these bugs starts with user generated bug reports and 3rd party disclosures. The process ends when the fixed code is deployed in production. Any friction between user, and developer results in a delay patching critical bugs. Two common causes for friction are a failure to replicate user specific circumstances that cause buggy behavior and incompatible software releases that break critical functionality. Existing test generation techniques are insufficient. They fail to test candidate patches for post-deployment bugs and to test whether the new release adversely effects customer workloads. With existing test generation and deployment techniques, users can't choose (nor validate) compatible portions of new versions and retain their previous version's functionality. We present two new technologies to alleviate this friction. First, Test Generation for Ad Hoc Circumstances transforms buggy executions into test cases. Second, Binary Patch Decomposition allows users to select the compatible pieces of update releases. By sharing specific context around buggy behavior and developers can create specific test cases that demonstrate if their fixes are appropriate. When fixes are distributed by including extra context users can incorporate only updates that guarantee compatibility between buggy and fixed versions. We use change analysis in combination with binary rewriting to transform the old executable and buggy execution into a test case including the developer's prospective changes that let us generate and run targeted tests for the candidate patch. We also provide analogous support to users, to selectively validate and patch their production environments with only the desired bug-fixes from new version releases. This paper presents a new patching workflow that allows developers to validate prospective patches and users to select which updates they would like to apply, along with two new technologies that make it possible. We demonstrate our technique constructs tests cases more effectively and more efficiently than traditional test case generation on a collection of real world bugs compared to traditional test generation techniques, and provides the ability for flexible updates in real world scenarios.more » « less
-
The large demand of mobile devices creates significant concerns about the quality of mobile applications (apps). Developers heavily rely on bug reports in issue tracking systems to reproduce failures (e.g., crashes). However, the process of crash reproduction is often manually done by developers, making the resolution of bugs inefficient, especially given that bug reports are often written in natural language. To improve the productivity of developers in resolving bug reports, in this paper, we introduce a novel approach, called ReCDroid+, that can automatically reproduce crashes from bug reports for Android apps. ReCDroid+ uses a combination of natural language processing (NLP) , deep learning, and dynamic GUI exploration to synthesize event sequences with the goal of reproducing the reported crash. We have evaluated ReCDroid+ on 66 original bug reports from 37 Android apps. The results show that ReCDroid+ successfully reproduced 42 crashes (63.6% success rate) directly from the textual description of the manually reproduced bug reports. A user study involving 12 participants demonstrates that ReCDroid+ can improve the productivity of developers when resolving crash bug reports.more » « less
-
Gonnord, Laure; Titolo, Laura (Ed.)Just-in-Time (JIT) compilers are widely used to improve the performance of interpreter-based language implementations by creating optimized code at runtime. However, bugs in the JIT compiler’s code manipulation and optimization can result in the generation of incorrect code. Such bugs can be difficult to diagnose and fix, and can result in exploitable vulnerabilities. Unfortunately, existing approaches to automatic bug localization do not carry over well to such bugs. This paper discusses a different approach to analyzing JIT compiler optimization behaviors, based on using dynamic analysis to construct abstract models of the JIT compiler’s optimizer and back end. By comparing the models obtained for buggy and non-buggy executions of the JIT compiler, we can pinpoint the components of the JIT compiler’s internal representation that have been affected by the bug; this can then be mapped back to identify the buggy code. Our ex- periments with two real bugs for Google V8 JIT compiler, TurboFan, show the utility and practicality of our approach.more » « less
An official website of the United States government
