skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: The untold impact of learning approaches on software fault-proneness predictions: an analysis of temporal aspects
This paper aims to improve software fault-proneness prediction by investigating the unex- plored effects on classification performance of the temporal decisions made by practitioners and researchers regarding (i) the interval for which they will collect longitudinal features (soft- ware metrics data), and (ii) the interval for which they will predict software bugs (the target variable). We call these specifics of the data used for training and of the target variable being predicted the learning approach, and explore the impact of the two most common learning approaches on the performance of software fault-proneness prediction, both within a single release of a software product and across releases. The paper presents empirical results from a study based on data extracted from 64 releases of twelve open-source projects. Results show that the learning approach has a substantial, and typically unacknowledged, impact on classi- fication performance. Specifically, we show that one learning approach leads to significantly better performance than the other, both within-release and across-releases. Furthermore, this paper uncovers that, for within-release predictions, the difference in classification perfor- mance is due to different levels of class imbalance in the two learning approaches. Our findings show that improved specification of the learning approach is essential to under- standing and explaining the performance of fault-proneness prediction models, as well as to avoiding misleading comparisons among them. The paper concludes with some practical recommendations and research directions based on our findings toward improved software fault-proneness prediction. This preprint has not undergone any post-submission improvements or corrections. The Version of Record of this article is published in Empirical Software Engineering, and is available online at https://doi.org/10.1007/s10664-024-10454-8.  more » « less
Award ID(s):
2211589
PAR ID:
10516090
Author(s) / Creator(s):
; ;
Publisher / Repository:
Springer Nature
Date Published:
Journal Name:
Empirical Software Engineering
Volume:
29
Issue:
4
ISSN:
1382-3256
Subject(s) / Keyword(s):
Software fault-proneness prediction Learning approach Within-release prediction Across-release prediction Machine learning Design of experiments
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Jensen, Anton; Lewis, Grace A. (Ed.)
    Architectural decay imposes real costs in terms of developer effort, system correctness, and performance. Over time, those problems are likely to be revealed as explicit implementation issues (defects, feature changes, etc.). Recent empirical studies have demonstrated that there is a significant correlation between architectural “smells”—manifestations of architectural decay—and implementation issues. In this paper, we take a step further in exploring this phenomenon. We analyze the available development data from 10 open-source software systems and show that information regarding current architectural decay in these systems can be used to build models that accurately predict future issue-proneness and change-proneness of the systems’ implementations. As a less intuitive result, we also show that, in cases where historical data for a system is unavailable, such data from other, unrelated systems can provide reasonably accurate issue- and change-proneness prediction capabilities. 
    more » « less
  2. Learning-based fault localization has been intensively studied recently. Prior studies have shown that traditional Learning-to-Rank techniques can help precisely diagnose fault locations using various dimensions of fault-diagnosis features, such as suspiciousness values computed by various off-the-shelf fault localization techniques. However, with the increasing dimensions of features considered by advanced fault localization techniques, it can be quite challenging for the traditional Learning-to-Rank algorithms to automatically identify effective existing/latent features. In this work, we propose DeepFL, a deep learning approach to automatically learn the most effective existing/latent features for precise fault localization. Although the approach is general, in this work, we collect various suspiciousness-value-based, fault-proneness-based and textual-similarity-based features from the fault localization, defect prediction and information retrieval areas, respectively. DeepFL has been studied on 395 real bugs from the widely used Defects4J benchmark. The experimental results show DeepFL can significantly outperform state-of-the-art TraPT/FLUCCS (e.g., localizing 50+ more faults within Top-1). We also investigate the impacts of deep model configurations (e.g., loss functions and epoch settings) and features. Furthermore, DeepFL is also surprisingly effective for cross-project prediction. 
    more » « less
  3. In recent years, there is a growing need to train machine learning models on a huge volume of data. Therefore, designing efficient distributed optimization algorithms for empirical risk minimization (ERM) has become an active and challenging research topic. In this paper, we propose a flexible framework for distributed ERM training through solving the dual problem, which provides a unified description and comparison of existing methods. Our approach requires only approximate solutions of the sub-problems involved in the optimization process, and is versatile to be applied on many large-scale machine learning problems including classification, regression, and structured prediction. We show that our framework enjoys global linear convergence for a broad class of non-strongly-convex problems, and some specific choices of the sub-problems can even achieve much faster convergence than existing approaches by a refined analysis. This improved convergence rate is also reflected in the superior empirical performance of our method. 
    more » « less
  4. null (Ed.)
    With the increased interest to incorporate machine learning into software and systems, methods to characterize the impact of the reliability of machine learning are needed to ensure the reliability of the software and systems in which these algorithms reside. Towards this end, we build upon the architecture-based approach to software reliability modeling, which represents application reliability in terms of the component reliabilities and the probabilistic transitions between the components. Traditional architecture-based software reliability models consider all components to be deterministic software. We therefore extend this modeling approach to the case, where some components represent learning enabled components. Here, the reliability of a machine learning component is interpreted as the accuracy of its decisions, which is a common measure of classification algorithms. Moreover, we allow these machine learning components to be fault-tolerant in the sense that multiple diverse classifier algorithms are trained to guide decisions and the majority decision taken. We demonstrate the utility of the approach to assess the impact of machine learning on software reliability as well as illustrate the concept of reliability growth in machine learning. Finally, we validate past analytical results for a fault tolerant system composed of correlated components with real machine learning algorithms and data, demonstrating the analytical expression’s ability to accurately estimate the reliability of the fault tolerant machine learning component and subsequently the architecture-based software within which it resides. 
    more » « less
  5. In coastal river systems, floods, often during major storms or king tides, severely threaten lives and property. However, hydraulic structures such as dams, gates, pumps, and reservoirs exist in these river systems, and these floods can be mitigated or even prevented by strategically releasing water before extreme weather events. A standard approach used by local water management agencies is the “rule-based” method, which specifies predetermined water prereleases based on historical human experience, but which tends to result in excessive or inadequate water release. Iterative optimization methods that rely on detailed physics-based models for prediction are an alternative approach. Whereas, such methods tend to be computationally intensive, requiring hours or even days to solve the problem optimally. In this paper, we propose a Forecast Informed Deep Learning Architecture, FIDLAR, to achieve rapid and near-optimal flood management with precise water prereleases. FIDLAR seamlessly integrates two neural network modules: one called the Flood Manager, which is responsible for generating water pre-release schedules, and another called the Flood Evaluator, which evaluates those generated schedules. The Evaluator module is pre-trained separately, and its gradient-based feedback is utilized to train the Manager model, ensuring near-optimal water pre-releases. We have conducted experiments with a flood-prone coastal area in South Florida. Results show that FIDLAR is several orders of magnitude faster than currently used physics-based approaches while outperforming baseline methods with improved water pre-release schedules. 
    more » « less