Title: Connecting Software Reliability Growth Models to Software Defect Tracking
Traditional software reliability growth models only consider defect discovery data, yet the practical concern of software engineers is the removal of these defects. Most attempts to model the relationship between defect discovery and resolution have been restricted to differential equation-based models associated with these two activities. However, defect tracking databases offer a practical source of information on the defect lifecycle suitable for more complete reliability and performance models. This paper explicitly connects software reliability growth models to software defect tracking. Data from a NASA project has been employed to develop differential equation-based models of defect discovery and resolution as well as distributional and Markovian models of defect resolution. The states of the Markov model represent thirteen unique stages of the NASA software defect lifecycle. Both state transition probabilities and transition time distributions are computed from the defect database. Illustrations compare the predictive and computational performance of alternative approaches. The results suggest that the simple distributional approach achieves the best tradeoff between these two performance measures, but that enhanced data collection practices could improve the utility of the more advanced approaches and the inferences they enable.
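As a rough illustration of how state transition probabilities and transition times might be computed from a defect tracking database, the following minimal sketch counts observed transitions in per-defect state histories and normalizes them. It is not the paper's implementation: the state names, the `histories` records, and the function `summarize_transitions` are hypothetical, the thirteen NASA lifecycle stages are not reproduced, and a simple mean duration stands in for fitting a transition time distribution.

```python
from collections import Counter, defaultdict


def summarize_transitions(histories):
    """Estimate first-order Markov transition probabilities and mean
    transition times from defect histories.

    Each history is a list of (state, entry_time) pairs in time order.
    """
    counts = defaultdict(Counter)   # counts[state][next_state] -> occurrences
    durations = defaultdict(list)   # durations[(state, next_state)] -> time spent in state

    for history in histories:
        for (state, t_in), (next_state, t_out) in zip(history, history[1:]):
            counts[state][next_state] += 1
            durations[(state, next_state)].append(t_out - t_in)

    # Normalize outgoing counts so each row sums to one.
    probs = {
        state: {nxt: n / sum(outgoing.values()) for nxt, n in outgoing.items()}
        for state, outgoing in counts.items()
    }
    # Mean time per transition; a real analysis would fit a distribution instead.
    mean_time = {pair: sum(ts) / len(ts) for pair, ts in durations.items()}
    return probs, mean_time


# Hypothetical defect histories (state, day the state was entered).
histories = [
    [("open", 0), ("assigned", 2), ("fixed", 7), ("closed", 8)],
    [("open", 0), ("assigned", 1), ("open", 4), ("assigned", 5),
     ("fixed", 9), ("closed", 12)],
    [("open", 0), ("rejected", 3)],
]

probs, mean_time = summarize_transitions(histories)
for state, outgoing in probs.items():
    print(state, outgoing)
```

Terminal states such as "closed" have no outgoing row here because no transition out of them is observed; with sparse records, some rows may rest on very few observations, which is one way data collection practices limit such models.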
Award ID(s):
1749635
NSF-PAR ID:
10221046
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE)
Page Range / eLocation ID:
138 to 147
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent research applies soft computing techniques to fit software reliability growth models. However, runtime performance and the distribution of the distance from an optimal solution over multiple runs must be explicitly considered to justify the practical utility of these approaches, promote comparison, and support reproducible research. This paper presents a meta-optimization framework to design stable and efficient multi-phase algorithms for fitting software reliability growth models. The approach combines initial parameter estimation techniques from statistical algorithms, the global search properties of soft computing, and the rapid convergence of numerical methods. Designs that exhibit the best balance between runtime performance and accuracy are identified. The approach is illustrated through nonhomogeneous Poisson process and covariate software reliability growth models, including a cross-validation step on data sets not used to identify designs. The results indicate the nonhomogeneous Poisson process model considered is too simple to benefit from soft computing because it incurs additional runtime with no increase in accuracy attained. However, a multi-phase design for the covariate software reliability growth model consisting of the bat algorithm followed by a numerical method achieves better performance and converges consistently, compared to a numerical method only. The proposed approach supports higher dimensional covariate software reliability growth model fitting suitable for implementation in a tool.
  2. Researchers have proposed several software reliability growth models, many of which possess complex parametric forms. In practice, software reliability growth models should exhibit a balance between predictive accuracy and other statistical measures of goodness of fit, yet past studies have not always performed such balanced assessment. This paper proposes a framework for software reliability growth models possessing a bathtub-shaped fault detection rate and derives stable and efficient expectation conditional maximization algorithms to enable the fitting of these models. The stages of the bathtub are interpreted in the context of the software testing process. The illustrations compare multiple bathtub-shaped and reduced model forms, including classical models with respect to predictive and information theoretic measures. The results indicate that software reliability growth models possessing a bathtub-shaped fault detection rate outperformed classical models on both types of measures. The proposed framework and models may therefore be a practical compromise between model complexity and predictive accuracy. 
  3. Photogrammetric data collection and analysis techniques are increasingly being used for geotechnical characterization of rock masses, and rock slopes in particular. There is a growing selection of software packages that can create georeferenced digital 3D models from a photoset and control points. Although each software package is able to create the desired point clouds, different techniques are used to produce them. For a geotechnical investigation, it is important to understand the accuracy of the software being used in order to have confidence in the reliability of the digital 3D models that are created. In a study similar to one conducted in conjunction with the GoldenRocks ARMA conference in 2006 (and described in Tonon and Kottenstette, 2006), a rock outcrop was selected to be the location for a digital photogrammetry model comparison. Two sets of control points were surveyed on the rock outcrop; one set was provided for the creation of each model, and one set was used to evaluate the accuracy of the model by measuring the difference in the location of the point in the model and in the survey data. An unmanned aerial vehicle (UAV) was used to collect video footage of the site. A set of still frames containing overlapping images of the rock outcrop was extracted from the video. The set of image files was used to create models with the following photogrammetry software packages: Bentley ContextCapture, Agisoft PhotoScan, and Pix4Dmapper. The accuracy of each of the software packages was compared by quantifying the error in the control points and check points between the model and the field survey. As this comparison is intended to provide guidance for selecting software tools to aid in rock mass characterization, other features were evaluated as well, including user-friendliness. Understanding the accuracy of digital photogrammetry software is critical for justifying the use of such models in a geotechnical investigation. The advantages of these models are numerous but of little value if the data provided by the models do not adequately represent the field conditions. Bentley ContextCapture was found to have the least error in the control points and Pix4Dmapper was found to have the least error in the check points. The Bentley ContextCapture model also had the highest resolution, closely followed by the Pix4Dmapper model. Based on these qualities and several others including the general usability, Bentley ContextCapture creates the most effective models for potential geotechnical investigations.
  4. The performance of computational methods and software to identify differentially expressed features in single‐cell RNA‐sequencing (scRNA‐seq) has been shown to be influenced by several factors, including the choice of the normalization method used and the choice of the experimental platform (or library preparation protocol) to profile gene expression in individual cells. Currently, it is up to the practitioner to choose the most appropriate differential expression (DE) method out of over 100 DE tools available to date, each relying on their own assumptions to model scRNA‐seq expression features. To model the technological variability in cross‐platform scRNA‐seq data, here we propose to use Tweedie generalized linear models that can flexibly capture a large dynamic range of observed scRNA‐seq expression profiles across experimental platforms induced by platform‐ and gene‐specific statistical properties such as heavy tails, sparsity, and gene expression distributions. We also propose a zero‐inflated Tweedie model that allows zero probability mass to exceed a traditional Tweedie distribution to model zero‐inflated scRNA‐seq data with excessive zero counts. Using both synthetic and published plate‐ and droplet‐based scRNA‐seq datasets, we perform a systematic benchmark evaluation of more than 10 representative DE methods and demonstrate that our method (Tweedieverse) outperforms the state‐of‐the‐art DE approaches across experimental platforms in terms of statistical power and false discovery rate control. Our open‐source software (R/Bioconductor package) is available at https://github.com/himelmallick/Tweedieverse.
  5. We introduce the Weak-form Estimation of Nonlinear Dynamics (WENDy) method for estimating model parameters for non-linear systems of ODEs. Without relying on any numerical differential equation solvers, WENDy computes accurate estimates and is robust to large (biologically relevant) levels of measurement noise. For low dimensional systems with modest amounts of data, WENDy is competitive with conventional forward solver-based nonlinear least squares methods in terms of speed and accuracy. For both higher dimensional systems and stiff systems, WENDy is typically both faster (often by orders of magnitude) and more accurate than forward solver-based approaches. The core mathematical idea involves an efficient conversion of the strong form representation of a model to its weak form, and then solving a regression problem to perform parameter inference. The core statistical idea rests on the Errors-In-Variables framework, which necessitates the use of the iteratively reweighted least squares algorithm. Further improvements are obtained by using orthonormal test functions, created from a set of $C^{\infty}$ bump functions of varying support sizes. We demonstrate the high robustness and computational efficiency by applying WENDy to estimate parameters in some common models from population biology, neuroscience, and biochemistry, including logistic growth, Lotka-Volterra, FitzHugh-Nagumo, Hindmarsh-Rose, and a Protein Transduction Benchmark model. Software and code for reproducing the examples are available at https://github.com/MathBioCU/WENDy.