Context The extent of post-release use of software affects the number of faults, thus biasing quality metrics and adversely affecting associated decisions. The proprietary nature of usage data limited deeper exploration of this subject in the past. Objective To determine how software faults and software use are related and how, based on that, an accurate quality measure can be designed. Method Via Google Analytics we measure new users, usage intensity, usage frequency, exceptions, and release date and duration for complex proprietary mobile applications for Android and iOS. We utilize Bayesian Network and Random Forest models to explain the interrelationships and to derive the usage independent release quality measure. To increase external validity, we also investigate the interrelationship among various code complexity measures, usage (downloads), and number of issues for 520 NPM packages. We derived a usage-independent quality measure from these analyses, and applied it on 4430 popular NPM packages to construct timelines for comparing the perceived quality (number of issues) and our derived measure of quality during the lifetime of these packages. Results We found the number of new users to be the primary factor determining the number of exceptions, and found no direct link between the intensity and frequency of software usage and software faults. Crashes increased with the power of 1.02-1.04 of new user for the Android app and power of 1.6 for the iOS app. Release quality expressed as crashes per user was independent of other usage-related predictors, thus serving as a usage independent measure of software quality. Usage also affected quality in NPM, where downloads were strongly associated with numbers of issues, even after taking the other code complexity measures into consideration. Unlike in mobile case where exceptions per user decrease over time, for 45.8% of the NPM packages the number of issues per download increase. Conclusions We expect our result and our proposed quality measure will help gauge release quality of a software more accurately and inspire further research in this area.
more »
« less
Do code review measures explain the incidence of post-release defects?
Aim In contrast to studies of defects found during code review, we aim to clarify whether code review measures can explain the prevalence of post-release defects. Method We replicate McIntosh et al.’s (Empirical Softw. Engg. 21(5): 2146–2189, 2016) study that uses additive regression to model the relationship between defects and code reviews. To increase external validity, we apply the same methodology on a new software project. We discuss our findings with the first author of the original study, McIntosh. We then investigate how to reduce the impact of correlated predictors in the variable selection process and how to increase understanding of the inter-relationships among the predictors by employing Bayesian Network (BN) models. Context As in the original study, we use the same measures authors obtained for Qt project in the original study. We mine data from version control and issue tracker of Google Chrome and operationalize measures that are close analogs to the large collection of code, process, and code review measures used in the replicated the study. Results Both the data from the original study and the Chrome data showed high instability of the influence of code review measures on defects with the results being highly sensitive to variable selection procedure. Models without code review predictors had as good or better fit than those with review predictors. Replication, however, confirms with the bulk of prior work showing that prior defects, module size, and authorship have the strongest relationship to post-release defects. The application of BN models helped explain the observed instability by demonstrating that the review-related predictors do not affect post-release defects directly and showed indirect effects. For example, changes that have no review discussion tend to be associated with files that have had many prior defects which in turn increase the number of post-release defects. We hope that similar analyses of other software engineering techniques may also yield a more nuanced view of their impact. Our replication package including our data and scripts is publicly available (Replication package 2018).
more »
« less
- PAR ID:
- 10177641
- Date Published:
- Journal Name:
- Empirical software engineering
- Volume:
- 29
- ISSN:
- 1382-3256
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Social capital—the strength of an individual’s social network and community—has been identified as a potential determinant of outcomes ranging from education to health 1–8 . However, efforts to understand what types of social capital matter for these outcomes have been hindered by a lack of social network data. Here, in the first of a pair of papers 9 , we use data on 21 billion friendships from Facebook to study social capital. We measure and analyse three types of social capital by ZIP (postal) code in the United States: (1) connectedness between different types of people, such as those with low versus high socioeconomic status (SES); (2) social cohesion, such as the extent of cliques in friendship networks; and (3) civic engagement, such as rates of volunteering. These measures vary substantially across areas, but are not highly correlated with each other. We demonstrate the importance of distinguishing these forms of social capital by analysing their associations with economic mobility across areas. The share of high-SES friends among individuals with low SES—which we term economic connectedness—is among the strongest predictors of upward income mobility identified to date 10,11 . Other social capital measures are not strongly associated with economic mobility. If children with low-SES parents were to grow up in counties with economic connectedness comparable to that of the average child with high-SES parents, their incomes in adulthood would increase by 20% on average. Differences in economic connectedness can explain well-known relationships between upward income mobility and racial segregation, poverty rates, and inequality 12–14 . To support further research and policy interventions, we publicly release privacy-protected statistics on social capital by ZIP code at https://www.socialcapital.org .more » « less
-
Background: The way post-release usage of a software affects the number of faults experienced by users is scarcely explored due to the proprietary nature of such data. The commonly used quality measure of post-release faults may, therefore, reflect usage instead of the quality of the software development process. Aim: To determine how software faults and software use are related in a post-deployment scenario and, based on that, derive post-deployment quality measure that reflects developers' performance more accurately. Method: We analyze Google Analytics data counting daily new users, visits, time-on-site, visits per user, and release start date and duration for 169 releases of a complex communication application for Android OS. We utilize Linear Regression, Bayesian Network, and Random Forest models to explain the interrelationships and to derive release quality measure that is relatively stable with respect to variations in software usage. Results: We found the number of new users and release start date to be the determining factors for the number of exceptions, and found no direct link between the intensity and frequency of software usage and software faults. Furthermore, the relative increase in the number of crashes was found to be stably associated with a power of 1.3 relative increase in the number of new users. Based on the findings we propose a release quality measure: number of crashes per user for a release of the software, which was seen to be independent of any other usage variables, providing us with a usage independent measure of software quality. Conclusions: We expect our result and our proposed quality measure will help gauge release quality of a software more accurately and inspire further research in this area.more » « less
-
Intrinsic defects and their concentrations in hexagonal boron nitride (h‐BN) play a key role in single‐photon emission. In this study, the optical properties of large‐area multilayer h‐BN‐on‐sapphire grown by metal‐organic chemical vapor deposition are explored. Based on the detailed spectroscopic characterization using both cathodoluminescence (CL) and photoluminescence (PL) measurements, the material is devoid of random single‐point defects instead of a few clustered complex defects. The emission spectra of the measurements confirm a record‐low‐defect concentration of ≈104 cm−2. Post‐annealing, no significant changes are observed in the measured spectra and the defect concentrations remain unaltered. Through CL and PL spectroscopy, an optically active boron vacancy spin defect is identified and a novel complex defect combination arising from carbon impurities is revealed. This complex defect, previously unreported, signifies a unique aspect of the material. In these findings, the understanding of defect‐induced optical properties in h‐BN films is contributed, providing insights for potential applications in quantum information science.more » « less
-
Due to the ever-increasing number of security breaches, practitioners are motivated to produce more secure software. In the United States, the White House Office released a memorandum on Executive Order (EO) 14028 that mandates organizations provide self-attestation of the use of secure software development practices. The OpenSSF Scorecard project allows practitioners to measure the use of software security practices automatically. However, little research has been done to determine whether the use of security practices improves package security, particularly which security practices have the biggest impact on security outcomes. The goal of this study is to assist practitioners and researchers in making informed decisions on which security practices to adopt through the development of models between software security practice scores and security vulnerability counts. To that end, we developed five supervised machine learning models for npm and PyPI packages using the OpenSSF Scorecard security practices scores and aggregate security scores as predictors and the number of externally-reported vulnerabilities as a target variable. Our models found that four security practices (Maintained, Code Review, Branch Protection, and Security Policy) were the most important practices influencing vulnerability count. However, we had low R2 (ranging from 9% to 12%) when we tested the models to predict vulnerability counts. Additionally, we observed that the number of reported vulnerabilities increased rather than reduced as the aggregate security score of the packages increased. Both findings indicate that additional factors may influence the package vulnerability count. Other factors, such as the scarcity of vulnerability data, time to implicate security practices vs. time to detect vulnerabilities, and the need for more adequate scripts to detect security practices, may impede the data-driven studies to indicate that practice can aid in reducing externally-reported vulnerabilities. We suggest that vulnerability count and security score data be refined so that these measures can be used to provide actionable guidance on security practices.more » « less
An official website of the United States government

