Regression testing---rerunning tests on each code version to detect newly-broken functionality---is important and widely practiced. But, regression testing is costly due to the large number of tests and the high frequency of code changes. Regression test selection (RTS) optimizes regression testing by only rerunning a subset of tests that can be affected by changes. Researchers showed that RTS based on program analysis can save substantial testing time for (medium-sized) open-source projects. Practitioners also showed that RTS based on machine learning (ML) works well on very large code repositories, e.g., in Facebook's monorepository. We combine analysis-based RTS and ML-based RTS by using the latter to choose a subset of tests selected by the former. We first train several novel ML models to learn the impact of code changes on test outcomes using a training dataset that we obtain via mutation analysis. Then, we evaluate the benefits of combining ML models with analysis-based RTS on 10 projects, compared with using each technique alone. Combining ML-based RTS with two analysis-based RTS techniques-Ekstazi and STARTS-selects 25.34% and 21.44% fewer tests, respectively.
more »
« less
Comparing and Combining Analysis-Based and Learning-Based Regression Test Selection
Regression testing - rerunning tests at each code version to detect newly-broken functionality - is important and widely practiced. But regression testing is costly due to the large number of tests and high frequency of code changes. Regression test selection (RTS) optimizes regression testing by rerunning only a subset of tests that can be affected by code changes. Researchers showed that RTS based on dynamic and static program analysis can save substantial testing time for (medium-sized) open-source projects. Simultaneously, practitioners showed that RTS based on machine learning (ML) is lightweight and works well on very large software repositories, e.g., in Facebook’s monorepository. We combine analysis-based RTS and ML-based RTS by using ML-based RTS to choose a subset of tests selected by analysis-based RTS. To do so, we first design several novel ML-based RTS techniques that leverage mutation analysis to obtain a training set for learning the impact of code changes on test outcomes. Then, we empirically evaluate, using 10 projects, the benefits of combining various ML models with analysis-based RTS. We also compare combining the techniques with using each technique individually. Combining ML-based RTS with two analysis-based RTS techniques - Ekstazi and STARTS - selects 25.34% and 21.44% fewer tests.
more »
« less
- Award ID(s):
- 2045596
- PAR ID:
- 10326760
- Date Published:
- Journal Name:
- ICSE Workshop on Automation of Software Test
- ISSN:
- 2640-6993
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Regression test selection (RTS) speeds up regression testing by only re-running tests that might be affected by code changes. Ideal RTS safely selects all affected tests and precisely selects only affected tests. But, aiming for this ideal is often slower than re-running all tests. So, recent RTS techniques use program analysis to trade precision for speed, i.e., lower regression testing time, or even use machine learning to trade safety for speed. We seek to make recent analysis-based RTS techniques more precise, to further speed up regression testing. Independent studies suggest that these techniques reached a “performance wall” in the speed-ups that they provide. We manually inspect code changes to discover those that do not require re-running tests that are only affected by such changes. We categorize 29 kinds of changes that we found from five projects into 13 findings, 11 of which are semantics-modifying. We enhance two RTS techniques—Ekstazi and STARTS—to reason about our findings. Using 1,150 versions of 23 projects, we evaluate the impact on safety and precision of leveraging such changes. We also evaluate if our findings from a few projects can speed up regression testing in other projects. The results show that our enhancements are effective and they can generalize. On average, they result in selecting 41.7% and 31.8% fewer tests, and take 33.7% and 28.7% less time than Ekstazi and STARTS, respectively, with no loss in safety.more » « less
-
Regression test selection (RTS) speeds up regression testing by only re-running tests that might be affected by code changes. Ideal RTS safely selects all affected tests and precisely selects only affected tests. But, aiming for this ideal is often slower than re-running all tests. So, recent RTS techniques use program analysis to trade precision for speed, i.e., lower regression testing time, or even use machine learning to trade safety for speed. We seek to make recent analysis-based RTS techniques more precise, to further speed up regression testing. Independent studies suggest that these techniques reached a “performance wall” in the speed-ups that they provide. We manually inspect code changes to discover those that do not require re-running tests that are only affected by such changes. We categorize 29 kinds of changes that we find from five projects into 13 findings, 11 of which are semantics-modifying. We enhance two RTS techniques—Ekstazi and STARTS—to reason about our findings. Using 1,150 versions of 23 projects, we evaluate the impact on safety and precision of leveraging such changes. We also evaluate if our findings from a few projects can speed up regression testing in other projects. The results show that our enhancements are effective and they can generalize. On average, they result in selecting 41.7% and 31.8% fewer tests, and take 33.7% and 28.7% less time than Ekstazi and STARTS, respectively, with no loss in safety.more » « less
-
Regression testing - running available tests after each project change - is widely practiced in industry. Despite its widespread use and importance, regression testing is a costly activity. Regression test selection (RTS) optimizes regression testing by selecting only tests affected by project changes. RTS has been extensively studied and several tools have been deployed in large projects. However, work on RTS over the last decade has mostly focused on languages with abstract computing machines(e.g., JVM). Meanwhile development practices (e.g., frequency of commits, testing frameworks, compilers) in C++ projects have dramatically changed and the way we should design and implement RTS tools and the benefits of those tools is unknown. We present a design and implementation of an RTS technique, dubbed RTS++, that targets projects written in C++, which compile to LLVM IR and use the Google Test testing framework. RTS++ uses static analysis of a function call graph to select tests. RTS++ integrates with many existing build systems, including AutoMake, CMake, and Make. We evaluated RTS++ on 11 large open-source projects, totaling 3,811,916 lines of code. To the best of our knowledge, this is the largest evaluation of an RTS technique for C++. We measured the benefits of RTS++compared to running all available tests (i.e., retest-all). Our results show that RTS++ reduces the number of executed tests and end-to-end testing time by 88% and 61% on average.more » « less
-
Continuous integration (CI) has become a popular method for automating code changes, testing, and software project delivery. However, sufficient testing prior to code submission is crucial to prevent build breaks. Additionally, testing must provide developers with quick feedback on code changes, which requires fast testing times. While regression test selection (RTS) has been studied to improve the cost-effectiveness of regression testing for lower-level tests (i.e., unit tests), it has not been applied to the testing of user interfaces (UI) in application domains such as mobile apps. UI testing at the UI level requires different techniques such as impact analysis and automated test execution. In this paper, we examine the use of RTS in CI settings for UI testing across various open-source mobile apps. Our analysis focuses on using Frequency Analysis to understand the need for RTS, Cost Analysis to evaluate the cost of impact analysis and test case selection algorithms, and Test Reuse Analysis to determine the reusability of UI test sequences for automation. The insights from this study will guide practitioners and researchers in developing advanced RTS techniques that can be adapted to CI environments for mobile apps.more » « less