NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

SugarC: scalable desugaring of real-world preprocessor usage into pure C

https://doi.org/10.1145/3510003.3512763

Patterson, Zachary; Zhang, Zenong; Pappas, Brent; Wei, Shiyi; Gazzillo, Paul (May 2022, ICSE '22: Proceedings of the 44th International Conference on Software Engineering)

Variability-aware analysis is critical for ensuring the quality of configurable C software. An important step toward the development of variability-aware analysis at scale is to transform real-world C software that uses both C and preprocessor into pure C code, by replacing the preprocessor's compile-time variability with C's runtime-variability. In this work, we design and implement a desugaring tool, SugarC, that transforms away real-world preprocessor usage. SugarC augments C's formal grammar specification with translation rules, performs simultaneous type checking during desugaring, and introduces numerous optimizations to address challenges that appear in real-world preprocessor usage. The experiments on DesugarBench, a benchmark consisting of 108 manually-created programs, show that SugarC supports many more language features than two existing desugaring tools. When applied on three real-world configurable C software, SugarC desugared 774 out of 813 files in the three programs, taking at most ten minutes in the worst case and less than two minutes for 95% of the C files.
more » « less
Full Text Available
Static data-flow analysis for software product lines in C: Revoking the preprocessor’s special role

https://doi.org/10.1007/s10515-022-00333-1

Schubert, Philipp Dominik; Gazzillo, Paul; Patterson, Zach; Braha, Julian; Schiebel, Fabian; Hermann, Ben; Wei, Shiyi; Bodden, Eric (May 2022, Automated Software Engineering)

Abstract Many critical codebases are written in C, and most of them use preprocessor directives to encode variability, effectively encoding software product lines. These preprocessor directives, however, challenge any static code analysis. SPLlift, a previously presented approach for analyzing software product lines, is limited to Java programs that use a rather simple feature encoding and to analysis problems with a finite and ideally small domain. Other approaches that allow the analysis of real-world C software product lines use special-purpose analyses, preventing the reuse of existing analysis infrastructures and ignoring the progress made by the static analysis community. This work presents VarAlyzer , a novel static analysis approach for software product lines. VarAlyzer first transforms preprocessor constructs to plain C while preserving their variability and semantics. It then solves any given distributive analysis problem on transformed product lines in a variability-aware manner. VarAlyzer ’s analysis results are annotated with feature constraints that encode in which configurations each result holds. Our experiments with 95 compilation units of OpenSSL show that applying VarAlyzer enables one to conduct inter-procedural, flow-, field- and context-sensitive data-flow analyses on entire product lines for the first time, outperforming the product-based approach for highly-configurable systems.
more » « less
Full Text Available
Finding broken Linux configuration specifications by statically analyzing the Kconfig language

https://doi.org/10.1145/3468264.3468578

Oh, Jeho; Yıldıran, Necip Fazıl; Braha, Julian; Gazzillo, Paul (August 2021, ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering)

Highly-configurable software underpins much of our computing infrastructure. It enables extensive reuse, but opens the door to broken configuration specifications. The configuration specification language, Kconfig, is designed to prevent invalid configurations of the Linux kernel from being built. However, the astronomical size of the configuration space for Linux makes finding specification bugs difficult by hand or with random testing. In this paper, we introduce a software model checking framework for building Kconfig static analysis tools. We develop a formal semantics of the Kconfig language and implement the semantics in a symbolic evaluator called kclause that models Kconfig behavior as logical formulas. We then design and implement a bug finder, called kismet, that takes kclause models and leverages automated theorem proving to find unmet dependency bugs. kismet is evaluated for its precision, performance, and impact on kernel development for a recent version of Linux, which has over 140,000 lines of Kconfig across 28 architecture-specific specifications. Our evaluation finds 781 bugs (151 when considering sharing among Kconfig specifications) with 100% precision, spending between 37 and 90 minutes for each Kconfig specification, although it misses some bugs due to underapproximation. Compared to random testing, kismet finds substantially more true positive bugs in a fraction of the time.
more » « less
Full Text Available
t-wise Coverage by Uniform Sampling

https://doi.org/10.1145/3336294.3342359

Oh, Jeho; Gazzillo, Paul; Batory, Don (September 2019, SPLC '19: Proceedings of the 23rd International Systems and Software Product Line Conference)

Efficiently testing large configuration spaces of Software Product Lines (SPLs) needs a sampling algorithm that is both scalable and provides good t-wise coverage. The 2019 SPLC Sampling Challenge provides large real-world feature models and asks for a t-wise sampling algorithm that can work for those models. We evaluated t-wise coverage by uniform sampling (US) the configurations of one of the provided feature models. US means that every (legal) configuration is equally likely to be selected. US yields statistically representative samples of a configuration space and can be used as a baseline to compare other sampling algorithms. We used existing algorithm called Smarch to uniformly sample SPL configurations. While uniform sampling alone was not enough to produce 100% 1-wise and 2-wise coverage, we used standard probabilistic analysis to explain our experimental results and to conjecture how uniform sampling may enhance the scalability of existing t-wise sampling algorithms.
more » « less
Full Text Available
An Empirical Study of Real-World Variability Bugs Detected by Variability-Oblivious Tools

https://doi.org/10.1145/3338906.3338967

Mordahl, Austin; Oh, Jeho; Koc, Ugur; Wei, Shiyi; Gazzillo, Paul (August 2019, ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering)

Many critical software systems developed in C utilize compile-time configurability. The many possible configurations of this software make bug detection through static analysis difficult. While variability-aware static analyses have been developed, there remains a gap between those and state-of-the-art static bug detection tools. In order to collect data on how such tools may perform and to develop real-world benchmarks, we present a way to leverage configuration sampling, off-the-shelf “variability-oblivious” bug detectors, and automatic feature identification techniques to simulate a variability-aware analysis. We instantiate our approach using four popular static analysis tools on three highly configurable, real-world C projects, obtaining 36,061 warnings, 80% of which are variability warnings. We analyze the warnings we collect from these experiments, finding that most results are variability warnings of a variety of kinds such as NULL dereference. We then manually investigate these warnings to produce a benchmark of 77 confirmed true bugs (52 of which are variability bugs) useful for future development of variability-aware analyses.
more » « less
Full Text Available
Conditional compilation is dead, long live conditional compilation!

https://doi.org/10.1109/ICSE-NIER.2019.00035

Gazzillo, Paul; Wei, Shiyi (May 2019, Proceedings - International Conference on Software Engineering)

Highly-configurable systems written in C form our most critical computing infrastructure. The preprocessor is integral to C, because conditional compilation enables such systems to produce efficient object code. However, the preprocessor makes code harder to reason about for both humans and tools. Previous approaches to this challenge developed new program analyses for unpreprocessed source code or developed new languages and constructs to replace the preprocessor. But having specialpurpose analyses means maintaining a new toolchain, while new languages face adoption challenges and do not help with existing software. We propose the best of worlds: eliminate the preprocessor but preserve its benefits. Our design replaces preprocessor usage with C itself, augmented with syntax-preserving, backwards-compatible dependent types. We discuss automated conditional compilation to replicate preprocessor performance. Our approach opens new directions for research into new compiler optimizations, dependent types for configurable software, and automated translation away from preprocessor use.
more » « less
Full Text Available
Toward detection and characterization of variability bugs in configurable C software: an empirical study

https://doi.org/10.1109/ICSE-Companion.2019.00064

Mordahl, Austin (May 2019, Proceedings - International Conference on Software Engineering)

Variability in C software is a useful tool, but critical bugs that only exist in certain configurations are easily missed by conventional debugging techniques. Even with a small number of features, the configuration space of configurable software is too large to analyze exhaustively. Variability-aware static analysis for bug detection is being developed, but remains at too early a stage to be fully usable in real-world C programs. In this work, we present a methodology of finding variability bugs by combining variability-oblivious bug detectors, static analysis of build processes, and dynamic feature interaction inference. We further present an empirical study in which we test our methodology on two highly configurable C programs. We found our methodology to be effective, finding 88 true bugs between the two programs, of which 64 were variability bugs.
more » « less
Full Text Available

Search for: All records