skip to main content


Title: Doppelganger Test Generation for Revealing Bugs in Autonomous Driving Software
Vehicles controlled by autonomous driving software (ADS) are expected to bring many social and economic benefits, but at the current stage not being broadly used due to concerns with regard to their safety. Virtual tests, where autonomous vehicles are tested in software simulation, are common practices because they are more efficient and safer compared to field operational tests. Specifically, search-based approaches are used to find particularly critical situations. These approaches provide an opportunity to automatically generate tests; however, systematically producing bug-revealing tests for ADS remains a major challenge. To address this challenge, we introduce DoppelTest, a test generation approach for ADSes that utilizes a genetic algorithm to discover bug-revealing violations by generating scenarios with multiple autonomous vehicles that account for traffic control (e.g., traffic signals and stop signs). Our extensive evaluation shows that DoppelTest can efficiently discover 123 bug-revealing violations for a production-grade ADS (Baidu Apollo) which we then classify into 8 unique bug categories.  more » « less
Award ID(s):
2145493 1929771 1932464
NSF-PAR ID:
10427125
Author(s) / Creator(s):
; ; ; ; ; ; ;
Date Published:
Journal Name:
International Conference on Software Engineering (ICSE)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Self-driving cars and trucks, autonomous vehicles (AVs), should not be accepted by regulatory bodies and the public until they have much higher confidence in their safety and reliability --- which can most practically and convincingly be achieved by testing. But existing testing methods are inadequate for checking the end-to-end behaviors of AV controllers against complex, real-world corner cases involving interactions with multiple independent agents such as pedestrians and human-driven vehicles. While test-driving AVs on streets and highways fails to capture many rare events, existing simulation-based testing methods mainly focus on simple scenarios and do not scale well for complex driving situations that require sophisticated awareness of the surroundings. To address these limitations, we propose a new fuzz testing technique, called AutoFuzz, which can leverage widely-used AV simulators' API grammars to generate semantically and temporally valid complex driving scenarios (sequences of scenes). To efficiently search for traffic violations-inducing scenarios in a large search space, we propose a constrained neural network (NN) evolutionary search method to optimize AutoFuzz. Evaluation of our prototype on one state-of-the-art learning-based controller, two rule-based controllers, and one industrial-grade controller in five scenarios shows that AutoFuzz efficiently finds hundreds of traffic violations in high-fidelity simulation environments. For each scenario, AutoFuzz can find on average 10-39% more unique traffic violations than the best-performing baseline method. Further, fine-tuning the learning-based controller with the traffic violations found by AutoFuzz successfully reduced the traffic violations found in the new version of the AV controller software. 
    more » « less
  2. Self-driving cars, or Autonomous Vehicles (AVs), are increasingly becoming an integral part of our daily life. About 50 corporations are actively working on AVs, including large companies such as Google, Ford, and Intel. Some AVs are already operating on public roads, with at least one unfortunate fatality recently on record. As a result, understanding bugs in AVs is critical for ensuring their security, safety, robustness, and correctness. While previous studies have focused on a variety of domains (e.g., numerical software; machine learning; and error-handling, concurrency, and performance bugs) to investigate bug characteristics, AVs have not been studied in a similar manner. Recently, two software systems for AVs, Baidu Apollo and Autoware, have emerged as frontrunners in the opensource community and have been used by large companies and governments (e.g., Lincoln, Volvo, Ford, Intel, Hitachi, LG, and the US Department of Transportation). From these two leading AV software systems, this paper describes our investigation of 16,851 commits and 499 AV bugs and introduces our classification of those bugs into 13 root causes, 20 bug symptoms, and 18 categories of software components those bugs often affect. We identify 16 major findings from our study and draw broader lessons from them to guide the research community towards future directions in software bug detection, localization, and repair. 
    more » « less
  3. In spite of decades of research in bug detection tools, there is a surprising dearth of ground-truth corpora that can be used to evaluate the efficacy of such tools. Recently, systems such as LAVA and EvilCoder have been proposed to automatically inject bugs into software to quickly generate large bug corpora, but the bugs created so far differ from naturally occurring bugs in a number of ways. In this work, we propose a new automated bug injection system, Apocalypse, that uses formal techniques—symbolic execution, constraint-based program synthesis and model counting—to automatically inject fair (can potentially be discovered by current bug-detection tools), deep (requiring a long sequence of dependencies to be satisfied to fire), uncorrelated (each bug behaving independent of others), reproducible (a trigger input being available) and rare (can be triggered by only a few program inputs) bugs in large software code bases. In our evaluation, we inject bugs into thirty Coreutils programs as well as the TCAS test suite. We find that bugs synthesized by Apocalypse are highly realistic under a variety of metrics, that they do not favor a particular bug-finding strategy (unlike bugs produced by LAVA), and that they are more difficult to find than manually injected bugs, requiring up around 240× more tests to discover with a state-of-the-art symbolic execution tool. 
    more » « less
  4. 1
    Autonomous Driving Systems (ADS) are developing rapidly. As vehicle technology advances to SAE level 3 and above (L4, L5), there is a need to maximize and verify safety and operational benefits. As a result, maintenance of these ADS systems is essential which includes scheduled, condition-based, risk-based, and predictive maintenance. A lot of techniques and methods have been developed and are being used in the maintenance of conventional vehicles as well as other industries, but ADS is new technology and several of these maintenance types are still being developed as well as adapted for ADS. In this work, we are presenting a systematic literature review of the “State of the Art” knowledge for the maintenance of a fleet of ADS which includes fault diagnostics, prognostics, predictive maintenance, and preventive maintenance. We are providing statistical inference of different methodologies, comparison between methodologies, and providing our inference of different techniques that are used in other industries for maintenance that can be utilized for ADS. This paper presents a summary, main result, challenges, and opportunities of these approaches and supports new work for the maintenance of ADS.

     
    more » « less
  5. null (Ed.)
    Abstract In this paper, we present a large-scale measurement study of the smart TV advertising and tracking ecosystem. First, we illuminate the network behavior of smart TVs as used in the wild by analyzing network traffic collected from residential gateways. We find that smart TVs connect to well-known and platform-specific advertising and tracking services (ATSes). Second, we design and implement software tools that systematically explore and collect traffic from the top-1000 apps on two popular smart TV platforms, Roku and Amazon Fire TV. We discover that a subset of apps communicate with a large number of ATSes, and that some ATS organizations only appear on certain platforms, showing a possible segmentation of the smart TV ATS ecosystem across platforms. Third, we evaluate the (in)effectiveness of DNS-based blocklists in preventing smart TVs from accessing ATSes. We highlight that even smart TV-specific blocklists suffer from missed ads and incur functionality breakage. Finally, we examine our Roku and Fire TV datasets for exposure of personally identifiable information (PII) and find that hundreds of apps exfiltrate PII to third parties and platform domains. We also find evidence that some apps send the advertising ID alongside static PII values, effectively eliminating the user’s ability to opt out of ad personalization. 
    more » « less