skip to main content


Title: Scenic: a language for scenario specification and data generation
Abstract

We propose a new probabilistic programming language for the design and analysis of cyber-physical systems, especially those based on machine learning. We consider several problems arising in the design process, including training a system to be robust to rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs, then sampling these to generate specialized training and test data. More generally, such languages can be used to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems such as autonomous cars and robots, whose environment at any point in time is ascene, a configuration of physical objects and agents. We design a domain-specific language,Scenic, for describingscenariosthat are distributions over scenes and the behaviors of their agents over time.Sceniccombines concise, readable syntax for spatiotemporal relationships with the ability to declaratively impose hard and soft constraints over the scenario. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided byScenic’s domain-specific syntax. Finally, we applyScenicin multiple case studies for training, testing, and debugging neural networks for perception both as standalone components and within the context of a full cyber-physical system.

 
more » « less
Award ID(s):
1837132
NSF-PAR ID:
10362308
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
Springer Science + Business Media
Date Published:
Journal Name:
Machine Learning
Volume:
112
Issue:
10
ISSN:
0885-6125
Page Range / eLocation ID:
p. 3805-3849
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose a new probabilistic programming language for the design and analysis of perception systems, especially those based on machine learning. Specifically, we consider the problems of training a perception system to handle rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs and sampling these to generate specialized training and test sets. More generally, such languages can be used for cyber-physical systems and robotics to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems like autonomous cars and robots, whose environment is a scene, a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes. As a probabilistic programming language, Scenic allows assigning distributions to features of the scene, as well as declaratively imposing hard and soft constraints over the scene. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic's domain-specific syntax. Finally, we apply Scenic in a case study on a convolutional neural network designed to detect cars in road images, improving its performance beyond that achieved by state-of-the-art synthetic data generation methods. 
    more » « less
  2. Today’s STEM classrooms have expanded the domain of computer science education from a basic two-toned terminal screen to now include helpful Integrated Development Environments(IDE) (BlueJ, Eclipse), block-based programming (MIT Scratch, Greenfoot), and even physical computing with embedded systems (Arduino, LEGO Mindstorm). But no matter which environment a student starts programming in, all students will eventually need help in finding and fixing bugs in their code. While the helpful IDE’s have debugger tools built in (breakpoints for pausing your program, ways to view/modify variable values, and "stepping" through code execution), in many of the other programming environments, students are limited to using print statements to try and "see" what is happening inside their program. Most students who learn to write code for Arduino microcontrollers will start within the Arduino IDE, but the official Arduino IDE does not currently provide any debugging tools. Instead, a student would have to move on to a professional IDE such as Atmel Studio or acquire a hardware debugger in order to add breakpoints or view their program’s variables. But each of these options has a steep learning curve, additional costs, and can require complex configurations. Based on research of student debugging practices[3, 7] and our own classroom observations, we have developed an Arduino software library, called Arduino Debugger, which provides some of these debugging tools (ex. breakpoints) while staying within the official Arduino IDE. This work continues a previous library, (redacted), which focused on features specific to e-textiles development boards. The Arduino Debugger library has been modified to support not only e-textile boards (Lilypad, Adafruit Circuit Playground) but most AVR and ARM based Arduino boards.We are also in the process of testing a set of Debugging Code Templates to see how they might increase student adoption of debugging tools. 
    more » « less
  3. Grounding natural language instructions on the web to perform previously unseen tasks enables accessibility and automation. We introduce a task and dataset to train AI agents from open-domain, step-by-step instructions originally written for people. We build RUSS (Rapid Universal Support Service) to tackle this problem. RUSS consists of two models: First, a BERT-LSTM with pointers parses instructions to ThingTalk, a domain-specific language we design for grounding natural language on the web. Then, a grounding model retrieves the unique IDs of any webpage elements requested in ThingTalk. RUSS may interact with the user through a dialogue (e.g. ask for an address) or execute a web operation (e.g. click a button) inside the web runtime. To augment training, we synthesize natural language instructions mapped to ThingTalk. Our dataset consists of 80 different customer service problems from help websites, with a total of 741 step-by-step instructions and their corresponding actions. RUSS achieves 76.7% end-to-end accuracy predicting agent actions from single instructions. It outperforms state-of-the-art models that directly map instructions to actions without ThingTalk. Our user study shows that RUSS is preferred by actual users over web navigation. 
    more » « less
  4. As specialized hardware accelerators like FPGAs become a prominent part of the current computing landscape, software applications are increasingly constructed to leverage heterogeneous architectures. Such a trend is already happening in the domain of machine learning and Internet-of-Things (IoT) systems built on edge devices. Yet, debugging and testing methods for heterogeneous applications are currently lacking. These applications may look similar to regular C/C++ code but include hardware synthesis details in terms of preprocessor directives. Therefore, their behavior under heterogeneous architectures may diverge significantly from CPU due to hardware synthesis details. Further, the compilation and hardware simulation cycle takes an enormous amount of time, prohibiting frequent invocations required for fuzz testing. We propose a novel fuzz testing technique, called HeteroFuzz, designed to specifically target heterogeneous applications and to detect platform-dependent divergence. The key essence of HeteroFuzz is that it uses a three-pronged approach to reduce the long latency of repetitively invoking a hardware simulator on a heterogeneous application. First, in addition to monitoring code coverage as a fuzzing guidance mechanism, we analyze synthesis pragmas in kernel code and monitor accelerator-relevant value spectra. Second, we design dynamic probabilistic mutations to increase the chance of hitting divergent behavior under different platforms. Third, we memorize the boundaries of seen kernel inputs and skip HLS simulator invocation if it can expose only redundant divergent behavior. We evaluate HeteroFuzz on seven real-world heterogeneous applications with FPGA kernels. HeteroFuzz is 754X faster in exposing the same set of distinct divergence symptoms than naive fuzzing. Probabilistic mutations contribute to 17.5X speed up than the one without. Selective invocation of HLS simulation contributes to 8.8X speed up than the one without. 
    more » « less
  5. Abstract  
    more » « less