NSF PAR Search | NSF Public Access Repository

Scenic: a language for scenario specification and data generation

https://doi.org/10.1007/s10994-021-06120-5

Fremont, Daniel J.; Kim, Edward; Dreossi, Tommaso; Ghosh, Shromona; Yue, Xiangyu; Sangiovanni-Vincentelli, Alberto L.; Seshia, Sanjit A. (February 2022, Machine Learning)

We propose a new probabilistic programming language for the design and analysis of cyber-physical systems, especially those based on machine learning. We consider several problems arising in the design process, including training a system to be robust to rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs, then sampling these to generate specialized training and test data. More generally, such languages can be used to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems such as autonomous cars and robots, whose environment at any point in time is a scene, a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes and the behaviors of their agents over time. Scenic combines concise, readable syntax for spatiotemporal relationships with the ability to declaratively impose hard and soft constraints over the scenario. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic’s domain-specific syntax. Finally, we apply Scenic in multiple case studies for training, testing, and debugging neural networks for perception both as standalone components and within the context of a full cyber-physical system.
Scenic: a language for scenario specification and scene generation

https://doi.org/10.1145/3314221.3314633

Fremont, Daniel J.; Dreossi, Tommaso; Ghosh, Shromona; Yue, Xiangyu; Sangiovanni-Vincentelli, Alberto L.; Seshia, Sanjit A. (June 2019, PLDI 2019 Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation)

We propose a new probabilistic programming language for the design and analysis of perception systems, especially those based on machine learning. Specifically, we consider the problems of training a perception system to handle rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs and sampling these to generate specialized training and test sets. More generally, such languages can be used for cyber-physical systems and robotics to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems like autonomous cars and robots, whose environment is a scene, a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes. As a probabilistic programming language, Scenic allows assigning distributions to features of the scene, as well as declaratively imposing hard and soft constraints over the scene. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic's domain-specific syntax. Finally, we apply Scenic in a case study on a convolutional neural network designed to detect cars in road images, improving its performance beyond that achieved by state-of-the-art synthetic data generation methods.
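The idea of sampling scenes from a distribution subject to a hard constraint can be illustrated with a minimal Python sketch. This is not Scenic syntax, and it uses plain rejection sampling rather than the specialized sampling techniques the paper develops. The `Car` features, the value ranges, and the `min_gap` constraint are all hypothetical, chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Car:
    x: float        # lateral offset in metres (hypothetical feature)
    y: float        # distance ahead of the camera in metres (hypothetical)
    heading: float  # heading in degrees relative to the road (hypothetical)


def sample_scene(rng):
    """Draw one candidate scene: two cars with independently sampled features."""
    return [
        Car(rng.uniform(-5, 5), rng.uniform(5, 50), rng.gauss(0, 10))
        for _ in range(2)
    ]


def satisfies_hard_constraints(scene, min_gap=4.0):
    """Hard constraint (assumed): the two cars are at least min_gap metres apart."""
    a, b = scene
    return ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5 >= min_gap


def generate(n, seed=0, max_tries=10_000):
    """Rejection sampling: draw candidates and keep those passing the constraint."""
    rng = random.Random(seed)
    scenes = []
    for _ in range(max_tries):
        scene = sample_scene(rng)
        if satisfies_hard_constraints(scene):
            scenes.append(scene)
            if len(scenes) == n:
                return scenes
    raise RuntimeError("too few valid scenes; constraints may be too tight")
```

Calling `generate(20)` returns 20 scenes that all satisfy the constraint. Rejection sampling becomes wasteful when constraints are tight, which is one reason a language-aware sampler can be preferable.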
A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving

https://doi.org/10.1145/3206025.3206080

Yue, Xiangyu; Wu, Bichen; Seshia, Sanjit A.; Keutzer, Kurt; Sangiovanni-Vincentelli, Alberto L. (June 2018, ICMR '18 Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval)

3D LiDAR scanners are playing an increasingly important role in autonomous driving as they can generate depth information of the environment. However, creating large 3D LiDAR point cloud datasets with point-level labels requires a significant amount of manual annotation. This jeopardizes the efficient development of supervised deep learning algorithms, which are often data-hungry. We present a framework to rapidly create point clouds with accurate point-level labels from a computer game. To the best of our knowledge, this is the first publication on a LiDAR point cloud simulation framework for autonomous driving. The framework supports data collection from both auto-driving scenes and user-configured scenes. Point clouds from auto-driving scenes can be used as training data for deep learning algorithms, while point clouds from user-configured scenes can be used to systematically test the vulnerability of a neural network, and the falsifying examples can be used to make the neural network more robust through retraining. In addition, the scene images can be captured simultaneously for sensor fusion tasks, with a method proposed to do automatic registration between the point clouds and captured scene images. We show a significant improvement in accuracy (+9%) in point cloud segmentation by augmenting the training dataset with the generated synthesized data. Our experiments also show that, by testing and retraining the network using point clouds from user-configured scenes, the weaknesses/blind spots of the neural network can be fixed.
Counterexample-guided data augmentation

Dreossi, Tommaso; Ghosh, Shromona; Yue, Xiangyu; Keutzer, Kurt; Sangiovanni-Vincentelli, Alberto L; Seshia, Sanjit A. (July 2018, Proceedings of the ... International Conference on Artificial Intelligence)

We present a novel framework for augmenting data sets for machine learning based on counterexamples. Counterexamples are misclassified examples that have important properties for retraining and improving the model. Key components of our framework include a counterexample generator, which produces data items that are misclassified by the model, and error tables, a novel data structure that stores information pertaining to misclassifications. Error tables can be used to explain the model's vulnerabilities and are used to efficiently generate counterexamples for augmentation. We show the efficacy of the proposed framework by comparing it to classical augmentation techniques on a case study of object detection in autonomous driving based on deep neural networks.
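The error-table idea can be pictured with a small Python sketch. The abstract does not specify the table's layout, so the structure below (one row of generation parameters per misclassified sample) and the names `build_error_table` and `most_common_failure` are assumptions for illustration, not the paper's implementation.

```python
from collections import Counter


def build_error_table(samples, model):
    """Collect the generation parameters of every misclassified sample.

    Each sample is assumed to be a dict with a "params" dict and a "label".
    """
    return [s["params"] for s in samples if model(s["params"]) != s["label"]]


def most_common_failure(table, feature):
    """Return the value of `feature` that occurs most often among the errors."""
    counts = Counter(row[feature] for row in table)
    return counts.most_common(1)[0][0]
```

A value that dominates the table, such as a particular lighting condition, hints at where to generate further counterexamples for retraining.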
SMC: Satisfiability Modulo Convex Programming

https://doi.org/10.1109/JPROC.2018.2849003

Shoukry, Yasser; Nuzzo, Pierluigi; Sangiovanni-Vincentelli, Alberto L.; Seshia, Sanjit A.; Pappas, George J.; Tabuada, Paulo (January 2018, Proceedings of the IEEE)

The design of cyber-physical systems (CPSs) requires methods and tools that can efficiently reason about the interaction between discrete models, e.g., representing the behaviors of "cyber" components, and continuous models of physical processes. Boolean methods such as satisfiability (SAT) solving are successful in tackling large combinatorial search problems for the design and verification of hardware and software components. On the other hand, problems in control, communications, signal processing, and machine learning often rely on convex programming as a powerful solution engine. However, despite their strengths, neither approach would work in isolation for CPSs. In this paper, we present a new satisfiability modulo convex programming (SMC) framework that integrates SAT solving and convex optimization to efficiently reason about Boolean and convex constraints at the same time. We exploit the properties of a class of logic formulas over Boolean and nonlinear real predicates, termed monotone satisfiability modulo convex formulas, whose satisfiability can be checked via a finite number of convex programs. Following the lazy satisfiability modulo theory (SMT) paradigm, we develop a new decision procedure for monotone SMC formulas, which coordinates SAT solving and convex programming to provide a satisfying assignment or determine that the formula is unsatisfiable. A key step in our coordination scheme is the efficient generation of succinct infeasibility proofs for inconsistent constraints that can support conflict-driven learning and accelerate the search. We demonstrate our approach on different CPS design problems, including spacecraft docking mission control, robotic motion planning, and secure state estimation. We show that SMC can handle more complex problem instances than state-of-the-art alternative techniques based on SMT solving and mixed integer convex programming.