This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with artificial intelligence (AI) components. Specifically, machine learning (ML) components in cyber-physical systems (CPS), such as feedforward neural networks used as feedback controllers in closed-loop systems are considered, which is a class of systems classically known as intelligent control systems, or in more modern and specific terms, neural network control systems (NNCS). We more broadly refer to this category as AI and NNCS (AINNCS). The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2021. In the third edition of this AINNCS category at ARCH-COMP, three tools have been applied to solve seven different benchmark problems, (in alphabetical order): JuliaReach, NNV, and Verisig. JuliaReach is a new participant in this category, Verisig participated previously in 2019 and NNV has participated in all previous competitions. This report is a snapshot of the current landscape of tools and the types of benchmarks for which these tools are suited. Due to the diversity of problems, lack of a shared hardware platform, and the early stage of the competition, we are not ranking tools in terms of performance, yet the presented results combined with 2020 results probably provide the most complete assessment of current tools for safety verification of NNCS.
more »
« less
ARCH-COMP20 Category Report: Artificial Intelligence and Neural Network Control Systems (AINNCS) for Continuous and Hybrid Systems Plants
This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with artificial intelligence (AI) components. Specifically, machine learning (ML) components in cyber-physical systems (CPS), such as feedforward neural networks used as feedback controllers in closed-loop systems are considered, which is a class of systems classically known as intelligent control systems, or in more modern and specific terms, neural network control systems (NNCS). We more broadly refer to this category as AI and NNCS (AINNCS). The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2020. In the second edition of this AINNCS category at ARCH-COMP, four tools have been applied to solve seven different benchmark problems, (in alphabetical order): NNV, OVERT, ReachNN*, and VenMAS. This report is a snapshot of the current landscape of tools and the types of benchmarks for which these tools are suited. Due to the diversity of problems, lack of a shared hardware platform, and the early stage of the competition, we are not ranking tools in terms of performance, yet the presented results probably provide the most complete assessment of current tools for safety verification of NNCS.
more »
« less
- Award ID(s):
- 1918450
- PAR ID:
- 10195492
- Date Published:
- Journal Name:
- EPiC Series in Computing
- Volume:
- 74
- ISSN:
- 2398-7340
- Page Range / eLocation ID:
- 107 to 73
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with artificial intelligence (AI) components. Specifically, machine learning (ML) components in cyber-physical systems (CPS), such as feedforward neural networks used as feedback controllers in closed-loop systems, are considered, which is a class of systems classically known as intelligent control systems, or in more modern and specific terms, neural network control systems (NNCS). We broadly refer to this category as AI and NNCS (AINNCS). The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2024. In the 8th edition of this AINNCS category at ARCH-COMP, five tools have been applied to solve 12 benchmarks, which are CORA, CROWN-Reach, GoTube, JuliaReach, and NNV. This is the year with the largest interest in the community, with two new, and three previous participants. Following last year’s trend, despite the additional challenges presented, the verification results have improved year-over-year. In terms of computation time, we can observe that the previous participants have improved as well, showing speed-ups of up to one order of magnitude, such as JuliaReach on the TORA benchmark with ReLU controller, and NNV on the TORA benchmark with both heterogeneous controllers.more » « less
-
This report presents the results of the repeatability evaluation for the 4th International Competition on Verifying Continuous and Hybrid Systems (ARCH-COMP’20). The competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2020, affiliated with the IFAC World Congress. In its fourth edition, twenty-eight tools submitted artifacts through a Git repository for the repeatability evaluation, applied to solve benchmark problems for seven competition categories. The majority of participants adhered to the requirements for this year’s repeatability evaluation, namely to submit scripts to automatically install and execute tools in containerized virtual environments (specifically Dockerfiles to execute within Docker), and several categories used performance evaluation information from a common execution platform. The repeatability results represent a snapshot of the current landscape of tools and the types of benchmarks for which they are particularly suited and for which others may repeat their analyses. Due to the diversity of problems in verification of continuous and hybrid systems, as well as basing on standard practice in repeatability evaluations, we evaluate the tools with pass and/or failing being repeatable.more » « less
-
Deshmukh, Jyotirmoy V.; Havelund, Klaus; Perez, Ivan (Ed.)Reachability analysis is a fundamental problem in verification that checks for a given model and set of initial states if the system will reach a given set of unsafe states. Its importance lies in the ability to exhaustively explore the behaviors of a model over a finite or infinite time horizon. The problem of reachability analysis for Cyber-Physical Systems (CPS) is especially challenging because it involves reasoning about the continuous states of the system as well as its switching behavior. Each of these two aspects can by itself cause the reachability analysis problem to be undecidable. In this paper, we survey recent progress in this field beginning with the success of hybrid systems with affine dynamics. We then examine the current state-of-the-art for CPS with nonlinear dynamics and those driven by ``learning-enabled'' components such as neural networks. We conclude with an examination of some promising directions and open challenges.more » « less
-
Quantitative verification tools compute probabilities, expected rewards, or steady-state values for formal models of stochastic and timed systems. Exact results often cannot be obtained efficiently, so most tools use floating-point arithmetic in iterative algorithms that approximate the quantity of interest. Correctness is thus defined by the desired precision and determines performance. In this paper, we report on the experimental evaluation of these trade-offs performed in QComp 2020: the second friendly competition of tools for the analysis of quantitative formal models. We survey the precision guarantees—ranging from exact rational results to statistical confidence statements—offered by the nine participating tools. They gave rise to a performance evaluation using five tracks with varying correctness criteria, of which we present the results.more » « less
An official website of the United States government

