Title: Model, Data and Reward Repair: Trusted Machine Learning for Markov Decision Processes
When machine learning (ML) algorithms are used in mission-critical domains (e.g., self-driving cars, cyber security) or life-critical domains (e.g., surgical robotics), it is often important to ensure that the learned models satisfy high-level correctness requirements. These requirements can be instantiated in particular domains via constraints like safety (e.g., a robot arm should not come within five meters of any human operator during any phase of an autonomous operation) or liveness (e.g., a car should eventually cross a 4-way intersection). Such constraints can be formally described in propositional logic, first-order logic, or temporal logics such as Probabilistic Computation Tree Logic (PCTL) [31]. For example, in a lane-change controller we can enforce the following PCTL safety property on seeing a slow-moving truck ahead: Pr>0.99 [F (changedLane or reducedSpeed)], where F is the "eventually" operator of PCTL. This property states that, with high probability (greater than 0.99), the car should eventually change lanes or reduce speed. Trusted Machine Learning (TML) refers to a learning methodology that ensures that the specified properties are satisfied.
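As a rough illustration of what checking such a PCTL reachability property involves, the sketch below computes Pr[F (changedLane or reducedSpeed)] on a small hand-made Markov chain by value iteration and compares it against the 0.99 bound. The states, transition probabilities, and encoding are invented for illustration; a real TML pipeline would run a probabilistic model checker such as PRISM on the learned model.

```python
# Hypothetical sketch: checking the PCTL reachability property
# Pr>0.99 [ F (changedLane or reducedSpeed) ] on a tiny Markov chain.
# The model below is invented; only the 0.99 bound comes from the abstract.
import numpy as np

# States: 0 = approaching truck, 1 = changing lane, 2 = braking,
#         3 = changedLane (goal), 4 = reducedSpeed (goal), 5 = collision (sink)
P = np.array([
    [0.0, 0.600, 0.399, 0.000, 0.000, 0.001],
    [0.0, 0.000, 0.000, 0.999, 0.000, 0.001],
    [0.0, 0.000, 0.000, 0.000, 0.999, 0.001],
    [0.0, 0.000, 0.000, 1.000, 0.000, 0.000],
    [0.0, 0.000, 0.000, 0.000, 1.000, 0.000],
    [0.0, 0.000, 0.000, 0.000, 0.000, 1.000],
])
goal = {3, 4}
mask = np.array([s in goal for s in range(6)])

# Value iteration for unbounded reachability: x[s] = Pr[F goal | start = s].
x = np.where(mask, 1.0, 0.0)
for _ in range(1000):
    x_new = np.where(mask, 1.0, P @ x)
    if np.max(np.abs(x_new - x)) < 1e-12:
        break
    x = x_new

print(f"Pr[F(changedLane or reducedSpeed)] from state 0 = {x[0]:.4f}")
print("Property Pr>0.99 satisfied:", x[0] > 0.99)
```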
Award ID(s):
1740079
PAR ID:
10075838
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W)
Page Range / eLocation ID:
194 to 199
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. While Deep Reinforcement Learning (DRL) has achieved remarkable success across various domains, it remains vulnerable to occasional catastrophic failures without additional safeguards. An effective solution to prevent these failures is to use a shield that validates and adjusts the agent’s actions to ensure compliance with a provided set of safety specifications. For real-world robotic domains, it is essential to define safety specifications over continuous state and action spaces to accurately account for system dynamics and compute new actions that minimally deviate from the agent’s original decision. In this paper, we present the first shielding approach specifically designed to ensure the satisfaction of safety requirements in continuous state and action spaces, making it suitable for practical robotic applications. Our method builds upon realizability, an essential property that confirms the shield will always be able to generate a safe action for any state in the environment. We formally prove that realizability can be verified for stateful shields, enabling the incorporation of non-Markovian safety requirements, such as loop avoidance. Finally, we demonstrate the effectiveness of our approach in ensuring safety without compromising the policy’s success rate by applying it to a navigation problem and a multi-agent particle environment. Keywords: Shielding, Reinforcement Learning, Safety, Robotics
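To make the shielding idea concrete, here is a minimal, hypothetical sketch (not the paper's implementation): the shield checks the agent's continuous action against a simple distance-based safety specification under assumed single-integrator dynamics and, on violation, substitutes the closest safe action, i.e., the minimally deviating one.

```python
# Minimal shielding sketch over continuous actions; all dynamics, distances,
# and parameter values are assumptions for illustration.
import numpy as np

def violates(state, action, d_min=5.0, dt=0.1):
    """Toy safety spec: the next position must stay at least d_min away
    from an obstacle at the origin, under single-integrator dynamics."""
    return np.linalg.norm(state + dt * action) < d_min

def shield(state, action, d_min=5.0, dt=0.1):
    """Pass a safe action through unchanged; otherwise project the predicted
    next position onto the safe-set boundary and recover the minimally
    deviating action."""
    next_pos = state + dt * action
    dist = np.linalg.norm(next_pos)
    if dist >= d_min:
        return action                     # agent's action already complies
    safe_pos = next_pos * (d_min / max(dist, 1e-9))  # push out to boundary
    return (safe_pos - state) / dt

state = np.array([6.0, 0.0])
risky = np.array([-30.0, 0.0])            # drives straight at the obstacle
corrected = shield(state, risky)
print("corrected action:", corrected, "| still unsafe:", violates(state, corrected))
```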
  2. In safety-critical robotic systems, perception is tasked with representing the environment to effectively guide decision-making and plays a crucial role in ensuring that the overall system meets its requirements. To quantitatively assess the impact of object detection and classification errors on system-level performance, we present a rigorous formalism for a model of detection error, and probabilistically reason about the satisfaction of regular-safety temporal logic requirements at the system level. We also show how standard evaluation metrics for object detection, such as confusion matrices, can be represented as models of detection error, which enables the computation of probabilistic satisfaction of system-level specifications. However, traditional confusion matrices treat all detections equally, without considering their relevance to the system-level task. To address this limitation, we propose novel evaluation metrics for object detection that are informed by both the system-level task and the downstream control logic, enabling a more context-appropriate evaluation of detection models. We identify logic-based formulas relevant to the downstream control and system-level specifications and use these formulas to define a logic-based evaluation metric for object detection and classification. These logic-based metrics result in less conservative assessments of system-level performance. Finally, we demonstrate our approach on a car-pedestrian example with a leaderboard PointPillars model evaluated on the nuScenes dataset, and validate probabilistic system-level guarantees in simulation. 
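The step from a confusion matrix to a detection error model can be illustrated with a short sketch. The class names and counts below are invented; row-normalizing the counts yields the conditional distribution P(reported class | true class), which is the kind of probabilistic detection model a downstream system-level analysis can consume.

```python
# Hedged sketch: turning a standard confusion matrix into a probabilistic
# model of detection error. Classes and counts are invented.
import numpy as np

classes = ["pedestrian", "car", "background"]
# counts[i][j] = number of times true class i was reported as class j
counts = np.array([
    [880,  20, 100],   # true pedestrian
    [ 15, 940,  45],   # true car
    [ 60,  30, 910],   # true background
], dtype=float)

# Detection error model: conditional distribution over reported classes.
P_detect = counts / counts.sum(axis=1, keepdims=True)

# Example query a safety analysis might pose: probability that a pedestrian
# that is actually present is missed (reported as background).
p_miss = P_detect[classes.index("pedestrian"), classes.index("background")]
print(f"P(report background | true pedestrian) = {p_miss:.3f}")
```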
  3. Any safety issue or cyber attack on an Industrial Control System (ICS) may have catastrophic consequences for human lives and the environment. Hence, it is imperative to have resilient tools and mechanisms to protect ICS. First, to verify the safety and security of the control logic, complete and consistent specifications should be defined to guide the testing process. Second, it is vital to ensure that those requirements are met by the program's control algorithm. In this paper, we propose an approach to formally define the system specifications and the safety and security requirements to build an ontology, which is then used to verify the control logic of the PLC software. The use of an ontology allows us to reason about semantic concepts, check the consistency of concepts, and extract specifications by inference. As a proof of concept, we studied part of an industrial chemical process to implement the proposed approach. The experimental results show that the proposed approach detects inconsistencies in the formally defined requirements and is capable of verifying the correctness and completeness of the control logic. The tools and algorithms designed and developed as part of this work will help technicians and engineers create safer and more secure control logic for ICS processes.
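While the paper works with an ontology and a reasoner, the core inconsistency check can be illustrated with a self-contained sketch: requirements are encoded as propositional constraints over PLC state variables (all names and rules invented here) and consistency is decided by brute-force enumeration.

```python
# Illustrative sketch, not the paper's tooling: detect conflicts among
# formally defined requirements over invented PLC state variables.
from itertools import product

variables = ("valve_open", "pump_on", "tank_high")

requirements = {
    "R1: pump only runs with valve open": lambda v: (not v["pump_on"]) or v["valve_open"],
    "R2: valve closed when tank is high": lambda v: (not v["tank_high"]) or (not v["valve_open"]),
    "R3: pump must run when tank is high": lambda v: (not v["tank_high"]) or v["pump_on"],
}

def models(fixed=None):
    """All assignments satisfying every requirement (optionally with some
    variables pinned), found by brute-force enumeration."""
    out = []
    for values in product((False, True), repeat=len(variables)):
        v = dict(zip(variables, values))
        if fixed and any(v[k] != val for k, val in fixed.items()):
            continue
        if all(req(v) for req in requirements.values()):
            out.append(v)
    return out

print("consistent overall:", bool(models()))
# Pinning tank_high = True exposes a conflict among R1-R3: R3 forces the
# pump on, R1 then forces the valve open, but R2 forbids an open valve.
print("consistent when tank is high:", bool(models({"tank_high": True})))
```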
  4. This paper studies the evaluation of learning-based object detection models in conjunction with model-checking of formal specifications defined on an abstract model of an autonomous system and its environment. In particular, we define two metrics – proposition-labeled and class-labeled confusion matrices – for evaluating object detection, and we incorporate these metrics to compute the satisfaction probability of system-level safety requirements. While confusion matrices have been effective for comparative evaluation of classification and object detection models, our framework fills two key gaps. First, we relate the performance of object detection to formal requirements defined over downstream high-level planning tasks. In particular, we provide empirical results that show that the choice of a good object detection algorithm, with respect to formal requirements on the overall system, significantly depends on the downstream planning and control design. Second, unlike the traditional confusion matrix, our metrics account for variations in performance with respect to the distance between the ego and the object being detected. We demonstrate this framework on a car-pedestrian example by computing the satisfaction probabilities for safety requirements formalized in Linear Temporal Logic (LTL).
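A hypothetical sketch of the proposition-labeled, distance-aware idea: instead of tallying raw classes, tally the truth value of a planning-relevant proposition, binned by ego-to-object distance. The records, classes, and bin edges below are invented for illustration.

```python
# Sketch of a proposition-labeled, distance-binned confusion matrix.
# All detection records, classes, and bins are invented.
from collections import defaultdict

# Each record: (ego-to-object distance in meters, true class, detected class).
records = [
    ( 8.0, "pedestrian", "pedestrian"),
    (12.0, "pedestrian", "background"),
    (35.0, "pedestrian", "background"),
    ( 9.0, "car",        "car"),
    (40.0, "car",        "car"),
]

bins = [(0, 15), (15, 30), (30, 60)]   # distance bins in meters

def proposition(cls):
    # Proposition relevant to downstream planning: "a pedestrian is present".
    return cls == "pedestrian"

# matrix[bin][(true_prop, detected_prop)] -> count
matrix = defaultdict(lambda: defaultdict(int))
for dist, true_cls, det_cls in records:
    for lo, hi in bins:
        if lo <= dist < hi:
            matrix[(lo, hi)][(proposition(true_cls), proposition(det_cls))] += 1

for b in bins:
    print(b, dict(matrix[b]))
```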
  5. There is a growing trend toward AI systems interacting with humans to revolutionize a range of application domains such as healthcare and transportation. However, unsafe human-machine interaction can lead to catastrophic failures. We propose a novel approach that predicts future states by accounting for the uncertainty of human interaction, monitors whether predictions satisfy or violate safety requirements, and adapts control actions based on the predictive monitoring results. Specifically, we develop a new quantitative predictive monitor based on Signal Temporal Logic with Uncertainty (STL-U) to compute a robustness degree interval, which indicates the extent to which a sequence of uncertain predictions satisfies or violates an STL-U requirement. We also develop a new loss function to guide the uncertainty calibration of Bayesian deep learning and a new adaptive control method, both of which leverage STL-U quantitative predictive monitoring results. We apply the proposed approach to two case studies: Type 1 Diabetes management and semi-autonomous driving. Experiments show that the proposed approach improves safety and effectiveness in both case studies. 
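A minimal sketch of the robustness-degree interval for one simple STL-U requirement, G (x > c) ("x stays above c over the horizon"), assuming predictions arrive as [lower, upper] uncertainty intervals. The glucose-like numbers and the threshold are illustrative, not taken from the paper's case studies.

```python
# Sketch: robustness-degree interval for G (x > c) over interval-valued
# predictions (e.g., from Bayesian deep learning). Data is invented.

def robustness_interval_always(pred_intervals, c):
    """Pointwise robustness of x > c is x - c; with interval predictions
    [lo, hi] that becomes [lo - c, hi - c], and the 'always' operator takes
    the minimum over time on each bound."""
    low = min(lo - c for lo, hi in pred_intervals)
    high = min(hi - c for lo, hi in pred_intervals)
    return low, high

# Predicted blood-glucose trajectory with uncertainty (mg/dL), safety floor 70.
preds = [(95, 110), (85, 104), (72, 96), (66, 90)]
low, high = robustness_interval_always(preds, c=70)
print(f"robustness interval: [{low}, {high}]")
if low > 0:
    print("requirement satisfied under all uncertainty")
elif high < 0:
    print("requirement violated under all uncertainty")
else:
    print("satisfaction uncertain -> adapt control conservatively")
```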