
Title: Model, Data and Reward Repair: Trusted Machine Learning for Markov Decision Processes
When machine learning (ML) algorithms are used in mission-critical domains (e.g., self-driving cars, cyber security) or life-critical domains (e.g., surgical robotics), it is often important to ensure that the learned models satisfy high-level correctness requirements. These requirements can be instantiated in particular domains via constraints like safety (e.g., a robot arm should not come within five meters of any human operator during any phase of an autonomous operation) or liveness (e.g., a car should eventually cross a 4-way intersection). Such constraints can be formally described in propositional logic, first-order logic, or temporal logics such as Probabilistic Computation Tree Logic (PCTL) [31]. For example, in a lane-change controller we can enforce the following PCTL safety property on seeing a slow-moving truck ahead: Pr>0.99 [F (changedLane or reducedSpeed)], where F is the "eventually" operator of PCTL. This property states that, with high probability (greater than 0.99), the car should eventually change lanes or reduce speed. Trusted Machine Learning (TML) refers to a learning methodology that ensures that the specified properties are satisfied.
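A PCTL reachability property like the one above can be checked on a small Markov decision process by value iteration. The sketch below is a toy illustration, not the paper's repair machinery; the states, actions, and transition probabilities are invented for the lane-change example:

```python
# Minimal sketch: checking Pr>0.99 [ F (changedLane or reducedSpeed) ]
# on a toy MDP by value iteration. All probabilities are illustrative.

MDP = {
    # state: {action: [(prob, next_state), ...]}
    "seeTruck": {
        "changeLane": [(0.95, "changedLane"), (0.05, "seeTruck")],
        "brake":      [(0.90, "reducedSpeed"), (0.10, "seeTruck")],
    },
    "changedLane": {},    # absorbing goal state
    "reducedSpeed": {},   # absorbing goal state
}
GOAL = {"changedLane", "reducedSpeed"}

def max_reach_prob(mdp, goal, iters=1000):
    """Value iteration for the max probability of eventually reaching `goal`."""
    p = {s: (1.0 if s in goal else 0.0) for s in mdp}
    for _ in range(iters):
        for s in mdp:
            if s in goal or not mdp[s]:
                continue
            p[s] = max(sum(pr * p[t] for pr, t in succ)
                       for succ in mdp[s].values())
    return p

prob = max_reach_prob(MDP, GOAL)["seeTruck"]
print(prob > 0.99)   # → True: this toy controller satisfies the property
```

In this toy model both actions retry on failure, so the reachability probability converges to 1; a model checker such as PRISM performs an exact version of this computation.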
Award ID(s):
1740079
PAR ID:
10075838
Journal Name:
48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W)
Page Range / eLocation ID:
194 to 199
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. While Deep Reinforcement Learning (DRL) has achieved remarkable success across various domains, it remains vulnerable to occasional catastrophic failures without additional safeguards. An effective solution to prevent these failures is to use a shield that validates and adjusts the agent's actions to ensure compliance with a provided set of safety specifications. For real-world robotic domains, it is essential to define safety specifications over continuous state and action spaces to accurately account for system dynamics and to compute new actions that minimally deviate from the agent's original decision. In this paper, we present the first shielding approach specifically designed to ensure the satisfaction of safety requirements in continuous state and action spaces, making it suitable for practical robotic applications. Our method builds upon realizability, an essential property that confirms the shield will always be able to generate a safe action for any state in the environment. We formally prove that realizability can be verified for stateful shields, enabling the incorporation of non-Markovian safety requirements, such as loop avoidance. Finally, we demonstrate the effectiveness of our approach in ensuring safety without compromising the policy's success rate by applying it to a navigation problem and a multi-agent particle environment. Keywords: Shielding, Reinforcement Learning, Safety, Robotics
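The minimal-deviation idea behind such a shield can be sketched for a 1-D system (this is an illustrative toy, not the paper's method; the dynamics, bounds, and time step are assumptions):

```python
# Illustrative shield for a 1-D system x' = x + a*dt: the agent proposes a
# velocity `a`, and the shield projects it onto the closest value that keeps
# the next state inside the safe set [x_min, x_max].

def shield(x, a, dt=0.1, x_min=0.0, x_max=10.0, a_max=1.0):
    """Return the safe action closest to the proposed action `a`."""
    # Actions that keep x + a*dt in [x_min, x_max], intersected with
    # the actuator limits [-a_max, a_max].
    lo = max(-a_max, (x_min - x) / dt)
    hi = min(a_max, (x_max - x) / dt)
    # Realizability: for every reachable x, [lo, hi] must be non-empty.
    assert lo <= hi, "shield not realizable at this state"
    return min(max(a, lo), hi)   # minimal-deviation projection

print(shield(9.95, 1.0))   # proposed action would leave [0, 10]; shield caps it
```

The projection `min(max(a, lo), hi)` is the 1-D analogue of the paper's minimal-deviation correction; in higher dimensions it becomes a constrained optimization over the safe action set.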
  2. Any safety issue or cyber attack on an Industrial Control System (ICS) may have catastrophic consequences for human lives and the environment, so it is imperative to have resilient tools and mechanisms to protect ICS. First, to verify the safety and security of the control logic, complete and consistent specifications should be defined to guide the testing process. Second, it is vital to ensure that those requirements are met by the program control algorithm. In this paper, we propose an approach to formally define the system specifications and the safety and security requirements, building an ontology that is then used to verify the control logic of the PLC software. The use of an ontology allows us to reason about semantic concepts, check the consistency of concepts, and extract specifications by inference. As a proof of concept, we studied part of an industrial chemical process to implement the proposed approach. The experimental results in this work show that the proposed approach detects inconsistencies in the formally defined requirements and is capable of verifying the correctness and completeness of the control logic. The tools and algorithms designed and developed as part of this work will help technicians and engineers create safer and more secure control logic for ICS processes.
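The kind of consistency checking performed over formalized requirements can be illustrated, in a much-simplified propositional form (the propositions and rules below are hypothetical, not taken from the paper's ontology):

```python
# Toy consistency check over formalized requirements: a requirement set is
# consistent iff some truth assignment satisfies every rule. Propositions
# and rules are hypothetical PLC requirements, for illustration only.

from itertools import product

PROPS = ["valve_open", "pump_on", "high_pressure"]
REQUIREMENTS = [
    lambda v, p, h: (not h) or (not v),   # high pressure => valve closed
    lambda v, p, h: (not p) or v,         # pump on => valve open
    lambda v, p, h: h,                    # scenario under test: high pressure
    lambda v, p, h: p,                    # scenario under test: pump running
]

def consistent(reqs):
    """True iff some assignment over PROPS satisfies every requirement."""
    return any(all(r(*assign) for r in reqs)
               for assign in product([False, True], repeat=len(PROPS)))

print(consistent(REQUIREMENTS))  # → False: the four rules contradict each other
```

An ontology reasoner performs a far richer version of this check, over semantic concepts rather than bare propositions, but the detected contradiction (the pump must be on while the valve must be both open and closed) is the same kind of inconsistency.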
  3. This paper studies the evaluation of learning-based object detection models in conjunction with model checking of formal specifications defined on an abstract model of an autonomous system and its environment. In particular, we define two metrics, proposition-labeled and class-labeled confusion matrices, for evaluating object detection, and we incorporate these metrics to compute the satisfaction probability of system-level safety requirements. While confusion matrices have been effective for comparative evaluation of classification and object detection models, our framework fills two key gaps. First, we relate the performance of object detection to formal requirements defined over downstream high-level planning tasks. In particular, we provide empirical results showing that the choice of a good object detection algorithm, with respect to formal requirements on the overall system, depends significantly on the downstream planning and control design. Second, unlike the traditional confusion matrix, our metrics account for variations in performance with respect to the distance between the ego vehicle and the object being detected. We demonstrate this framework on a car-pedestrian example by computing the satisfaction probabilities for safety requirements formalized in Linear Temporal Logic (LTL).
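A distance-binned, proposition-labeled confusion matrix can be sketched as follows (the class-to-proposition mapping, bin size, and sample data are illustrative assumptions, not the paper's definitions):

```python
# Sketch of a proposition-labeled confusion matrix: detections are scored
# against the proposition they induce (e.g., "obstacle present") rather
# than the raw class label, and binned by distance to the ego vehicle.

from collections import Counter

TO_PROP = {"pedestrian": "obstacle", "cyclist": "obstacle",
           "sign": "clear", "none": "clear"}

def prop_confusion(samples, bin_size=10.0):
    """samples: (true_class, predicted_class, distance) triples."""
    cm = Counter()
    for true_cls, pred_cls, dist in samples:
        dbin = int(dist // bin_size)          # distance bin index
        cm[(dbin, TO_PROP[true_cls], TO_PROP[pred_cls])] += 1
    return cm

samples = [("pedestrian", "cyclist", 7.0),    # wrong class, same proposition
           ("pedestrian", "none", 23.0),      # missed obstacle at range
           ("sign", "sign", 12.0)]
cm = prop_confusion(samples)
print(cm[(0, "obstacle", "obstacle")])   # → 1
```

Note how the first sample counts as correct at the proposition level even though the class label is wrong: for a downstream planner that only needs "obstacle present", confusing a pedestrian with a cyclist is harmless, which is exactly the distinction the proposition-labeled metric captures.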
  4. Model uncertainties are considered in a learning-based control framework that combines a control-dependent barrier function (CDBF), a time-varying control barrier function (TCBF), and a control Lyapunov function (CLF). Tracking control is achieved by the CLF, while safety-critical constraints during tracking are guaranteed by the CDBF and TCBF. A reinforcement learning (RL) method is applied to jointly learn the model uncertainties related to the CDBF, TCBF, and CLF. The learning-based framework ultimately formulates a quadratic program (QP) whose CDBF, TCBF, and CLF constraints involve the model uncertainties. To our knowledge, this is the first application of such a learning-based framework to safety-guaranteed tracking control of automated vehicles with uncertainties. Control performance is validated for two different single-lane-change maneuvers via Simulink/CarSim® co-simulation and compared for the cases with and without learning. Moreover, the learning effects are discussed through explainable constraints in the QP formulation.
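The structure of a CLF-CBF quadratic program can be sketched in one dimension (this is a generic CLF-CBF filter, not the paper's CDBF/TCBF formulation; the dynamics, gains, and setpoints are assumptions):

```python
# Minimal 1-D CLF-CBF QP: dynamics x' = u, safety set h(x) = x_max - x >= 0,
# tracking CLF V(x) = (x - x_des)^2 with a slack variable d on the CLF
# constraint so that safety always takes priority over tracking.

from scipy.optimize import minimize

def clf_cbf_qp(x, u_ref, x_des=5.0, x_max=4.0, gamma=1.0, lam=1.0, w=10.0):
    V, dV = (x - x_des) ** 2, 2 * (x - x_des)   # CLF value and gradient
    h = x_max - x                               # barrier value
    cons = [
        # CBF:  h_dot + gamma*h >= 0   ->   -u + gamma*h >= 0
        {"type": "ineq", "fun": lambda z: -z[0] + gamma * h},
        # CLF:  V_dot <= -lam*V + d    ->   -dV*u - lam*V + d >= 0
        {"type": "ineq", "fun": lambda z: -dV * z[0] - lam * V + z[1]},
    ]
    # Stay close to the reference input; penalize use of the CLF slack.
    cost = lambda z: (z[0] - u_ref) ** 2 + w * z[1] ** 2
    res = minimize(cost, x0=[0.0, 0.0], constraints=cons, method="SLSQP")
    return res.x[0]   # filtered control input

u = clf_cbf_qp(x=3.9, u_ref=2.0)   # tracking wants u = 2; safety caps u at 0.1
print(u)
```

Near the boundary (x = 3.9, x_max = 4) the CBF constraint forces u <= gamma*h = 0.1, so the QP returns the capped input and absorbs the tracking shortfall in the slack, which is the same priority structure the paper's richer QP enforces.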
  5. There is a growing trend toward AI systems interacting with humans to revolutionize a range of application domains such as healthcare and transportation. However, unsafe human-machine interaction can lead to catastrophic failures. We propose a novel approach that predicts future states by accounting for the uncertainty of human interaction, monitors whether predictions satisfy or violate safety requirements, and adapts control actions based on the predictive monitoring results. Specifically, we develop a new quantitative predictive monitor based on Signal Temporal Logic with Uncertainty (STL-U) to compute a robustness degree interval, which indicates the extent to which a sequence of uncertain predictions satisfies or violates an STL-U requirement. We also develop a new loss function to guide the uncertainty calibration of Bayesian deep learning and a new adaptive control method, both of which leverage STL-U quantitative predictive monitoring results. We apply the proposed approach to two case studies: Type 1 Diabetes management and semi-autonomous driving. Experiments show that the proposed approach improves safety and effectiveness in both case studies. 
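The robustness degree interval described above can be sketched in a much-simplified form (the requirement shape, threshold, and prediction intervals are hypothetical; the paper's STL-U semantics is richer):

```python
# Simplified STL-U-style robustness interval for the requirement
# G (y <= c) over interval-valued predictions [lo_t, hi_t]: because the
# predictions are uncertain, robustness is an interval, not a number.

def robustness_interval(pred_intervals, c):
    """Return (worst-case, best-case) robustness of G (y <= c)."""
    worst = min(c - hi for lo, hi in pred_intervals)   # trajectory at upper bounds
    best = min(c - lo for lo, hi in pred_intervals)    # trajectory at lower bounds
    return worst, best

# Hypothetical glucose predictions (mg/dL) against a safety cap of 180.
preds = [(120, 150), (140, 175), (150, 190)]
lo, hi = robustness_interval(preds, c=180)
print(lo, hi)   # → -10 30
```

An interval straddling zero, as here, means the uncertain predictions neither definitely satisfy nor definitely violate the requirement, which is precisely the signal the paper's predictive monitor uses to trigger control adaptation.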