Privacy-Preserving Policy Synthesis in Markov Decision Processes

Gohari, Parham; Hale, Matthew; Topcu, Ufuk

doi:10.1109/CDC42340.2020.9304015

In decision-making problems, the actions of an agent may reveal sensitive information that drives its decisions. For instance, a corporation’s investment decisions may reveal its sensitive knowledge about market dynamics. To prevent this type of information leakage, we introduce a policy synthesis algorithm that protects the privacy of the transition probabilities in a Markov decision process. We use differential privacy as the mathematical definition of privacy. The algorithm first perturbs the transition probabilities using a mechanism that provides differential privacy. Then, based on the privatized transition probabilities, we synthesize a policy using dynamic programming. Our main contribution is to bound the "cost of privacy," i.e., the difference between the expected total rewards with privacy and the expected total rewards without privacy. We also show that computing the cost of privacy has time complexity that is polynomial in the parameters of the problem. Moreover, we establish that the cost of privacy increases with the strength of differential privacy protections, and we quantify this increase. Finally, numerical experiments on two example environments validate the established relationship between the cost of privacy and the strength of data privacy protections.
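The two-step pipeline described in the abstract (perturb the transition probabilities, then run dynamic programming on the perturbed model) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not a detail taken from the record: the abstract does not name the privacy mechanism, so the sketch resamples each transition row from a Dirichlet distribution centered on the true row, with a hypothetical concentration parameter `k` and no claim about how `k` maps to a differential privacy level. The toy MDP, the finite horizon, and the function names are likewise invented. The cost of privacy is read here as the optimal expected total reward minus the expected total reward of the privately synthesized policy, both evaluated under the true dynamics.

```python
import random

def privatize_row(row, k, rng):
    """Resample one transition distribution from Dirichlet(k * row).

    Assumed stand-in for a privacy mechanism; smaller k means more noise.
    """
    # Normalized gamma draws form a Dirichlet sample; zero entries stay zero.
    draws = [rng.gammavariate(k * p, 1.0) if p > 0 else 0.0 for p in row]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow
        return list(row)
    return [d / total for d in draws]

def privatize(P, k, seed=0):
    """Perturb every row P[s][a] of the transition model."""
    rng = random.Random(seed)
    return [[privatize_row(row, k, rng) for row in state] for state in P]

def backward_induction(P, R, horizon):
    """Finite-horizon dynamic programming; returns (policy, initial values)."""
    n = len(P)
    V = [0.0] * n
    policy = []
    for _ in range(horizon):  # iterate from the last stage to the first
        Q = [[R[s][a] + sum(p * v for p, v in zip(P[s][a], V))
              for a in range(len(P[s]))] for s in range(n)]
        step = [max(range(len(q)), key=q.__getitem__) for q in Q]
        V = [Q[s][step[s]] for s in range(n)]
        policy.append(step)
    policy.reverse()  # policy[0] is now the first decision rule
    return policy, V

def evaluate(P, R, policy):
    """Expected total reward of a time-varying policy under dynamics P."""
    V = [0.0] * len(P)
    for step in reversed(policy):
        V = [R[s][step[s]] + sum(p * v for p, v in zip(P[s][step[s]], V))
             for s in range(len(P))]
    return V

def cost_of_privacy(P, R, horizon, k, seed=0):
    """Per-state reward lost by planning on the perturbed model."""
    _, v_opt = backward_induction(P, R, horizon)
    private_policy, _ = backward_induction(privatize(P, k, seed), R, horizon)
    v_priv = evaluate(P, R, private_policy)  # scored on the TRUE dynamics
    return [a - b for a, b in zip(v_opt, v_priv)]

# Hypothetical toy MDP: P[s][a][s'] and R[s][a], two states, two actions.
P = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]
R = [[1.0, 0.0], [0.0, 2.0]]
gap = cost_of_privacy(P, R, horizon=5, k=5.0)
```

Because the non-private policy is optimal for the true model, every entry of `gap` is nonnegative under this reading. Sweeping `k` over several seeds is one way to observe the qualitative trend the abstract states, namely that stronger perturbation tends to cost more reward.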

Award ID(s): 1943275

PAR ID: 10212094

Author(s) / Creator(s): Gohari, Parham; Hale, Matthew; Topcu, Ufuk

Date Published: 2020-12-14

Journal Name: Proceedings of the 2020 59th IEEE Conference on Decision and Control (CDC)

Page Range / eLocation ID: 6266 to 6271

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1109/CDC42340.2020.9304015

More Like this