Privacy-Engineered Value Decomposition Networks for Cooperative Multi-Agent Reinforcement Learning

Gohari, Parham; Hale, Matthew; Topcu, Ufuk

doi:10.1109/CDC49753.2023.10384184

This content will become publicly available on December 13, 2024

In cooperative multi-agent reinforcement learning (Co-MARL), a team of agents must jointly optimize the team's long-term rewards to learn a designated task. Optimizing rewards as a team often requires inter-agent communication and data sharing, which raises privacy concerns. We assume privacy considerations prohibit the agents from sharing their environment interaction data. Accordingly, we propose Privacy-Engineered Value Decomposition Networks (PE-VDN), a Co-MARL algorithm that models multi-agent coordination while provably safeguarding the confidentiality of the agents' environment interaction data. To develop PE-VDN, we integrate three privacy-engineering techniques to redesign the data flows of the VDN algorithm, an existing Co-MARL algorithm that consolidates the agents' environment interaction data to train a central controller that models multi-agent coordination. First, we design a distributed computation scheme that eliminates Vanilla VDN's dependency on sharing environment interaction data. Second, we use a privacy-preserving multi-party computation protocol to guarantee that the data flows of the distributed computation scheme do not pose new privacy risks. Third, we enforce differential privacy to preempt inference threats against the agents' training data (their past environment interactions) when they take actions based on their neural network predictions. We implement PE-VDN in the StarCraft Multi-Agent Challenge (SMAC) and show that it achieves 80% of Vanilla VDN's win rate while maintaining differential privacy levels that provide meaningful privacy guarantees. The results demonstrate that PE-VDN can safeguard the confidentiality of agents' environment interaction data without sacrificing multi-agent coordination.
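The abstract does not specify which multi-party computation protocol PE-VDN uses. As a rough illustration only, the sketch below assumes additive secret sharing, a standard technique that is not confirmed to be the authors' choice. It relies on the fact that VDN models the team value as the sum of per-agent values, so the only quantity the agents need to compute jointly is a sum. All names here (`MODULUS`, `SCALE`, `make_shares`, `secure_sum`) are hypothetical and chosen for this example.

```python
import secrets
from typing import List

# Illustrative assumptions, not taken from the paper:
MODULUS = 2**61 - 1  # prime modulus for share arithmetic
SCALE = 10**6        # fixed-point scale for encoding real-valued Q-values


def encode(x: float) -> int:
    """Map a real value to a fixed-point residue modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Map a residue back to a signed real value."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def make_shares(value: float, n_parties: int) -> List[int]:
    """Split one value into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (encode(value) - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(local_values: List[float]) -> float:
    """Compute the sum of the agents' local values from shares only.

    shares[i][j] is the share that agent i would send to agent j. Each agent j
    publishes only the sum of the shares it received, so no single published
    number reveals any one agent's local value.
    """
    n = len(local_values)
    shares = [make_shares(v, n) for v in local_values]
    partial_sums = [
        sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return decode(sum(partial_sums) % MODULUS)


if __name__ == "__main__":
    # Hypothetical per-agent Q-values for the actions each agent selected.
    local_q = [1.25, -0.5, 3.0]
    print(secure_sum(local_q))  # team value Q_tot = sum of the local values
```

This simulation runs all parties in one process, so it only demonstrates the arithmetic. A real deployment would exchange shares over a network, and it would still need a separate mechanism, such as the differential privacy step the abstract mentions, to limit what the agents' actions reveal about their training data.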

Award ID(s): 1943275

NSF-PAR ID: 10494460

Author(s) / Creator(s): Gohari, Parham; Hale, Matthew; Topcu, Ufuk

Publisher / Repository: IEEE

Date Published: 2023-12-13

Journal Name: Proceedings of the 62nd IEEE Conference on Decision and Control (CDC)

Page Range / eLocation ID: 8038 to 8044

Format(s): Medium: X

Location: Singapore, Singapore

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
https://doi.org/10.1109/CDC49753.2023.10384184