Enforcing Signal Temporal Logic Specifications in Multi-Agent Adversarial Environments: A Deep Q-Learning Approach

Muniraj, D; Vamvoudakis, K. G; Farhood, M.

doi:10.1109/CDC.2018.8618746

Citation Details

Enforcing Signal Temporal Logic Specifications in Multi-Agent Adversarial Environments: A Deep Q-Learning Approach

This work addresses the problem of learning optimal control policies for a multi-agent system in an adversarial environment. Specifically, we focus on multi-agent systems where the mission objectives are expressed as signal temporal logic (STL) specifications. The agents are classified as either defensive or adversarial. The defensive agents are maximizers, namely, they maximize an objective function that enforces the STL specification; the adversarial agents, on the other hand, are minimizers. The interaction among the agents is modeled as a finite-state team stochastic game with an unknown transition probability function. The synthesis objective is to determine optimal control policies for the defensive agents that implement the STL specification against the best responses of the adversarial agents. A multi-agent deep Q-learning algorithm, which is an extension of the minimax Q-learning algorithm, is then proposed to learn the optimal policies. The effectiveness of the proposed approach is illustrated through a simulation case study. more »

Award ID(s):: 1851588 1650465

PAR ID:: 10083706

Author(s) / Creator(s):: Muniraj, D; Vamvoudakis, K. G; Farhood, M.

Date Published:: 2019-01-21

Journal Name:: 2018 IEEE Conference on Decision and Control (CDC)

Page Range / eLocation ID:: 4141-4146

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
https://doi.org/10.1109/CDC.2018.8618746

More Like this