Creators/Authors contains: "Han, Shuo"



  1. Various methods for Multi-Agent Reinforcement Learning (MARL) have been developed under the assumption that agents’ policies are based on accurate state information. However, policies learned through Deep Reinforcement Learning (DRL) are susceptible to adversarial state perturbation attacks. In this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to investigate different solution concepts of MARL under state uncertainties. Our analysis shows that the commonly used solution concepts of optimal agent policy and robust Nash equilibrium do not always exist in SAMGs. To circumvent this difficulty, we consider a new solution concept called robust agent policy, in which agents aim to maximize the worst-case expected state value. We prove the existence of a robust agent policy for finite-state, finite-action SAMGs. Additionally, we propose a Robust Multi-Agent Adversarial Actor-Critic (RMA3C) algorithm to learn robust policies for MARL agents under state uncertainties. Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. Our code is publicly available at https://songyanghan.github.io/what_is_solution/.
    Free, publicly-accessible full text available February 9, 2025
  2. Fu, J. (Ed.)
    Free, publicly-accessible full text available December 23, 2024
  3. Free, publicly-accessible full text available January 1, 2025
  4. This letter focuses on the optimal allocation of multi-stage attacks under uncertainty in the attacker’s intention. We model the attack planning problem using a Markov decision process and characterize the uncertainty in the attacker’s intention using a finite set of reward functions, where each reward function represents a type of attacker. Based on this model, we employ the paradigm of worst-case absolute regret minimization from robust game theory and develop mixed-integer linear program (MILP) formulations for computing worst-case regret-minimizing sensor allocation strategies for two classes of attack-defend interactions: one in which the defender and attacker engage in a zero-sum game and another in which they engage in a non-zero-sum game. We demonstrate the effectiveness of our algorithm using a stochastic gridworld example.
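
The worst-case absolute regret criterion mentioned in item 4 can be illustrated with a small sketch. This is not the MILP formulation from the letter: it simply enumerates a finite table of candidate allocations against a finite set of attacker types and picks the allocation whose largest regret is smallest. The payoff table, the function names, and the numbers are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def worst_case_regret(payoffs: Sequence[Sequence[float]]) -> list[float]:
    """Return the worst-case absolute regret of each allocation.

    payoffs[d][t] is the defender's value (hypothetical numbers) when using
    allocation d against attacker type t. Regret of d against t is the gap
    to the best allocation for that type.
    """
    n_types = len(payoffs[0])
    # Best achievable defender value for each attacker type.
    best_per_type = [max(row[t] for row in payoffs) for t in range(n_types)]
    # Worst regret of each allocation over all attacker types.
    return [
        max(best_per_type[t] - row[t] for t in range(n_types))
        for row in payoffs
    ]


def min_regret_allocation(payoffs: Sequence[Sequence[float]]) -> tuple[int, float]:
    """Return (index, regret) of the allocation minimizing worst-case regret."""
    regrets = worst_case_regret(payoffs)
    best = min(range(len(regrets)), key=regrets.__getitem__)
    return best, regrets[best]


# Illustrative table: 3 candidate allocations (rows) x 2 attacker types (columns).
payoffs = [
    [5.0, 1.0],
    [3.0, 3.0],
    [1.0, 4.0],
]
print(min_regret_allocation(payoffs))
```

Brute-force enumeration like this only works when the set of allocations is small enough to list; a MILP formulation, as described in the abstract, avoids enumerating allocations explicitly.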