The authors provide the first tight sample complexity bounds for shadow tomography and classical shadows in the regime where the target error is below some sufficiently small inverse polynomial in the dimension of the Hilbert space. Specifically, they present a protocol that, given any m ∈ ℕ and ε ≤ O(d^{-1/2}), measures O(log(m)/ε²) copies of an unknown mixed state ρ ∈ ℂ^{d×d} and outputs a classical description of ρ. This description can then be used to estimate any collection of m observables to within additive accuracy ε. Previously, even for the simpler case of shadow tomography where the observables are known in advance, the best known rates either scaled benignly but suboptimally in all of m, d, ε, or scaled optimally in ε, m but included additional polynomial factors in d. Interestingly, the authors also show via dimensionality reduction that one can rescale ε and d to reduce to the regime where ε ≤ O(d^{-1/2}). Their algorithm draws on representation-theoretic tools developed in the context of full state tomography.
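The abstract does not spell out the protocol, but the generic classical-shadows workflow it improves on can be illustrated with the standard random-Pauli estimator for a single qubit: measure in a uniformly random Pauli basis, invert the measurement channel to form a snapshot, and average the snapshots against each observable. The sketch below is a minimal, hypothetical illustration of that generic framework in numpy, not the authors' representation-theoretic protocol.

```python
import numpy as np

# Single-qubit Pauli matrices (hypothetical toy setup).
I2 = np.eye(2, dtype=complex)
PAULIS = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def eigenbasis(P):
    # Columns are the orthonormal eigenvectors of the Pauli matrix P.
    _, v = np.linalg.eigh(P)
    return v

def classical_shadow(rho, n_samples, rng):
    """Collect random-Pauli classical-shadow snapshots of a 2x2 state rho."""
    snapshots = []
    for _ in range(n_samples):
        V = eigenbasis(PAULIS[rng.choice(list(PAULIS))])   # random measurement basis
        probs = np.real([V[:, k].conj() @ rho @ V[:, k] for k in range(2)])
        k = rng.choice(2, p=probs / probs.sum())           # simulate the measurement
        ketbra = np.outer(V[:, k], V[:, k].conj())
        snapshots.append(3 * ketbra - I2)                  # invert the measurement channel
    return snapshots

def estimate(snapshots, O):
    """Estimate Tr(O rho) by averaging Tr(O * snapshot) over the snapshots."""
    return float(np.mean([np.real(np.trace(O @ s)) for s in snapshots]))

rng = np.random.default_rng(0)
rho = np.array([[0.8, 0.3], [0.3, 0.2]], dtype=complex)    # a toy mixed state
shadow = classical_shadow(rho, 20000, rng)
print(estimate(shadow, PAULIS["Z"]), np.real(np.trace(PAULIS["Z"] @ rho)))
```

Each snapshot is an unbiased (if noisy) estimate of ρ, so averaging Tr(O · snapshot) estimates Tr(Oρ) for any observable O; the result summarized above concerns how few copies of ρ suffice for this kind of task at small ε.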
This content will become publicly available on June 30, 2026
Scalable natural policy gradient for general-sum linear quadratic games with known parameters
Consider a general-sum N-player linear-quadratic (LQ) game with stochastic dynamics over a finite time horizon. It is known that, under some mild assumptions, the Nash equilibrium (NE) strategies for the players can be obtained by a natural policy gradient algorithm. However, the traditional implementation of the algorithm requires complete state and action information from all agents and may not scale well with the number of agents. Under the assumption of known problem parameters, we present an algorithm that requires state and action information only from neighboring agents, according to the graph describing the dynamic or cost coupling among the agents. We show that the proposed algorithm converges to an ε-neighborhood of the NE, where the value of ε depends on the size of the local neighborhood of agents.
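For context on the kind of update the abstract refers to, the sketch below applies a natural policy gradient step to a single-agent, infinite-horizon LQR problem with known parameters, using the model-based update direction E_K = (R + B^T P_K B) K - B^T P_K A. This is only a minimal, hypothetical illustration of natural-policy-gradient mechanics for linear-quadratic control; it is not the multi-agent, finite-horizon, local-information algorithm of the paper, and the system matrices are made up.

```python
import numpy as np

def lyapunov(A_cl, Qbar, iters=2000):
    """Solve P = Qbar + A_cl' P A_cl by fixed-point iteration (A_cl assumed stable)."""
    P = np.zeros_like(Qbar)
    for _ in range(iters):
        P = Qbar + A_cl.T @ P @ A_cl
    return P

def natural_pg_lqr(A, B, Q, R, K0, eta=0.05, steps=300):
    """Natural policy gradient on the feedback gain K for the policy u_t = -K x_t."""
    K = K0.copy()
    for _ in range(steps):
        A_cl = A - B @ K
        P = lyapunov(A_cl, Q + K.T @ R @ K)                # cost-to-go matrix of K
        E = (R + B.T @ P @ B) @ K - B.T @ P @ A            # natural gradient direction
        K = K - eta * E                                    # (state-covariance factor cancels)
    return K

# Hypothetical 2-state, 1-input system; K0 = 0 is stabilizing here.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
print("learned gain:", natural_pg_lqr(A, B, Q, R, K0=np.zeros((1, 2))))
```

In the game setting, each player updates its own feedback gain in an analogous fashion; the contribution summarized above is to carry out such updates using only state and action information from neighboring agents.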
- Award ID(s): 2300355
- PAR ID: 10626313
- Publisher / Repository: Proceedings of the 7th Annual Learning for Dynamics & Control Conference, PMLR 283:139-152, 2025.
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Various methods for Multi-Agent Reinforcement Learning (MARL) have been developed with the assumption that agents' policies are based on accurate state information. However, policies learned through Deep Reinforcement Learning (DRL) are susceptible to adversarial state perturbation attacks. In this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to investigate different solution concepts of MARL under state uncertainties. Our analysis shows that the commonly used solution concepts of optimal agent policy and robust Nash equilibrium do not always exist in SAMGs. To circumvent this difficulty, we consider a new solution concept called robust agent policy, where agents aim to maximize the worst-case expected state value. We prove the existence of robust agent policy for finite state and finite action SAMGs. Additionally, we propose a Robust Multi-Agent Adversarial Actor-Critic (RMA3C) algorithm to learn robust policies for MARL agents under state uncertainties. Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. Our code is public at https://songyanghan.github.io/what_is_solution/. (A toy sketch of evaluating a policy against worst-case state perturbations appears after this list.)
- This article considers resilient cooperative state estimation in unreliable multiagent networks. A network of agents aims to collaboratively estimate the value of an unknown vector parameter while an unknown subset of agents suffers Byzantine faults. We refer to the faulty agents as Byzantine agents. Byzantine agents malfunction arbitrarily and may send out highly unstructured messages to other agents in the network. As opposed to fault-free networks, reaching agreement in the presence of Byzantine agents is far from trivial. In this article, we propose a computationally efficient algorithm that is provably robust to Byzantine agents. At each iteration of the algorithm, a good agent performs a gradient descent update based on noisy local measurements, exchanges its update with other agents in its neighborhood, and robustly aggregates the received messages using coordinate-wise trimmed means. Under mild technical assumptions, we establish that good agents learn the true parameter asymptotically in the almost sure sense. We further complement our analysis by proving a (high-probability) finite-time convergence rate that encapsulates network characteristics. (A small sketch of coordinate-wise trimmed-mean aggregation appears after this list.)
- We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner, and where the objective is to find localized policies such that the (discounted) global reward is maximized. A fundamental challenge in this setting is that the state-action space size scales exponentially in the number of agents, rendering the problem intractable for large networks. In this paper, we propose a scalable actor critic (SAC) framework that exploits the network structure and finds a localized policy that is an [Formula: see text]-approximation of a stationary point of the objective for some [Formula: see text], with complexity that scales with the local state-action space size of the largest [Formula: see text]-hop neighborhood of the network. We illustrate our model and approach using examples from wireless communication, epidemics, and traffic. (A short sketch of the hop-neighborhood structure that this complexity bound refers to appears after this list.)
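To make the worst-case expected state value in the state-perturbation abstract above concrete, the toy tabular sketch below evaluates a fixed policy when an adversary may replace the true state with any state from an admissible perturbation set before the policy acts. It is a hypothetical illustration of the solution concept only, not the RMA3C algorithm; the transition model, rewards, policy, and perturbation sets are invented.

```python
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # P[s, a, s']
r = rng.uniform(0.0, 1.0, size=(n_states, n_actions))              # r[s, a]
pi = rng.dirichlet(np.ones(n_actions), size=n_states)              # pi[s_obs, a]
perturb = [[0, 1], [0, 1, 2], [1, 2]]   # admissible observed states per true state

def worst_case_value(pi, iters=500):
    """Fixed-point iteration for the value of a fixed policy when an adversary
    picks the observed state (within the perturbation set) at every step."""
    V = np.zeros(n_states)
    for _ in range(iters):
        V_new = np.empty(n_states)
        for s in range(n_states):
            q = r[s] + gamma * P[s] @ V                 # Q(s, a) for each action a
            # Adversary chooses the observed state that minimizes the agent's value.
            V_new[s] = min(pi[s_obs] @ q for s_obs in perturb[s])
        V = V_new
    return V

print(worst_case_value(pi))
```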
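The robust aggregation step in the Byzantine-resilient estimation abstract above is easy to illustrate: a good agent takes a local gradient step on its own noisy measurements and then combines its update with neighbors' messages via a coordinate-wise trimmed mean, which discards the most extreme values in every coordinate. The sketch below is a minimal single-agent round under made-up data with one simulated Byzantine neighbor, not the paper's full algorithm or analysis.

```python
import numpy as np

def trimmed_mean(vectors, b):
    """Coordinate-wise trimmed mean: in each coordinate, drop the b largest and
    b smallest received values, then average the rest."""
    X = np.sort(np.asarray(vectors), axis=0)
    return X[b:len(vectors) - b].mean(axis=0)

rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])
H = rng.normal(size=(20, 3))                         # local measurement matrix
y = H @ theta_true + 0.1 * rng.normal(size=20)       # noisy local measurements

theta = np.zeros(3)
for _ in range(200):
    grad = H.T @ (H @ theta - y) / len(y)            # local least-squares gradient
    local = theta - 0.1 * grad                       # local gradient-descent update
    # Messages from neighbors: two roughly-agreeing good agents and one
    # Byzantine agent sending arbitrary values.
    received = [local,
                local + 0.01 * rng.normal(size=3),
                local - 0.01 * rng.normal(size=3),
                100.0 * rng.normal(size=3)]
    theta = trimmed_mean(received, b=1)              # robust aggregation
print(theta)   # close to theta_true despite the Byzantine messages
```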
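Finally, the complexity claim in the networked-RL abstract above scales with the size of local neighborhoods in the agent interaction graph. The short sketch below just computes a k-hop neighborhood by breadth-first search, with k standing in for the truncation parameter left as a formula placeholder above; a localized policy or critic would only condition on the states and actions of these nodes. This is a structural illustration, not the scalable actor critic framework itself.

```python
from collections import deque

def k_hop_neighborhood(adj, source, k):
    """Nodes within k hops of `source` in the agent interaction graph (BFS);
    a localized policy/critic would only use the states/actions of these nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

# Line graph 0-1-2-3-4: agent 2's 1-hop neighborhood is {1, 2, 3}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(k_hop_neighborhood(adj, source=2, k=1))
```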