Title: Optimal Control of a Large Population of Randomly Interrogated Interacting Agents
This article investigates a stochastic optimal control problem with linear Gaussian dynamics, a quadratic performance measure, but non-Gaussian observations. The linear Gaussian dynamics describe a large number of interacting agents evolving under a centralized control and external disturbances. The aggregate state of the agents is only partially known to the centralized controller, by means of samples taken at random times from anonymous, randomly selected agents. Because the agent identity is removed from the samples, the observation set has a non-Gaussian structure, and as a consequence the optimal control law that minimizes a quadratic cost is essentially nonlinear and infinite-dimensional for any finite number of agents. For infinitely many agents, however, this paper shows that the optimal control law is the solution to a reduced-order, finite-dimensional linear quadratic Gaussian problem with Gaussian observations sampled only in time. For this problem, the separation principle holds and is used to develop an explicit optimal control law by combining a linear quadratic regulator with a separately designed finite-dimensional minimum mean square error state estimator. Conditions are presented under which this simple optimal control law can be adopted as a suboptimal control law for finitely many agents.
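As a hedged illustration of the separation principle invoked above, the sketch below combines a discrete-time LQR gain with a steady-state Kalman filter into a certainty-equivalent controller. All matrices are hypothetical placeholders, not the paper's aggregate-agent model, and the discrete-time setting is an assumption made for brevity.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical discrete-time model (placeholders, not the paper's dynamics).
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # state transition
B = np.array([[0.0], [0.1]])             # control input
C = np.array([[1.0, 0.0]])               # time-sampled Gaussian observation
Q = np.eye(2)                            # state cost
R = np.array([[1.0]])                    # control cost
W = 0.01 * np.eye(2)                     # process noise covariance (assumed)
V = np.array([[0.1]])                    # measurement noise covariance (assumed)

# LQR gain from the control Riccati equation.
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Steady-state Kalman gain from the dual (filtering) Riccati equation.
S = solve_discrete_are(A.T, C.T, W, V)
L = S @ C.T @ np.linalg.inv(C @ S @ C.T + V)

def lqg_step(x_hat, y):
    """One step of the separated controller: regulate on the estimate, then update it."""
    u = -K @ x_hat                                # certainty-equivalent LQR input
    x_pred = A @ x_hat + B @ u                    # one-step prediction
    x_hat_next = x_pred + L @ (y - C @ x_pred)    # measurement update
    return u, x_hat_next
```

The design of K and L is fully decoupled, which is exactly the structure the separation principle licenses for the reduced-order LQG problem described in the abstract.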
Award ID(s):
1941944
PAR ID:
10481061
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE Transactions on Automatic Control
Volume:
68
Issue:
7
ISSN:
0018-9286
Page Range / eLocation ID:
4079 - 4095
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
1. Consider a general-sum N-player linear-quadratic (LQ) game with stochastic dynamics over a finite time horizon. It is known that, under some mild assumptions, the Nash equilibrium (NE) strategies for the players can be obtained by a natural policy gradient algorithm. However, the traditional implementation of the algorithm requires complete state and action information from all agents and may not scale well with the number of agents. Under the assumption of known problem parameters, we present an algorithm that requires state and action information only from neighboring agents, according to the graph describing the dynamic or cost coupling among the agents. We show that the proposed algorithm converges to an 𝜖-neighborhood of the NE, where the value of 𝜖 depends on the size of the local neighborhood of agents.
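As a hedged sketch of the policy-gradient machinery this abstract builds on, the snippet below runs a zeroth-order (two-point) gradient estimate on a linear feedback gain for a single-agent LQ problem. The N-player game, the neighborhood-restricted information, and all constants are simplified away, so this shows only the core update, not the paper's distributed algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.2], [0.0, 1.0]])   # assumed dynamics
B = np.array([[0.0], [0.2]])
Q, R, T = np.eye(2), np.eye(1), 50       # assumed costs and horizon

def cost(K):
    """Finite-horizon LQ cost under the linear feedback u = -K x."""
    x, J = np.array([1.0, 0.0]), 0.0
    for _ in range(T):
        u = -K @ x
        J += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return J

# Zeroth-order policy gradient: perturb the gain, estimate the gradient, descend.
K = np.zeros((1, 2))
r, step = 0.1, 1e-4
for _ in range(2000):
    U = rng.standard_normal(K.shape)
    U /= np.linalg.norm(U)
    grad = (cost(K + r * U) - cost(K - r * U)) / (2 * r) * U
    K -= step * grad
```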
2. This paper proposes a novel learning-based adaptive optimal controller design method for a class of continuous-time linear time-delay systems. A key strategy is to exploit state-of-the-art reinforcement learning (RL) and adaptive dynamic programming (ADP) techniques to develop a data-driven method that learns the near-optimal controller without precise knowledge of the system dynamics. Specifically, a value iteration (VI) algorithm is proposed to solve the infinite-dimensional Riccati equation for the linear quadratic optimal control problem of time-delay systems using finite samples of input-state trajectory data. It is rigorously proved that the proposed VI algorithm converges to the near-optimal solution. Compared with the previous literature, the proposed VI algorithm has two appealing features: it is developed directly for continuous-time systems without discretization, and it does not require an initial admissible controller for its implementation. The efficacy of the proposed methodology is demonstrated by two practical examples of metal cutting and autonomous driving.
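A minimal sketch of the value-iteration idea for the delay-free, model-known special case: forward-Euler steps on the differential Riccati equation converge to the stabilizing solution of the algebraic Riccati equation. The paper's method is data-driven and covers time delays; the matrices below are illustrative assumptions.

```python
import numpy as np

# Hypothetical delay-free continuous-time system (assumed, not from the paper).
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
Rinv = np.linalg.inv(R)

# Value iteration: forward-Euler steps on the differential Riccati equation,
# converging to the stabilizing solution of A'P + PA + Q - P B R^{-1} B' P = 0.
P, h = np.zeros((2, 2)), 0.01
for _ in range(20000):
    P = P + h * (A.T @ P + P @ A + Q - P @ B @ Rinv @ B.T @ P)

K = Rinv @ B.T @ P   # near-optimal feedback gain for u = -K x
```

Note that the recursion starts from P = 0 and needs no initial admissible controller, which mirrors one of the features highlighted in the abstract.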
3. We consider a multi-agent linear quadratic optimal control problem. Due to communication constraints, the agents are required to quantize their local state measurements before communicating them to the rest of the team, thus resulting in a decentralized information structure. The optimal controllers are to be synthesized under this decentralized and quantized information structure. The agents are given a set of quantizers with varying quantization resolutions: higher resolution incurs a higher communication cost, and vice versa. The team must optimally select the quantizers to prioritize agents with 'high-quality' information for optimizing the control performance under communication constraints. We show that there exists a separation between the optimal solution to the control problem and the choice of the optimal quantizer. We show that the optimal controllers are linear and that the optimal selection of the quantizers can be determined by solving a linear program.
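The quantizer-selection step can be sketched as a small linear program, as below. The benefit and cost numbers, the budget, and the LP relaxation of the one-quantizer-per-agent constraint are all illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical setup: 3 agents, 2 quantizers each (coarse/fine). value[i, q] is
# the control-performance benefit of giving agent i quantizer q; cost[i, q] is
# its communication cost. These numbers are illustrative, not from the paper.
value = np.array([[1.0, 3.0], [0.5, 2.0], [0.8, 2.5]])
cost = np.array([[1.0, 4.0], [1.0, 4.0], [1.0, 4.0]])
budget = 9.0

n_agents, n_quant = value.shape
c = -value.ravel()                      # linprog minimizes, so negate the benefit

# One quantizer per agent (equality rows) plus a total communication budget.
A_eq = np.zeros((n_agents, n_agents * n_quant))
for i in range(n_agents):
    A_eq[i, i * n_quant:(i + 1) * n_quant] = 1.0
b_eq = np.ones(n_agents)
A_ub = cost.ravel()[None, :]
b_ub = np.array([budget])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
selection = res.x.reshape(n_agents, n_quant)   # per-agent quantizer weights
```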
4. We study the task of learning state representations from potentially high-dimensional observations, with the goal of controlling an unknown partially observable system. We pursue a direct latent model learning approach, where a dynamic model in some latent state space is learned by predicting quantities directly related to planning (e.g., costs) without reconstructing the observations. In particular, we focus on an intuitive cost-driven state representation learning method for solving Linear Quadratic Gaussian (LQG) control, one of the most fundamental partially observable control problems. As our main results, we establish finite-sample guarantees of finding a near-optimal state representation function and a near-optimal controller using the directly learned latent model. To the best of our knowledge, despite various empirical successes, prior to this work it was unclear whether such a cost-driven latent model learner enjoys finite-sample guarantees. Our work underscores the value of predicting multi-step costs, an idea that is key to our theory and that is also known to be empirically valuable for learning state representations.
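A toy sketch of the cost-driven idea: regress the next several per-step costs from the raw observation and extract a low-dimensional encoder from the regression map. The data, the linear encoder, and the SVD step are illustrative assumptions; the paper's algorithm and its finite-sample guarantees are not reproduced here.

```python
import numpy as np

# Toy cost-driven state representation learning: regress the next `horizon`
# per-step costs from the raw observation, then take the leading singular
# directions of the regression map as a latent encoder. All data and
# dimensions below are illustrative assumptions.
rng = np.random.default_rng(0)
T, obs_dim, horizon, latent_dim = 500, 10, 5, 2

Y = rng.standard_normal((T, obs_dim))              # high-dimensional observations
costs = (Y[:, :2] ** 2).sum(axis=1)                # per-step costs (placeholder)

# Multi-step cost targets: the abstract stresses predicting multi-step costs.
targets = np.stack([costs[t:t + horizon] for t in range(T - horizon)])
X = Y[:T - horizon]

W, *_ = np.linalg.lstsq(X, targets, rcond=None)    # (obs_dim, horizon) cost map
U, _, _ = np.linalg.svd(W, full_matrices=False)
encode = lambda y: U[:, :latent_dim].T @ y         # cost-aligned latent state
```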
5. In feedback control of dynamical systems, a higher loop gain is typically desirable to achieve faster closed-loop dynamics, a smaller tracking error, and more effective disturbance suppression. Yet an increased loop gain requires a higher control effort, which can extend beyond the actuation capacity of the feedback system and intermittently cause actuator saturation. To benefit from the advantages of a high feedback gain and simultaneously avoid actuator saturation, this paper advocates a dynamic gain adaptation technique in which the loop gain is lowered whenever necessary to prevent actuator saturation, and is raised again whenever possible. This concept is optimized for linear systems through an optimal control formulation inspired by the linear quadratic regulator (LQR). The quadratic cost functional adopted in LQR is modified into a certain quasi-quadratic form in which the control cost is dynamically emphasized or deemphasized as a function of the system state. The optimal control law resulting from this quasi-quadratic cost functional is essentially nonlinear, but its structure resembles an LQR whose gain is adapted by the state of the system so as to prevent actuator saturation. Moreover, under mild assumptions analogous to those of LQR, this optimal control law is stabilizing. As an illustrative example, the application of this optimal control law to feedback design for DC servomotors is examined, and its performance is verified by numerical simulations.
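A minimal sketch of the gain-adaptation concept, assuming a simple discrete-time plant and a fixed stabilizing LQR-like gain: the loop gain is scaled down whenever the commanded input would saturate the actuator, and allowed to recover toward its nominal value otherwise. The paper's quasi-quadratic optimal law is more refined than this heuristic.

```python
import numpy as np

# Illustrative plant and gain; all constants are assumptions, not the paper's.
A = np.array([[1.0, 0.05], [0.0, 1.0]])
B = np.array([[0.0], [0.05]])
K = np.array([[3.0, 2.0]])       # a stabilizing feedback gain (assumed)
u_max, alpha = 1.0, 1.02         # actuator limit and gain-recovery rate

x, g = np.array([2.0, 0.0]), 1.0
for _ in range(200):
    u = -g * (K @ x)
    if abs(u[0]) > u_max:        # lower the loop gain to stay inside the limit
        g = u_max / abs((K @ x)[0])
        u = -g * (K @ x)
    else:                        # raise the gain again, capped at its nominal value
        g = min(1.0, alpha * g)
    x = A @ x + B @ u
```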