In this paper, we consider direct policy optimization for the linear-quadratic Gaussian (LQG) setting. Over the past few years, it has been recognized that the landscape of stabilizing output-feedback controllers of relevance to LQG has an intricate geometry, particularly as it pertains to the existence of spurious stationary points. In order to address such challenges, in this paper, we first adopt a Riemannian metric for the space of stabilizing full-order minimal output-feedback controllers. We then proceed to prove that the orbit space of such controllers modulo coordinate transformation admits a Riemannian quotient manifold structure. This geometric structure is then used to develop a Riemannian gradient descent for the direct LQG policy optimization. We prove a local convergence guarantee with linear rate and show the proposed approach exhibits significantly faster and more robust numerical performance as compared with ordinary gradient descent for LQG. Subsequently, we provide reasons for this observed behavior; in particular, we argue that optimizing over the orbit space of controllers is the right theoretical and computational setup for direct LQG policy optimization.
more »
« less
On Controller Reduction in Linear Quadratic Gaussian Control with Performance Bounds
The problem of controller reduction has a rich history in control theory. Yet, many questions remain open. In particular, there exist very few results on the order reduction of general non-observer based controllers and the subsequent quantification of the closed-loop performance. Recent developments in model-free policy optimization for Linear Quadratic Gaussian (LQG) control have highlighted the importance of this question. In this paper, we first propose a new set of sufficient conditions ensuring that a perturbed controller remains internally stabilizing. Based on this result, we illustrate how to perform order reduction of general (non-observer based) output feedback controllers using balanced truncation and modal truncation. We also provide explicit bounds on the LQG performance of the reduced-order controller. Furthermore, for single-input-single-output (SISO) systems, we introduce a new controller reduction technique by truncating unstable modes. We illustrate our theoretical results with numerical simulations. Our results will serve as valuable tools to design direct policy search algorithms for control problems with partial observations.
more »
« less
- PAR ID:
- 10443278
- Date Published:
- Journal Name:
- Proceedings of Machine Learning Research
- Volume:
- 211
- ISSN:
- 2640-3498
- Page Range / eLocation ID:
- 1008-1019
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The work provides a general model of communication attacks on a networked infinite dimensional system. The system employs a network of inexpensive control units consisting of actuators, sensors and control processors. In an effort to replace a reduced number of expensive high-end actuating and sensing devices implementing an observer-based feedback, the alternate is to use multiple inexpensive actuators/sensors with static output feedback. In order to emulate the performance of the high-end devices, the controllers for the multiple actuator/sensors implement controllers which render the system networked. In doing so, they become prone to communication attacks either as accidental or deliberate actions on the connectivity of the control nodes. A single attack function is proposed which models all types of communication attacks and an adaptive detection scheme is proposed in order to (i) detect the presence of an attack, (ii) diagnose the attack and (iii) accommodate the attack via an appropriate control reconfiguration. The reconfiguration employs the adaptive estimates of the controller gains and restructure the controller adaptively in order to minimize the detrimental effects of the attack on closed-loop performance. Numerical studies on a 1D diffusion PDE employing networked actuator/sensor pairs are included in order to further convey the special architecture of detection and accommodation of networked systems under communication attacks.more » « less
-
This paper addresses the end-to-end sample complexity bound for learning the H2 optimal controller (the Linear Quadratic Gaussian (LQG) problem) with unknown dynamics, for potentially unstable Linear Time Invariant (LTI) systems. The robust LQG synthesis procedure is performed by considering bounded additive model uncertainty on the coprime factors of the plant. The closed-loopi dentification of the nominal model of the true plant is performed by constructing a Hankel-like matrix from a single time-series of noisy finite length input-output data, using the ordinary least squares algorithm from Sarkar and Rakhlin (2019). Next, an H∞ bound on the estimated model error is provided and the robust controller is designed via convex optimization, much in the spirit of Mania et al. (2019) and Zheng et al. (2020b), while allowing for bounded additive uncertainty on the coprime factors of the model. Our conclusions are consistent with previous results on learning the LQG and LQR controllers.more » « less
-
This paper revisits the design of compensator-based controller for a class of infinite dimensional systems. In order to save computational time, a functional observer is employed to reconstruct a functional of the state which coincides with the full state feedback control signal. Such a full-state feedback corresponds to an idealized case wherein the state is available. Instead of reconstructing the entire state via a state-observer and then use this state estimate in a controller expression, a functional observer is used to estimate the product of the state and the feedback operator, thus resulting in a significant reduction in computational load. This observer design is subsequently integrated with a sensor selection in order to improve controller performance. An appropriate metric is used to optimize the sensor location resulting in improved performance of the functional observer-based compensator. The integrated design is further extended to include a controller with an unknown input functional observer. The results are applied to 2D partial differential equations and detailed numerical studies are included to provide an appreciation in the significant savings in both operational and computational costs.more » « less
-
null (Ed.)The ubiquitous usage of communication networks in modern sensing and control applications has kindled new interests on the timing-based coordination between sensors and controllers, i.e., how to use the “waiting time” to improve the system performance. Contrary to the common belief that a zero-wait policy is always optimal, Sun et al. showed that a controller can strictly improve the data freshness, the so-called Age-of-Information (AoI), by postponing transmission in order to lengthen the duration of staying in a good state. The optimal waiting policy for the sensor side was later characterized in the context of remote estimation. Instead of focusing on the sensor and controller sides separately, this work develops the optimal joint sensor/controller waiting policy in a Wiener-process system. The results can be viewed as strict generalization of the above two important results in the sense that not only do we consider joint sensor/controller designs (as opposed to sensor-only or controller only schemes), but we also assume random delay in both the forward and feedback directions (as opposed to random delay in only one direction). In addition to provable optimality, extensive simulation is used to verify the performance of the proposed scheme in various settings.more » « less
An official website of the United States government

