Abstract—In this paper, we consider policy optimization over Riemannian submanifolds of stabilizing controllers that arise in constrained Linear Quadratic Regulator (LQR) problems, including output feedback and structured synthesis. To this end, we provide a Riemannian Newton-type algorithm that enjoys local convergence guarantees and exploits the inherent geometry of the problem. Instead of relying on the exponential mapping or a global retraction, the proposed algorithm hinges on the developed stability certificate and the constraint structure of the synthesis problem. We then showcase the utility of the proposed algorithm through numerical examples.
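To make the ingredients of such a scheme concrete, the sketch below illustrates a much-simplified, first-order version of structured LQR policy optimization: the cost and its Euclidean gradient are evaluated through Lyapunov equations, the gradient is projected onto a linear sparsity constraint, and steps are backtracked until closed-loop stability and cost decrease are preserved. All matrices, the sparsity pattern, and the step rule are illustrative assumptions; the paper's Riemannian Newton-type method and stability certificate are not reproduced here.

```python
# A much-simplified, first-order illustration (not the paper's Riemannian Newton
# method): the LQR cost and its Euclidean gradient are obtained from Lyapunov
# equations, the gradient is projected onto a sparsity constraint, and steps are
# backtracked until stability and cost decrease are preserved.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def lqr_cost(K, A, B, Q, R, Sigma):
    """J(K) = tr(P_K Sigma), with P_K the value matrix of the closed loop A - B K."""
    Acl = A - B @ K
    P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    return np.trace(P @ Sigma), P

def lqr_grad(K, A, B, R, P, Sigma):
    """Euclidean gradient 2 (R K - B^T P_K) X_K, with X_K the closed-loop covariance."""
    Acl = A - B @ K
    X = solve_continuous_lyapunov(Acl, -Sigma)
    return 2.0 * (R @ K - B.T @ P) @ X

def is_stabilizing(K, A, B):
    return np.max(np.linalg.eigvals(A - B @ K).real) < 0.0

# Illustrative data (assumptions): A is shifted to be Hurwitz so K = 0 is stabilizing,
# and S is a random sparsity pattern encoding the structural constraint on K.
rng = np.random.default_rng(0)
n, m = 4, 2
A = rng.standard_normal((n, n)) - 3.0 * np.eye(n)
B = rng.standard_normal((n, m))
Q, R, Sigma = np.eye(n), np.eye(m), np.eye(n)
S = rng.random((m, n)) < 0.6                         # allowed (nonzero) entries of K
K = np.zeros((m, n))
assert is_stabilizing(K, A, B)

for _ in range(200):
    J, P = lqr_cost(K, A, B, Q, R, Sigma)
    G = lqr_grad(K, A, B, R, P, Sigma) * S           # keep only the allowed entries
    t = 1.0
    while (not is_stabilizing(K - t * G, A, B)
           or lqr_cost(K - t * G, A, B, Q, R, Sigma)[0] > J + 1e-12):
        t *= 0.5                                     # backtrack: stability + cost decrease
    K = K - t * G
print("structured gain K:\n", np.round(K, 3))
print("structured LQR cost:", round(float(lqr_cost(K, A, B, Q, R, Sigma)[0]), 4))
```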
Output-feedback Synthesis Orbit Geometry: Quotient Manifolds and LQG Direct Policy Optimization
In this paper, we consider direct policy optimization for the linear-quadratic Gaussian (LQG) setting. Over the past few years, it has been recognized that the landscape of stabilizing output-feedback controllers of relevance to LQG has an intricate geometry, particularly as it pertains to the existence of spurious stationary points. In order to address such challenges, we first adopt a Riemannian metric for the space of stabilizing full-order minimal output-feedback controllers. We then proceed to prove that the orbit space of such controllers modulo coordinate transformations admits a Riemannian quotient manifold structure. This geometric structure is then used to develop a Riemannian gradient descent algorithm for direct LQG policy optimization. We prove a local convergence guarantee with a linear rate and show that the proposed approach exhibits significantly faster and more robust numerical performance than ordinary gradient descent for LQG. Subsequently, we provide reasons for this observed behavior; in particular, we argue that optimizing over the orbit space of controllers is the right theoretical and computational setup for direct LQG policy optimization.
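The coordinate-invariance that motivates the quotient (orbit space) viewpoint can be checked numerically: the LQG cost of a full-order output-feedback controller (A_K, B_K, C_K) is unchanged under the similarity transformation (T A_K T^{-1}, T B_K, C_K T^{-1}). The sketch below uses illustrative random plant data and the classical observer-based LQG controller as the stabilizing policy; it is not the paper's Riemannian gradient descent algorithm.

```python
# Numerical illustration (not the paper's algorithm): the LQG cost of a full-order
# output-feedback controller (A_K, B_K, C_K) is invariant under the coordinate change
# (T A_K T^{-1}, T B_K, C_K T^{-1}); policies therefore live naturally on the orbit space.
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(1)
n, m, p = 4, 2, 2                                  # illustrative dimensions
A = rng.standard_normal((n, n)); B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
Q, R = np.eye(n), np.eye(m)                        # state / input weights
W, V = np.eye(n), np.eye(p)                        # process / measurement noise covariances

def lqg_cost(Ak, Bk, Ck):
    """J = tr(P * closed-loop noise covariance); requires the closed loop to be stable."""
    Acl = np.block([[A, B @ Ck], [Bk @ C, Ak]])
    Qcl = np.block([[Q, np.zeros((n, n))], [np.zeros((n, n)), Ck.T @ R @ Ck]])
    Wcl = np.block([[W, np.zeros((n, n))], [np.zeros((n, n)), Bk @ V @ Bk.T]])
    assert np.max(np.linalg.eigvals(Acl).real) < 0, "controller is not stabilizing"
    P = solve_continuous_lyapunov(Acl.T, -Qcl)
    return np.trace(P @ Wcl)

# A stabilizing full-order controller: the classical observer-based LQG design.
K = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))
L = solve_continuous_are(A.T, C.T, W, V) @ C.T @ np.linalg.inv(V)
Ak, Bk, Ck = A - B @ K - L @ C, L, -K

T = rng.standard_normal((n, n))                    # random change of controller coordinates
J1 = lqg_cost(Ak, Bk, Ck)
J2 = lqg_cost(T @ Ak @ np.linalg.inv(T), T @ Bk, Ck @ np.linalg.inv(T))
print(f"J = {J1:.6f}, J after coordinate change = {J2:.6f}")   # identical up to round-off
```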
- Award ID(s):
- 2149470
- PAR ID:
- 10516485
- Publisher / Repository:
- arXiv
- Date Published:
- Journal Name:
- arXiv
- ISSN:
- 2331-8422
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
The problem of controller reduction has a rich history in control theory. Yet, many questions remain open. In particular, there exist very few results on the order reduction of general non-observer-based controllers and the subsequent quantification of the closed-loop performance. Recent developments in model-free policy optimization for Linear Quadratic Gaussian (LQG) control have highlighted the importance of this question. In this paper, we first propose a new set of sufficient conditions ensuring that a perturbed controller remains internally stabilizing. Based on this result, we illustrate how to perform order reduction of general (non-observer-based) output-feedback controllers using balanced truncation and modal truncation. We also provide explicit bounds on the LQG performance of the reduced-order controller. Furthermore, for single-input single-output (SISO) systems, we introduce a new controller reduction technique by truncating unstable modes. We illustrate our theoretical results with numerical simulations. Our results will serve as valuable tools to design direct policy search algorithms for control problems with partial observations.
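For reference, a generic balanced-truncation sketch for a stable controller realization is given below: the Gramians are computed from Lyapunov equations and the realization is truncated in balanced coordinates. The paper's sufficient conditions for preserved internal stability and its LQG performance bounds are not reproduced, and all data are illustrative.

```python
# Generic balanced-truncation sketch for a stable controller realization (Ak, Bk, Ck).
# It does not reproduce the paper's stability conditions or LQG performance bounds.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(Ak, Bk, Ck, r):
    """Truncate a stable realization to order r using its balanced coordinates."""
    Wc = solve_continuous_lyapunov(Ak, -Bk @ Bk.T)      # controllability Gramian
    Wo = solve_continuous_lyapunov(Ak.T, -Ck.T @ Ck)    # observability Gramian
    Lc = cholesky(Wc, lower=True)
    Lo = cholesky(Wo, lower=True)
    U, s, Vt = svd(Lo.T @ Lc)                           # s: Hankel singular values
    T = Lc @ Vt.T @ np.diag(s ** -0.5)                  # balancing transformation
    Tinv = np.diag(s ** -0.5) @ U.T @ Lo.T
    Ab, Bb, Cb = Tinv @ Ak @ T, Tinv @ Bk, Ck @ T
    return Ab[:r, :r], Bb[:r, :], Cb[:, :r], s

# Illustrative stable "controller" of order 6, truncated to order 3.
rng = np.random.default_rng(2)
Ak = rng.standard_normal((6, 6)) - 4.0 * np.eye(6)      # shifted to be (very likely) Hurwitz
Bk, Ck = rng.standard_normal((6, 2)), rng.standard_normal((1, 6))
Ar, Br, Cr, hsv = balanced_truncation(Ak, Bk, Ck, r=3)
print("Hankel singular values:", np.round(hsv, 4))
```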
-
Gradient-based methods have been widely used for system design and optimization in diverse application domains. Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and reinforcement learning. This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis that has been popularized by successes of reinforcement learning. We take an interdisciplinary perspective in our exposition that connects control theory, reinforcement learning, and large-scale optimization. We review a number of recently developed theoretical results on the optimization landscape, global convergence, and sample complexity of gradient-based methods for various continuous control problems, such as the linear quadratic regulator (LQR), $\mathcal{H}_\infty$ control, risk-sensitive control, linear quadratic Gaussian (LQG) control, and output feedback synthesis. In conjunction with these optimization results, we also discuss how direct policy optimization handles stability and robustness concerns in learning-based control, two main desiderata in control engineering. We conclude the survey by pointing out several challenges and opportunities at the intersection of learning and control.
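As one of the simplest concrete instances discussed in such surveys, the sketch below runs model-based policy gradient descent on a discrete-time LQR problem, using the standard closed-form gradient of the LQR cost with respect to a static state-feedback gain. The system matrices and the fixed step size are illustrative assumptions.

```python
# Simplest concrete instance: discrete-time LQR with a static gain K, updated with
# the exact (model-based) policy gradient. All data are illustrative.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(3)
n, m = 3, 1
A = 0.3 * rng.standard_normal((n, n))              # open loop (very likely) Schur stable
B = rng.standard_normal((n, m))
Q, R, Sigma0 = np.eye(n), np.eye(m), np.eye(n)     # weights and initial-state covariance

def cost_and_grad(K):
    Acl = A - B @ K
    assert np.max(np.abs(np.linalg.eigvals(Acl))) < 1, "K is not stabilizing"
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)   # value matrix P_K
    X = solve_discrete_lyapunov(Acl, Sigma0)              # closed-loop state correlation
    grad = 2 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ X  # exact policy gradient
    return np.trace(P @ Sigma0), grad

K = np.zeros((m, n))                               # stabilizing since A is Schur stable
for _ in range(500):
    _, G = cost_and_grad(K)
    K = K - 0.01 * G                               # fixed small step; line search omitted
print("policy-gradient LQR cost:", round(float(cost_and_grad(K)[0]), 4))
```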
-
In this article, we study linearly constrained policy optimization over the manifold of Schur stabilizing controllers, equipped with a Riemannian metric that emerges naturally in the context of optimal control problems. We provide an extrinsic analysis of a generic constrained smooth cost function that subsequently facilitates subsuming any such constrained problem into this framework. By studying the second-order geometry of this manifold, we provide a Newton-type algorithm that relies on neither the exponential mapping nor a retraction, while ensuring local convergence guarantees. The algorithm hinges instead upon the developed stability certificate and the linear structure of the constraints. We then apply our methodology to two well-known constrained optimal control problems. Finally, several numerical examples showcase the performance of the proposed algorithm.
-
In this paper, we study the generalized subdifferentials and the Riemannian gradient subconsistency that are the basis for non-Lipschitz optimization on embedded submanifolds of $\mathbb{R}^n$. We then propose a Riemannian smoothing steepest descent method for non-Lipschitz optimization on complete embedded submanifolds of $\mathbb{R}^n$. We prove that any accumulation point of the sequence generated by the Riemannian smoothing steepest descent method is a stationary point associated with the smoothing function employed in the method, which is necessary for the local optimality of the original non-Lipschitz problem. We also prove that any accumulation point of the sequence generated by our method that satisfies the Riemannian gradient subconsistency is a limiting stationary point of the original non-Lipschitz problem. Numerical experiments are conducted to demonstrate the advantages of Riemannian $\ell_p$ $(0<p<1)$ optimization over Riemannian $\ell_1$ optimization for finding sparse solutions and the effectiveness of the proposed method. Funding: C. Zhang was supported in part by the National Natural Science Foundation of China [Grant 12171027] and the Natural Science Foundation of Beijing [Grant 1202021]. X. Chen was supported in part by the Hong Kong Research Council [Grant PolyU15300219]. S. Ma was supported in part by the National Science Foundation [Grants DMS-2243650 and CCF-2308597], the UC Davis Center for Data Science and Artificial Intelligence Research Innovative Data Science Seed Funding Program, and a startup fund from Rice University.
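A toy illustration of the smoothing idea on the simplest embedded submanifold, the unit sphere in $\mathbb{R}^n$, is sketched below: a non-Lipschitz $\ell_{1/2}$-type sparsity term is replaced by a smooth surrogate with parameter $\mu$, the Riemannian gradient is the tangent-space projection of the Euclidean gradient, steps are followed by renormalization (a retraction), and $\mu$ is gradually reduced. This is only a generic sketch under these assumptions, not the paper's exact method or analysis.

```python
# Toy sketch of Riemannian smoothing steepest descent on the unit sphere in R^n:
# a non-Lipschitz l_{1/2}-type sparsity term is smoothed with parameter mu, the
# Euclidean gradient is projected to the tangent space, and mu is gradually reduced.
# This is a generic illustration, not the paper's exact algorithm.
import numpy as np

rng = np.random.default_rng(4)
n, lam = 20, 0.05
A = rng.standard_normal((10, n)); b = rng.standard_normal(10)   # illustrative data-fit term

def f_smooth(x, mu):
    """0.5*||Ax - b||^2 + lam * sum_i (x_i^2 + mu^2)^{1/4}, a smooth surrogate."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum((x ** 2 + mu ** 2) ** 0.25)

def grad_smooth(x, mu):
    return A.T @ (A @ x - b) + lam * 0.5 * x * (x ** 2 + mu ** 2) ** -0.75

x = rng.standard_normal(n); x /= np.linalg.norm(x)   # initial point on the sphere
mu = 1.0
step = 0.5 / (np.linalg.norm(A, 2) ** 2 + 1.0)       # crude 1/L-type step size
for k in range(2000):
    egrad = grad_smooth(x, mu)
    rgrad = egrad - (x @ egrad) * x                  # Riemannian gradient on the sphere
    x = x - step * rgrad
    x /= np.linalg.norm(x)                           # retraction: renormalize
    if k % 200 == 199:
        mu *= 0.5                                    # shrink the smoothing parameter
print("final smoothed objective:", round(float(f_smooth(x, mu)), 4))
```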