Search for: All records

Award ID contains: 2149470


  1. In this article, we study linearly constrained policy optimization over the manifold of Schur stabilizing controllers, equipped with a Riemannian metric that emerges naturally in the context of optimal control problems. We provide an extrinsic analysis of a generic constrained smooth cost function that subsequently facilitates subsuming any such constrained problem into this framework. By studying the second-order geometry of this manifold, we provide a Newton-type algorithm that relies on neither the exponential mapping nor a retraction, while ensuring local convergence guarantees. The algorithm hinges instead upon the developed stability certificate and the linear structure of the constraints. We then apply our methodology to two well-known constrained optimal control problems. Finally, several numerical examples showcase the performance of the proposed algorithm. 
  2. In this paper, we consider direct policy optimization for the linear-quadratic Gaussian (LQG) setting. Over the past few years, it has been recognized that the landscape of stabilizing output-feedback controllers of relevance to LQG has an intricate geometry, particularly as it pertains to the existence of spurious stationary points. In order to address such challenges, in this paper, we first adopt a Riemannian metric for the space of stabilizing full-order minimal output-feedback controllers. We then proceed to prove that the orbit space of such controllers modulo coordinate transformation admits a Riemannian quotient manifold structure. This geometric structure is then used to develop a Riemannian gradient descent for the direct LQG policy optimization. We prove a local convergence guarantee with linear rate and show the proposed approach exhibits significantly faster and more robust numerical performance as compared with ordinary gradient descent for LQG. Subsequently, we provide reasons for this observed behavior; in particular, we argue that optimizing over the orbit space of controllers is the right theoretical and computational setup for direct LQG policy optimization. 
  3. Control of networked systems, composed of interacting agents, is often achieved through modeling the underlying interactions. Constructing accurate models of such interactions can, however, become prohibitive in applications. Data-driven control methods avoid such complications by directly synthesizing a controller from the observed data. In this paper, we propose an algorithm referred to as Data-driven Structured Policy Iteration (D2SPI), for synthesizing an efficient feedback mechanism that respects the sparsity pattern induced by the underlying interaction network. In particular, our algorithm uses temporary “auxiliary” communication links in order to enable the required information exchange on a (smaller) sub-network during the “learning phase”; these links are removed subsequently for the final distributed feedback synthesis. We then proceed to show that the learned policy results in a stabilizing structured policy for the entire network. Our analysis is then followed by showing the stability and convergence of the proposed distributed policies throughout the learning phase, exploiting a construct referred to as the “Patterned monoid.” The performance of D2SPI is then demonstrated using representative simulation scenarios. 
  4. In this paper, we develop a distributed consensus algorithm for agents whose states evolve on a manifold. This algorithm is complementary to traditional consensus, predominantly developed for systems with dynamics on vector spaces. We provide theoretical convergence guarantees for the proposed manifold consensus provided that agents are initialized within a geodesically convex (g-convex) set. This required condition on initialization is not restrictive as g-convex sets may be comparatively “large” for relevant Riemannian manifolds. Our approach to manifold consensus builds upon the notion of Riemannian Center of Mass (RCM) and the intrinsic structure of the manifold to avoid projections in the ambient space. We first show that on a g-convex ball, all states coincide if and only if each agent’s state is the RCM of its neighbors’ states. This observation facilitates our convergence guarantee to the consensus submanifold. Finally, we provide simulation results that exemplify the linear convergence rate of the proposed algorithm and illustrate its statistical properties over randomly generated problem instances. 
  5. Gradient-based methods have been widely used for system design and optimization in diverse application domains. Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and reinforcement learning. This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis that has been popularized by successes of reinforcement learning. We take an interdisciplinary perspective in our exposition that connects control theory, reinforcement learning, and large-scale optimization. We review a number of recently developed theoretical results on the optimization landscape, global convergence, and sample complexity of gradient-based methods for various continuous control problems, such as the linear quadratic regulator (LQR), [Formula: see text] control, risk-sensitive control, linear quadratic Gaussian (LQG) control, and output feedback synthesis. In conjunction with these optimization results, we also discuss how direct policy optimization handles stability and robustness concerns in learning-based control, two main desiderata in control engineering. We conclude the survey by pointing out several challenges and opportunities at the intersection of learning and control. 
  6. Duality of control and estimation allows mapping recent advances in data-guided control to the estimation setup. This paper formalizes and utilizes such a mapping to consider learning the optimal (steady-state) Kalman gain when process and measurement noise statistics are unknown. Specifically, building on the duality between synthesizing optimal control and estimation gains, the filter design problem is formalized as direct policy learning. In this direction, the duality is used to extend existing theoretical guarantees of direct policy updates for Linear Quadratic Regulator (LQR) to establish global convergence of the Gradient Descent (GD) algorithm for the estimation problem, while addressing subtle differences between the two synthesis problems. Subsequently, a Stochastic Gradient Descent (SGD) approach is adopted to learn the optimal Kalman gain without the knowledge of noise covariances. The results are illustrated via several numerical examples. 
  7. In this paper, we consider policy optimization over the Riemannian submanifolds of stabilizing controllers arising from constrained Linear Quadratic Regulators (LQR), including output feedback and structured synthesis. In this direction, we provide a Riemannian Newton-type algorithm that enjoys local convergence guarantees and exploits the inherent geometry of the problem. Instead of relying on the exponential mapping or a global retraction, the proposed algorithm revolves around the developed stability certificate and the constraint structure, utilizing the intrinsic geometry of the synthesis problem. We then showcase the utility of the proposed algorithm through numerical examples. 
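
Several of the abstracts above (items 5 and 6 in particular) refer to direct policy optimization for LQR, that is, gradient descent on the feedback gain itself, restricted to stabilizing gains. The following is only a minimal illustrative sketch of that idea for a scalar discrete-time system x_{t+1} = a x_t + b u_t with u_t = -k x_t. It is not taken from any of the listed papers. The function names, the step size, the step-halving safeguard, and the numerical values are all assumptions chosen for illustration. For a stabilizing gain (|a - b k| < 1) and x_0 = 1, the infinite-horizon cost is J(k) = (q + r k^2) / (1 - (a - b k)^2), and the result can be compared with the gain obtained from the scalar Riccati recursion.

```python
def lqr_cost(k, a, b, q, r):
    """Infinite-horizon cost of u = -k*x from x0 = 1; valid only if |a - b*k| < 1."""
    c = a - b * k                      # closed-loop pole
    return (q + r * k * k) / (1.0 - c * c)


def lqr_grad(k, a, b, q, r):
    """Derivative dJ/dk of lqr_cost (quotient rule)."""
    c = a - b * k
    d = 1.0 - c * c
    return (2.0 * r * k * d - (q + r * k * k) * 2.0 * c * b) / (d * d)


def is_stabilizing(k, a, b):
    """Schur stability of the scalar closed loop."""
    return abs(a - b * k) < 1.0


def policy_gradient_descent(k0, a, b, q, r, step=0.05, iters=500, tol=1e-10):
    """Gradient descent on the gain k, halving the step if it would destabilize.

    Hypothetical helper for illustration; step, iters, and tol are assumptions.
    """
    if not is_stabilizing(k0, a, b):
        raise ValueError("initial gain must be stabilizing")
    k = k0
    for _ in range(iters):
        g = lqr_grad(k, a, b, q, r)
        if abs(g) < tol:
            break
        eta = step
        k_new = k - eta * g
        for _ in range(60):            # bounded step-halving safeguard
            if is_stabilizing(k_new, a, b):
                break
            eta *= 0.5
            k_new = k - eta * g
        else:
            break                      # no stabilizing step found; stop
        k = k_new
    return k


def riccati_gain(a, b, q, r, iters=1000):
    """Optimal gain from the scalar discrete-time Riccati recursion."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


# Illustrative values (assumed): an open-loop unstable plant with unit weights.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
k_gd = policy_gradient_descent(1.0, a, b, q, r)
k_star = riccati_gain(a, b, q, r)
print(k_gd, k_star)
```

In this scalar case the set of stabilizing gains is an interval, so the stability check is trivial. The matrix and constrained settings described in the abstracts involve a considerably richer geometry that this sketch does not attempt to capture.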