Title: Optimal Transport in Systems and Control
Optimal transport began as the problem of how to efficiently redistribute goods between production and consumers and evolved into a far-reaching geometric variational framework for studying flows of distributions on metric spaces. This theory enables a class of stochastic control problems that regulate dynamical systems so as to keep uncertainty within specified limits. Representative control examples include the landing of a spacecraft aimed probabilistically toward a target and the suppression of undesirable effects of thermal noise on resonators; in both of these examples, the goal is to regulate the flow of the distribution of the random state. A most unlikely link turned up between transport of probability distributions and a maximum entropy inference problem posed by Erwin Schrödinger, where the latter is seen as an entropy-regularized version of the former. These intertwined topics of optimal transport, stochastic control, and inference are the subject of this review, which aims to highlight connections, insights, and computational tools while touching on quadratic regulator theory and probabilistic flows in discrete spaces and networks. Expected final online publication date for the Annual Review of Control, Robotics, and Autonomous Systems, Volume 4 is May 2021. Please see for revised estimates.
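The entropy-regularized view of transport mentioned in the abstract can be illustrated on a small discrete problem. The following is a minimal sketch, not the review's own method: it uses standard Sinkhorn scaling iterations, and the function name `sinkhorn`, the grid, the histograms, and the regularization value `eps` are all illustrative assumptions.

```python
import numpy as np

def sinkhorn(mu, nu, cost, eps=0.5, n_iters=500):
    """Entropy-regularized transport plan between histograms mu and nu.

    Illustrative sketch: alternately rescales the rows and columns of the
    Gibbs kernel exp(-cost / eps) until both marginals are matched.
    """
    K = np.exp(-cost / eps)           # Gibbs kernel of the cost matrix
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iters):
        u = mu / (K @ v)              # enforce the row marginal mu
        v = nu / (K.T @ u)            # enforce the column marginal nu
    return u[:, None] * K * v[None, :]

# Hypothetical example: two histograms on five grid points in [0, 1]
# with squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
cost = (x[:, None] - x[None, :]) ** 2
mu = np.array([0.40, 0.30, 0.15, 0.10, 0.05])
nu = mu[::-1].copy()

plan = sinkhorn(mu, nu, cost, eps=0.5)
transport_cost = float(np.sum(plan * cost))
```

Smaller values of `eps` bring the plan closer to the unregularized optimal transport plan, at the price of slower convergence and a kernel with very small entries.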
Award ID(s):
1942523 1901599 1807664 1839441
Journal Name:
Annual Review of Control, Robotics, and Autonomous Systems
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider damped stochastic systems in a controlled (time-varying) potential and study their transition between specified Gibbs-equilibria states in finite time. By the second law of thermodynamics, the minimum amount of work needed to transition from one equilibrium state to another is the difference between the Helmholtz free energy of the two states and can only be achieved by a reversible (infinitely slow) process. The minimal gap between the work needed in a finite-time transition and the work during a reversible one turns out to equal the square of the optimal mass transport (Wasserstein-2) distance between the two end-point distributions times the inverse of the duration needed for the transition. This result, in fact, relates non-equilibrium optimal control strategies (protocols) to gradient flows of entropy functionals via the Jordan-Kinderlehrer-Otto scheme. The purpose of this paper is to introduce ideas and results from the emerging field of stochastic thermodynamics in the setting of classical regulator theory, and to draw connections and derive such fundamental relations from a control perspective in a multivariable setting.
  2. Our goal is to learn control policies for robots that provably generalize well to novel environments given a dataset of example environments. The key technical idea behind our approach is to leverage tools from generalization theory in machine learning by exploiting a precise analogy (which we present in the form of a reduction) between generalization of control policies to novel environments and generalization of hypotheses in the supervised learning setting. In particular, we utilize the probably approximately correct (PAC)-Bayes framework, which allows us to obtain upper bounds that hold with high probability on the expected cost of (stochastic) control policies across novel environments. We propose policy learning algorithms that explicitly seek to minimize this upper bound. The corresponding optimization problem can be solved using convex optimization (relative entropy programming in particular) in the setting where we are optimizing over a finite policy space. In the more general setting of continuously parameterized policies (e.g., neural network policies), we minimize this upper bound using stochastic gradient descent. We present simulated results of our approach applied to learning (1) reactive obstacle avoidance policies and (2) neural network-based grasping policies. We also present hardware results for the Parrot Swing drone navigating through different obstacle environments. Our examples demonstrate the potential of our approach to provide strong generalization guarantees for robotic systems with continuous state and action spaces, complicated (e.g., nonlinear) dynamics, rich sensory inputs (e.g., depth images), and neural network-based policies.
  3. We consider stochastic systems of interacting particles or agents, with dynamics determined by an interaction kernel, which only depends on pairwise distances. We study the problem of inferring this interaction kernel from observations of the positions of the particles, in either continuous or discrete time, along multiple independent trajectories. We introduce a nonparametric inference approach to this inverse problem, based on a regularized maximum likelihood estimator constrained to suitable hypothesis spaces adaptive to data. We show that a coercivity condition enables us to control the condition number of this problem and prove the consistency of our estimator, and that in fact it converges at a near-optimal learning rate, equal to the min–max rate of one-dimensional nonparametric regression. In particular, this rate is independent of the dimension of the state space, which is typically very high. We also analyze the discretization errors in the case of discrete-time observations, showing that they are of order 1/2 in terms of the time spacings between observations. This term, when large, dominates the sampling error and the approximation error, preventing convergence of the estimator. Finally, we exhibit an efficient parallel algorithm to construct the estimator from data, and we demonstrate the effectiveness of our algorithm with numerical tests on prototype systems including stochastic opinion dynamics and a Lennard-Jones model.
  4. The Gromov-Wasserstein (GW) formalism can be seen as a generalization of the optimal transport (OT) formalism for comparing two distributions associated with different metric spaces. It is a quadratic optimization problem, and solving it usually has computational costs that can rise sharply if the problem size exceeds a few hundred points. Recently, fast techniques based on entropy regularization have been developed to solve an approximation of the GW problem quickly. There are issues, however, with the numerical convergence of those regularized approximations to the true GW solution. To circumvent those issues, we introduce a novel strategy to solve the discrete GW problem using methods taken from statistical physics. We build a temperature-dependent free energy function that reflects the GW problem’s constraints. To account for possible differences of scales between the two metric spaces, we introduce a scaling factor s in the definition of the energy. From the extremum of the free energy, we derive a mapping between the two probability measures that are being compared, as well as a distance between those measures. This distance is equal to the GW distance when the temperature goes to zero. The optimal scaling factor itself is obtained by minimizing the free energy with respect to s. We illustrate our approach on the problem of comparing shapes defined by unstructured triangulations of their surfaces. We use several synthetic and “real life” datasets. We demonstrate the accuracy and automaticity of our approach in non-rigid registration of shapes. We provide numerical evidence that there is a strong correlation between the GW distances computed from low-resolution, surface-based representations of proteins and the analogous distances computed from atomistic models of the same proteins.
  5. An adaptive, adversarial methodology is developed for the optimal transport problem between two distributions $\mu$ and $\nu$, known only through a finite set of independent samples $(x_i)_{i=1..n}$ and $(y_j)_{j=1..m}$. The methodology automatically creates features that adapt to the data, thus avoiding reliance on a priori knowledge of the distributions underlying the data. Specifically, instead of a discrete point-by-point assignment, the new procedure seeks an optimal map $T(x)$ defined for all $x$, minimizing the Kullback–Leibler divergence between $(T(x_i))$ and the target $(y_j)$. The relative entropy is given a sample-based, variational characterization, thereby creating an adversarial setting: as one player seeks to push forward one distribution to the other, the second player develops features that focus on those areas where the two distributions fail to match. The procedure solves local problems that seek the optimal transfer between consecutive, intermediate distributions between $\mu$ and $\nu$. As a result, maps of arbitrary complexity can be built by composing the simple maps used for each local problem. Displacement interpolation is used to guarantee global from local optimality. The procedure is illustrated through synthetic examples in one and two dimensions.
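The finite-time work relation stated in item 1 (the gap over reversible work equals the squared Wasserstein-2 distance divided by the transition duration) can be made concrete for one-dimensional Gaussian end-point distributions. This is a rough sketch under stated assumptions: units are normalized so that physical constants such as the friction coefficient equal one, the end-point distributions are Gaussian, and the function names and numerical values are hypothetical. The closed form for the squared Wasserstein-2 distance between two 1D Gaussians is a standard result.

```python
def w2_squared_gaussian_1d(m0, s0, m1, s1):
    """Squared Wasserstein-2 distance between N(m0, s0^2) and N(m1, s1^2).

    Standard closed form in one dimension:
    (difference of means)^2 + (difference of standard deviations)^2.
    """
    return (m0 - m1) ** 2 + (s0 - s1) ** 2

def excess_work(m0, s0, m1, s1, duration):
    """Gap between finite-time and reversible work, W2^2 / duration.

    Assumption: normalized units (friction coefficient taken as one).
    """
    return w2_squared_gaussian_1d(m0, s0, m1, s1) / duration

# Hypothetical transition: N(0, 1) -> N(2, 0.25), fast versus slow.
gap_fast = excess_work(0.0, 1.0, 2.0, 0.5, duration=1.0)
gap_slow = excess_work(0.0, 1.0, 2.0, 0.5, duration=10.0)
```

In this sketch the gap shrinks in inverse proportion to the duration, consistent with the statement that the reversible limit is reached only by an infinitely slow process.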