Title: Lyapunov Neural Network with Region of Attraction Search
Deep learning methods have been widely used in robotic applications, making learning-enabled control design for complex nonlinear systems a promising direction. Although deep reinforcement learning methods have demonstrated impressive empirical performance, they lack the stability guarantees that are important in safety-critical situations. One way to provide these guarantees is to learn Lyapunov certificates alongside control policies. There are three related problems: 1) verify that a given Lyapunov function candidate satisfies the Lyapunov conditions for a given controller on a region, 2) find a valid Lyapunov function and controller on a given region, and 3) find a valid Lyapunov function and a controller such that the region of attraction (ROA) is as large as possible. Previous work has shown that if the dynamics are piecewise linear, problems 1) and 2) can be solved via a Mixed-Integer Linear Program (MILP). In this work, we build upon this method by proposing a Lyapunov neural network that accounts for monotonicity over half-spaces in different directions. We 1) propose a specific Lyapunov function architecture that ensures non-negativity and a unique global minimum by construction, and 2) show that this can be leveraged to find the controller and Lyapunov certificate faster and with a larger valid region by maximizing the size of a square inscribed in a given level set. We apply our method to a 2D inverted pendulum, unicycle path following, a 3D feedback system, and a 4D cart-pole system, and demonstrate that it can cut the training time in half compared to the baseline while finding a larger ROA.
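The abstract does not include code, but the "non-negative with a unique global minimum by construction" property it mentions can be illustrated concretely. Below is a minimal PyTorch sketch, assuming an absolute-value plus weighted l1-norm construction (a common choice in prior MILP-verification work); the layer sizes, the leaky-ReLU slope, and the weight lam are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LyapunovCandidate(nn.Module):
    """Illustrative piecewise-linear Lyapunov candidate with V(x) >= 0
    and V(0) = 0 by construction: |phi(x) - phi(0)| vanishes at the
    origin, and the lam * ||x||_1 term strictly penalizes any other
    point, making the origin the unique global minimum."""

    def __init__(self, dim: int, hidden: int = 16, lam: float = 0.1):
        super().__init__()
        # Leaky ReLUs keep the network piecewise linear, which is what
        # allows the Lyapunov conditions to be checked exactly by an MILP.
        self.phi = nn.Sequential(
            nn.Linear(dim, hidden), nn.LeakyReLU(0.1),
            nn.Linear(hidden, hidden), nn.LeakyReLU(0.1),
            nn.Linear(hidden, 1),
        )
        self.lam = lam

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        zero = torch.zeros_like(x)
        gap = (self.phi(x) - self.phi(zero)).abs().squeeze(-1)
        return gap + self.lam * x.abs().sum(dim=-1)
```

For example, V = LyapunovCandidate(dim=2); V(torch.randn(8, 2)) returns a batch of values that are non-negative everywhere and zero only at the origin, regardless of how the network weights are trained.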
Award ID(s):
2409733
PAR ID:
10628063
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
Journal Name:
Proceedings of the American Control Conference
ISSN:
2378-5861
ISBN:
979-8-3503-8265-5
Page Range / eLocation ID:
3403 to 3410
Format(s):
Medium: X
Location:
Toronto, ON, Canada
Sponsoring Org:
National Science Foundation
More Like this
  1. We present a framework that uses control Lyapunov functions (CLFs) to implement provably stable path-following controllers for autonomous mobile platforms. Our approach learns a guaranteed CLF for path following by using recent approaches that combine machine learning with automated theorem proving to train a neural network feedback law along with a CLF that guarantees stabilization for driving along low-curvature reference paths. We discuss how key properties of the CLF can be exploited to extend the range of curvatures for which the stability guarantees remain valid. We then demonstrate that our approach yields a controller that obeys its theoretical guarantees in simulation and also performs well in practice. We show that our method is both a verified method of control and faster than a common MPC implementation in computation time. Additionally, we implement the controller on board a 1/8-scale autonomous vehicle testing platform and present results for various robust path-following scenarios. (A minimal sketch of the CLF decrease condition appears after this list.)
  2. This paper presents a deep neural network (DNN)- and concurrent learning (CL)-based adaptive control architecture for an Euler-Lagrange dynamic system that guarantees system performance for the first time. The developed controller includes two DNNs with the same output-layer weights to ensure feasibility of the control system. In this work, a Lyapunov- and CL-based update law is developed to update the output-layer DNN weights in real time, whereas the inner-layer DNN weights are updated offline using data collected in real time. A Lyapunov-like analysis proves that the proposed controller yields semi-global exponential convergence to an ultimate bound for the output-layer weight estimation errors and for the trajectory tracking errors. (A hedged sketch of a CL-style weight update appears after this list.)
  3. We present a method for contraction-based feedback motion planning of locally incrementally exponentially stabilizable systems with unknown dynamics that provides probabilistic safety and reachability guarantees. Given a dynamics dataset, our method learns a deep control-affine approximation of the dynamics. To find a trusted domain where this model can be used for planning, we obtain an estimate of the Lipschitz constant of the model error, valid with a given probability, in a region around the training data, providing a local, spatially varying model error bound. We derive a trajectory tracking error bound for a contraction-based controller subjected to this model error, and then learn a controller that optimizes this tracking bound. With a given probability, we verify the correctness of the controller and tracking error bound in the trusted domain. We then use the trajectory error bound together with the trusted domain to guide a sampling-based planner to return trajectories that can be robustly tracked in execution. We show results on a 4D car, a 6D quadrotor, and a 22D deformable object manipulation task, showing that our method plans safely with learned models of high-dimensional underactuated systems, while baselines that plan without considering the tracking error bound or the trusted domain can fail to stabilize the system and become unsafe. (A simple Lipschitz-estimation sketch appears after this list.)
  4. This paper presents a counterexample-guided iterative algorithm to compute convex, piecewise linear (polyhedral) Lyapunov functions for continuous-time piecewise linear systems. Polyhedral Lyapunov functions provide an alternative to commonly used polynomial Lyapunov functions. Our approach first characterizes intrinsic properties of a polyhedral Lyapunov function, including its "eccentricity" and "robustness" to perturbations. We then derive an algorithm that either computes a polyhedral Lyapunov function proving that the system is asymptotically stable, or concludes that no polyhedral Lyapunov function exists whose eccentricity and robustness parameters satisfy some user-provided limits. Significantly, our approach places no a priori bound on the number of linear pieces that make up the desired polyhedral Lyapunov function. The algorithm alternates between a learning step and a verification step, always maintaining a finite set of witness states. The learning step solves a linear program to compute a candidate Lyapunov function compatible with the current witness states. In the verification step, our approach checks whether the candidate is a valid Lyapunov function for the system; if verification fails, we obtain a new witness. We prove a theoretical bound on the maximum number of iterations needed by our algorithm and demonstrate its applicability on numerical examples. (A skeleton of this learner-verifier loop appears after this list.)
  5. Common reinforcement learning methods seek optimal controllers for unknown dynamical systems by searching in the "policy" space directly. A recent line of research, starting with [1], aims to provide theoretical guarantees for such direct policy-update methods by exploring their performance in classical control settings, such as the infinite-horizon linear quadratic regulator (LQR) problem. A key property these analyses rely on is that the LQR cost function satisfies the "gradient dominance" property with respect to the policy parameters. Gradient dominance helps guarantee that the optimal controller can be found by running gradient-based algorithms on the LQR cost. The gradient dominance property has so far been verified on a case-by-case basis for several control problems, including continuous/discrete-time LQR, LQR with decentralized controllers, and H2/H∞ robust control. In this paper, we make a connection between this line of work and classical convex parameterizations based on linear matrix inequalities (LMIs). Using this, we propose a unified framework for showing that gradient dominance indeed holds for a broad class of control problems, such as continuous- and discrete-time LQR, minimizing the L2 gain, and problems using system-level parameterization. Our unified framework provides insights into the landscape of the cost function as a function of the policy, and enables extending convergence results for policy gradient descent to a much larger class of problems. (A generic statement of gradient dominance appears after this list.)
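For item 1, the following is a minimal sketch of the CLF decrease condition that a verified controller must satisfy. The dynamics f, gradient grad_V, and the margin are placeholders assumed for illustration, not the paper's models, and sampling only spot-checks the property; the paper's guarantees come from automated theorem proving instead.

```python
import numpy as np

def clf_decrease_holds(grad_V, f, controller, states, margin=1e-3):
    """Spot-check the CLF condition Vdot(x) = <grad V(x), f(x, u)> < 0
    on sampled states away from the origin (heuristic filter, not a proof)."""
    for x in states:
        u = controller(x)                    # neural network feedback law
        v_dot = grad_V(x) @ f(x, u)          # directional derivative of V
        if v_dot >= -margin * np.dot(x, x):  # require a strict decrease margin
            return False, x                  # candidate counterexample
    return True, None
```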
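For item 2, here is a heavily hedged sketch of what a combined Lyapunov- and concurrent-learning-based output-layer update can look like. The gains, shapes, and the recorded target y_j are illustrative assumptions; the paper's actual update law is more involved.

```python
import numpy as np

def cl_weight_update(W_hat, phi_x, e, history, dt, gamma=1.0, k_cl=0.5):
    """One Euler step of an illustrative Lyapunov- plus CL-based update
    for output-layer weights W_hat of shape (n_features, n_outputs)."""
    # Lyapunov-based term: driven by the instantaneous tracking error e.
    dW = gamma * np.outer(phi_x, e)
    # Concurrent-learning term: replays recorded pairs (phi_j, y_j), where
    # y_j is the signal W_hat.T @ phi_j should reproduce, so the update
    # stays informative without requiring persistent excitation.
    for phi_j, y_j in history:
        dW += k_cl * gamma * np.outer(phi_j, y_j - W_hat.T @ phi_j)
    return W_hat + dt * dW
```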
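For item 3, a crude finite-sample sketch of estimating a Lipschitz constant of the model error from data. The paper instead derives a bound that holds with a specified probability over a region around the training data; this pairwise maximum is only a lower-bound heuristic.

```python
import numpy as np

def lipschitz_estimate(xs, errors):
    """Pairwise estimate of the Lipschitz constant of the model error:
    max over sample pairs of ||err_i - err_j|| / ||x_i - x_j||."""
    best = 0.0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            dx = np.linalg.norm(xs[i] - xs[j])
            if dx > 1e-9:  # skip near-duplicate states
                best = max(best, np.linalg.norm(errors[i] - errors[j]) / dx)
    return best
```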
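For item 4, a skeleton of the counterexample-guided learner-verifier loop. Here learn_lp and verify stand in for the paper's linear-programming and verification subroutines; only the control flow is depicted.

```python
def polyhedral_lyapunov_cegis(learn_lp, verify, witnesses, max_iters=1000):
    """Alternate learning and verification over a growing witness set."""
    for _ in range(max_iters):
        # Learning step: LP for a candidate V(x) = max_i a_i @ x that is
        # compatible with the current finite set of witness states.
        candidate = learn_lp(witnesses)
        if candidate is None:
            return None       # no function within the user-provided limits
        # Verification step: certify the candidate or get a new witness.
        counterexample = verify(candidate)
        if counterexample is None:
            return candidate  # verified polyhedral Lyapunov function
        witnesses.append(counterexample)
    raise RuntimeError("iteration budget exhausted")
```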
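For item 5, the gradient dominance (Polyak-Lojasiewicz) property can be stated generically as below; the constant mu is problem-dependent, and this form paraphrases the idea rather than reproducing the paper's exact inequality.

```latex
% Gradient dominance for the LQR cost C over stabilizing gains K:
% the suboptimality gap is bounded by the squared gradient norm,
% so gradient descent cannot stall at a spurious stationary point.
C(K) - C(K^{*}) \;\le\; \frac{1}{\mu}\,\bigl\lVert \nabla C(K) \bigr\rVert_F^{2},
\qquad \mu > 0 .
```

Combined with smoothness of the cost on the set of stabilizing policies, this inequality is what lets policy gradient descent converge to the global optimum despite the cost being nonconvex in the policy parameters.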