This paper studies the infinite-horizon adaptive optimal
control of continuous-time linear periodic (CTLP) systems.
A novel value iteration (VI) based off-policy ADP algorithm
is proposed for a general class of CTLP systems, so that
approximate optimal solutions can be obtained directly from
collected data, without exact knowledge of the system dynamics.
Under mild conditions, proofs of uniform convergence of
the proposed algorithm to the optimal solutions are given for
both the model-based and model-free cases. The VI-based ADP
algorithm finds suboptimal controllers without requiring
knowledge of an initial stabilizing controller. Application
to the optimal control of a triple inverted pendulum subjected
to a periodically varying load demonstrates the feasibility and
effectiveness of the proposed method.
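The paper treats the harder continuous-time linear periodic, model-free setting; as a much-simplified, hypothetical illustration of the value-iteration idea only (not the paper's algorithm), the sketch below runs model-based VI for a time-invariant discrete-time LQR problem. The matrices are made up for the example; note the iteration starts from P = 0, so no initial stabilizing controller is assumed.

```python
import numpy as np

def value_iteration_lqr(A, B, Q, R, iters=500):
    """Value iteration for discrete-time LQR:
    P_{k+1} = Q + A'P_k A - A'P_k B (R + B'P_k B)^{-1} B'P_k A,
    starting from P_0 = 0 (no stabilizing initial gain required)."""
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(iters):
        BtP = B.T @ P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)   # greedy gain at step k
        P = Q + A.T @ P @ (A - B @ K)               # Riccati-style update
    return P, K

# Illustrative double-integrator-like system (not from the paper)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.eye(1)
P, K = value_iteration_lqr(A, B, Q, R)
eigs = np.linalg.eigvals(A - B @ K)    # closed loop should be Schur stable
```

Because the iteration starts from zero rather than from a stabilizing gain, intermediate gains need not stabilize the plant; only the converged gain does, which mirrors the advantage the abstract highlights for the VI-based scheme.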
Myopic Control of Systems with Unknown Dynamics
This paper introduces a strategy for satisfying basic control objectives for systems whose dynamics are almost entirely unknown. This setting is motivated by a scenario where a system undergoes a critical failure, thus significantly changing its dynamics. In such a case, retaining the ability to satisfy basic control objectives such as reach-avoid is imperative. To deal with significant restrictions on our knowledge of system dynamics, we develop a theory of myopic control. The primary goal of myopic control is to, at any given time, optimize the current direction of the system trajectory, given solely the limited information obtained about the system until that time. Building upon this notion, we propose a control algorithm which simultaneously uses small perturbations in the control effort to learn local system dynamics while moving in the direction which seems to be optimal based on previously obtained knowledge. We show that the algorithm results in a trajectory that is nearly optimal in the myopic sense, i.e., it is moving in a direction that seems to be nearly the best at the given time, and provide formal bounds for suboptimality. We demonstrate the usefulness of the proposed algorithm on a high-fidelity simulation of a damaged Boeing 747 seeking to remain in level flight.
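A hedged toy sketch of the myopic idea described above: probe with small control perturbations to estimate the local input-to-velocity map, then steer along the direction that currently looks best. The dynamics `f_true`, `G_true`, and the goal below are invented for the example, not the paper's damaged-aircraft model.

```python
import numpy as np

def f_true(x):
    return np.array([-0.2 * x[1], 0.1 * x[0]])   # unknown drift (example)

G_true = np.eye(2)                               # unknown input matrix

def step(x, u, dt=0.01):
    """One simulation step of the (unknown-to-the-controller) plant."""
    return x + dt * (f_true(x) + G_true @ u)

def estimate_velocity(x, u, dt=0.01):
    """Observed velocity under input u (finite difference of the state)."""
    return (step(x, u, dt) - x) / dt

def myopic_input(x, eps=1e-3, u_max=1.0):
    """Estimate the input map column-by-column via small perturbations,
    then pick the admissible input whose predicted velocity points most
    toward the goal."""
    m = 2
    v0 = estimate_velocity(x, np.zeros(m))
    G_hat = np.column_stack([
        (estimate_velocity(x, eps * np.eye(m)[:, j]) - v0) / eps
        for j in range(m)])
    d = goal - x
    u = G_hat.T @ d            # ascent direction of <velocity, d> in u
    norm = np.linalg.norm(u)
    return u_max * u / norm if norm > 1e-9 else np.zeros(m)

goal = np.array([1.0, 1.0])
x = np.zeros(2)
for _ in range(2000):
    x = step(x, myopic_input(x))
```

The probe-then-move loop is the essence of the abstract's strategy: each step spends a little control effort on learning the local dynamics and the rest on the direction that seems best given what has been learned so far.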
- Award ID(s):
- 1700404
- PAR ID:
- 10170418
- Date Published:
- Journal Name:
- 2019 American Control Conference (ACC)
- Page Range / eLocation ID:
- 1064 to 1071
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
In this work, we propose a trajectory generation method for robotic systems with contact force constraints, based on optimal control and reachability analysis. Normally, the dynamics and constraints of a contact-constrained robot are nonlinear and coupled to each other. Instead of linearizing the model and constraints, we directly solve the optimal control problem to obtain the feasible state trajectory and control input of the system. A tractable optimal control problem is formulated and addressed by two complementary approaches: sampling-based dynamic programming and rigorous reachability analysis. The sampling-based method and a Partially Observable Markov Decision Process (POMDP) are used to break down the end-to-end trajectory generation problem via sample-wise optimization under the given conditions. The result is a sequence of subregions to be traversed to reach the final goal. The reachability analysis ensures that at least one trajectory exists that starts from a given initial state and passes through the sequence of subregions. The distinctive contribution of our method is that it handles the intricate contact constraint coupled with the system's dynamics while reducing the computational complexity of the algorithm. We validate our method using extensive numerical simulations with a legged robot.
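As a greatly simplified, hypothetical stand-in for the sampling step (not the paper's POMDP formulation or its contact-constrained robot), the sketch below forward-simulates randomly sampled bounded inputs on a double integrator and checks whether any sampled trajectory passes through a sequence of box subregions in order.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(x0, u_seq, dt=0.1):
    """Double integrator: state (position, velocity), input = acceleration."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for u in u_seq:
        x = x + dt * np.array([x[1], u])   # explicit Euler step
        traj.append(x.copy())
    return np.array(traj)

def reaches_in_order(traj, boxes):
    """True if the position coordinate visits the boxes in sequence."""
    i = 0
    for state in traj:
        if i < len(boxes) and boxes[i][0] <= state[0] <= boxes[i][1]:
            i += 1
    return i == len(boxes)

boxes = [(0.4, 0.6), (0.9, 1.1)]       # subregions to be passed in order
found = False
for _ in range(500):
    u_seq = rng.uniform(-1.0, 1.0, size=40)   # bounded sampled input
    if reaches_in_order(simulate([0.0, 0.0], u_seq), boxes):
        found = True
        break
```

The paper's rigorous reachability analysis replaces this Monte-Carlo check with a guarantee that at least one trajectory through the subregion sequence exists; the sketch only conveys the sample-wise flavor of the search.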
-
This paper addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system. This problem is subject to the ‘curse of dimensionality’ associated with the dynamic programming method. This paper proposes a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled, ‘open-loop - closed-loop’, approach. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Then, closed-loop control is developed around this open-loop trajectory by linearizing the dynamics about this nominal trajectory. By virtue of the linearization, a linear quadratic regulator based algorithm can be used for this closed-loop control. We show that the performance of the D2C algorithm is approximately optimal. Moreover, simulation performance suggests a significant reduction in training time compared to other state-of-the-art algorithms.
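A hedged sketch of the closed-loop half of the D2C idea: linearize a black-box one-step simulator about a nominal trajectory by finite differences, then run a finite-horizon LQR backward pass along it. The pendulum-like model, costs, and constant nominal input below are illustrative choices; in D2C the nominal input would come from the open-loop trajectory optimization.

```python
import numpy as np

def f(x, u, dt=0.05):
    """Black-box one-step simulator (damped pendulum: angle, rate)."""
    th, om = x
    return np.array([th + dt * om,
                     om + dt * (-np.sin(th) - 0.1 * om + u)])

def fd_jacobians(x, u, eps=1e-5):
    """Central finite-difference linearization of the black-box step."""
    n = len(x)
    A = np.column_stack([
        (f(x + eps * np.eye(n)[:, i], u) - f(x - eps * np.eye(n)[:, i], u))
        / (2 * eps) for i in range(n)])
    B = ((f(x, u + eps) - f(x, u - eps)) / (2 * eps)).reshape(-1, 1)
    return A, B

# Nominal open-loop rollout (a constant guess, purely for illustration).
T = 60
u_nom = np.full(T, 0.3)
x_nom = [np.array([0.0, 0.0])]
for k in range(T):
    x_nom.append(f(x_nom[-1], u_nom[k]))

# Time-varying LQR backward pass along the nominal trajectory.
Q, R = np.eye(2), np.array([[0.1]])
P = Q.copy()
gains = [None] * T
for k in reversed(range(T)):
    A, B = fd_jacobians(x_nom[k], u_nom[k])
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    gains[k] = K
    P = Q + A.T @ P @ (A - B @ K)

def rollout(x0, feedback):
    """Roll out from x0 with or without the LQR correction."""
    x = np.array(x0, dtype=float)
    for k in range(T):
        du = -(gains[k] @ (x - x_nom[k])).item() if feedback else 0.0
        x = f(x, u_nom[k] + du)
    return x

x0 = np.array([0.3, 0.0])              # perturbed initial state
err_cl = np.linalg.norm(rollout(x0, True) - x_nom[-1])
err_ol = np.linalg.norm(rollout(x0, False) - x_nom[-1])
```

The point of the decoupling is visible in the comparison: the open-loop rollout drifts from the nominal endpoint under the perturbed start, while the LQR feedback wrapped around the same nominal trajectory pulls it back.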
-
This paper proposes a novel solution for the distributed unconstrained optimization problem in which the total cost is the summation of time-varying local cost functions of a group of networked agents. The objective is to track the optimal trajectory that minimizes the total cost at each time instant. Our approach consists of a two-stage dynamics: the first stage periodically samples the first and second derivatives of the local costs to construct an estimate of the descent direction towards the optimal trajectory, and the second stage uses this estimate, together with a consensus term, to drive the local states towards the time-varying solution while reaching consensus. The first part is carried out by a weighted-average consensus algorithm in the discrete-time framework, and the second part is performed with a continuous-time dynamics. Using Lyapunov stability analysis, an upper bound on the gradient of the total cost is derived and shown to be asymptotically attained. This bound is characterized by the properties of the local costs. To demonstrate the performance of the proposed method, a numerical example is presented that studies the tuning of the algorithm's parameters and their effect on the convergence of local states to the optimal trajectory.
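A hedged sketch of the two-stage idea on scalar states (an Euler discretization of the continuous-time stage): agents average their sampled first and second derivatives over a few consensus rounds, then follow the estimated Newton-like direction plus a consensus term. The ring graph, the weights, the gains, and the time-varying quadratic costs f_i(x, t) = (x - c_i(t))^2 / 2 (whose global optimum is the average of the c_i) are all illustrative choices, not the paper's setup.

```python
import numpy as np

N = 5
W = np.zeros((N, N))               # Metropolis-style weights on a ring
for i in range(N):
    W[i, i] = 0.5
    W[i, (i - 1) % N] = 0.25
    W[i, (i + 1) % N] = 0.25

offsets = np.linspace(-1.0, 1.0, N)

def c(t):
    """Time-varying minimizers of the local costs; the global optimum
    is their average, sin(t)."""
    return np.sin(t) + offsets

dt, T = 0.01, 20.0
x = np.zeros(N)
for k in range(int(T / dt)):
    t = k * dt
    g = x - c(t)                   # sampled local gradients
    h = np.ones(N)                 # sampled local Hessians
    for _ in range(10):            # stage 1: discrete-time averaging
        g, h = W @ g, W @ h
    lap = x - W @ x                # disagreement with neighbors
    x = x + dt * (-10.0 * g / h - 5.0 * lap)   # stage 2: tracking flow

track_err = np.max(np.abs(x - np.sin(T)))
```

The residual `track_err` does not vanish: as the abstract's analysis suggests, the states settle into a neighborhood of the moving optimum whose size depends on the local costs and how fast they vary.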
-
This paper investigates a control strategy in which the state of a dynamical system is driven slowly along a trajectory of stable equilibria. This trajectory is a continuum of points in the state space, each one representing a stable equilibrium of the system under some constant control input. Along the continuous trajectory of such constant control inputs, a slowly varying control is then applied to the system, aimed at creating a stable quasistatic equilibrium that slowly moves along the trajectory of equilibria. Since a stable equilibrium attracts the state of the system within its vicinity, by moving the equilibrium slowly along the trajectory of equilibria, the state of the system travels near this trajectory alongside the equilibrium. Despite the disadvantage of being slow, this control strategy is attractive for certain applications, as it can be implemented with only partial knowledge of the system dynamics. This feature is particularly important for complex systems for which detailed dynamical models are not available.
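A hedged toy illustration of the quasistatic idea: the scalar system x' = u - x^3 (an invented example, not from the paper) has a stable equilibrium x* = u^(1/3) for each constant input u. Ramping u slowly drags the state along this continuum of equilibria, using nothing about the dynamics beyond the fact that each equilibrium is attracting.

```python
import numpy as np

dt, T = 0.001, 50.0
x = 0.0
worst_gap = 0.0
for k in range(int(T / dt)):
    u = k * dt / T                 # slow ramp of the "constant" input
    x += dt * (u - x**3)           # Euler integration of the plant
    if u > 0.3:                    # once the equilibrium is well attracting,
        # compare the state to the quasistatic equilibrium x* = u^(1/3)
        worst_gap = max(worst_gap, abs(x - u ** (1.0 / 3.0)))
```

The gap between the state and the moving equilibrium stays small precisely because the input varies slowly relative to the local attraction rate; ramping u faster would let the state fall visibly behind the equilibrium trajectory.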