In this work, we propose a trajectory generation method for robotic systems with contact force constraints based on optimal control and reachability analysis. In general, the dynamics and constraints of a contact-constrained robot are nonlinear and coupled. Instead of linearizing the model and constraints, we directly solve the optimal control problem to obtain a feasible state trajectory and the control input of the system. A tractable optimal control problem is formulated and addressed by two complementary approaches: sampling-based dynamic programming and rigorous reachability analysis. The sampling-based method, together with a Partially Observable Markov Decision Process (POMDP) formulation, breaks the end-to-end trajectory generation problem into sample-wise optimizations under the given conditions. The result is a sequence of subregions the system must pass through to reach the final goal. The reachability analysis guarantees that at least one trajectory exists that starts from the given initial state and passes through that sequence of subregions. The distinctive contribution of our method is that it handles intricate contact constraints coupled with the system's dynamics while reducing the computational complexity of the algorithm. We validate our method using extensive numerical simulations with a legged robot.
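A minimal sketch of the two-stage structure described above, not the paper's implementation: candidate subregions are chained greedily, and a crude sampled reachability check on a 1-D double integrator stands in for the rigorous reachability analysis. The model, input bound, and helper names are illustrative assumptions.

```python
import numpy as np

DT = 0.1
U_MAX = 2.0  # hypothetical input bound standing in for the contact-force constraint

def step(x, u):
    """One Euler step of a 1-D double integrator; x = [position, velocity]."""
    pos, vel = x
    return np.array([pos + DT * vel, vel + DT * np.clip(u, -U_MAX, U_MAX)])

def reachable(x0, region, horizon=20, samples=200, seed=0):
    """Sampled under-approximation: can some constant input drive x0 into region?"""
    lo, hi = region
    rng = np.random.default_rng(seed)
    for _ in range(samples):
        x = np.array(x0, dtype=float)
        u = rng.uniform(-U_MAX, U_MAX)           # one constant input per rollout
        for _ in range(horizon):
            x = step(x, u)
            if lo <= x[0] <= hi:
                return True
    return False

def plan_subregions(x0, regions):
    """Greedy stand-in for the sampling-based dynamic program over subregions."""
    sequence, x = [], np.array(x0, dtype=float)
    for idx, region in enumerate(regions):
        if not reachable(x, region):
            return None                          # no certified trajectory through this sequence
        sequence.append(idx)
        x = np.array([np.mean(region), 0.0])     # coarse re-anchoring inside the region
    return sequence

regions = [(0.5, 1.0), (1.5, 2.0), (2.5, 3.0)]
print(plan_subregions([0.0, 0.0], regions))      # expected: [0, 1, 2]
```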
Optimal control of differentially flat systems is surprisingly easy
As we move to increasingly complex cyber–physical systems (CPS), new approaches are needed to plan efficient state trajectories in real time. In this paper, we propose an approach that significantly reduces the complexity of solving optimal control problems for a class of CPS with nonlinear dynamics. We exploit the property of differential flatness to simplify the Euler–Lagrange equations that arise during optimization, and this simplification eliminates the numerical instabilities that plague optimal control in general. We also present an explicit differential equation that describes the evolution of the optimal state trajectory, and we extend our results to both the unconstrained and constrained cases. Furthermore, we demonstrate the performance of our approach by generating the optimal trajectory for a planar manipulator with two revolute joints. We show in simulation that our approach is able to generate the constrained optimal trajectory in 4.5 ms while respecting workspace constraints and switching between a ‘left’ and ‘right’ bend in the elbow joint.
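A minimal illustration of the flatness idea for the simplest flat system, a double integrator x'' = u (an assumed toy model, not the paper's planar manipulator): once a smooth flat output satisfying the boundary conditions is chosen, the full state and input follow algebraically, with no numerical integration of the dynamics.

```python
import numpy as np

T = 2.0
x_start, x_goal = 0.0, 1.0          # rest-to-rest boundary conditions (assumed)

# Cubic flat output y(t) = c0 + c1*t + c2*t^2 + c3*t^3 with
# y(0) = x_start, y'(0) = 0, y(T) = x_goal, y'(T) = 0.
A = np.array([
    [1.0, 0.0, 0.0,  0.0],          # y(0)
    [0.0, 1.0, 0.0,  0.0],          # y'(0)
    [1.0, T,   T**2, T**3],         # y(T)
    [0.0, 1.0, 2*T,  3*T**2],       # y'(T)
])
c = np.linalg.solve(A, np.array([x_start, 0.0, x_goal, 0.0]))

t = np.linspace(0.0, T, 5)
pos = c[0] + c[1]*t + c[2]*t**2 + c[3]*t**3
vel = c[1] + 2*c[2]*t + 3*c[3]*t**2
u = 2*c[2] + 6*c[3]*t               # input recovered algebraically from the flat output
print(np.round(np.column_stack([t, pos, vel, u]), 3))
```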
- PAR ID: 10508512
- Publisher / Repository: Elsevier
- Journal Name: Automatica
- Volume: 159
- Issue: C
- ISSN: 0005-1098
- Page Range / eLocation ID: 111404
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Most cyber–physical systems (CPS) encounter a large volume of data that is added to the system gradually in real time rather than all at once in advance. In this paper, we provide a theoretical framework that yields optimal control strategies for such CPS at the intersection of control theory and learning. In the proposed framework, we use the actual CPS, i.e., the "true" system that we seek to optimally control online, in parallel with an available model of the CPS. We then institute an information state for the system that does not depend on the control strategy. An important consequence of this independence is that, for any given choice of a control strategy and a realization of the system's variables until time t, the information states at future times do not depend on the choice of the control strategy at time t but only on the realization of the decision at time t; they are therefore related to the concept of separation between state estimation and control. Namely, the future information states are separated from the choice of the current control strategy. Such control strategies are called separated control strategies. Hence, we can derive offline the optimal control strategy of the system with respect to the information state, which might not be precisely known due to model uncertainties or the complexity of the system, and then use standard learning approaches to learn the information state online while data are added gradually to the system in real time. We show that after the information state becomes known, the separated control strategy of the CPS model derived offline is optimal for the actual system. We illustrate the proposed framework in a dynamic system consisting of two subsystems with a delayed sharing information structure.
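A hedged sketch of the separation structure, illustrative only and not the paper's framework: the policy is computed offline as a function of an information state, and online the information state is estimated from streaming measurements while the offline policy is reused unchanged. Here the information state is simply the Kalman-filtered mean of a scalar linear system, and all constants are assumed.

```python
import numpy as np

A, B, C = 1.0, 0.5, 1.0
Q, R = 1.0, 0.1
W, V = 0.01, 0.04                    # process and measurement noise variances (assumed)

# Offline: iterate the scalar Riccati equation to get the feedback gain K.
P = 1.0
for _ in range(500):
    P = Q + A * P * A - (A * P * B) ** 2 / (R + B * P * B)
K = (A * P * B) / (R + B * P * B)

# Online: estimate the information state (conditional mean) and apply u = -K * xhat.
rng = np.random.default_rng(1)
x, xhat, S = 2.0, 0.0, 1.0
for t in range(20):
    u = -K * xhat
    x = A * x + B * u + rng.normal(0, np.sqrt(W))
    y = C * x + rng.normal(0, np.sqrt(V))
    # Kalman update of the information state from the new measurement.
    S_pred = A * S * A + W
    xhat_pred = A * xhat + B * u
    L = S_pred * C / (C * S_pred * C + V)
    xhat = xhat_pred + L * (y - C * xhat_pred)
    S = (1 - L * C) * S_pred
print(round(x, 3), round(xhat, 3))
```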
Control systems are increasingly targeted by malicious adversaries, who may inject spurious sensor measurements in order to bias the controller behavior and cause suboptimal performance or safety violations. This paper investigates the problem of tracking a reference trajectory while satisfying safety and reachability constraints in the presence of such false data injection attacks. We consider a linear, time-invariant system with additive Gaussian noise in which a subset of sensors can be compromised by an attacker, while the remaining sensors are regarded as secure. We propose a control policy in which two estimates of the system state are maintained, one based on all sensors and one based on only the secure sensors. The optimal control action based on the secure sensors alone is then computed at each time step, and the chosen control action is constrained to lie within a given distance of this value. We show that this policy can be implemented by solving a quadratically constrained quadratic program at each time step. We develop a barrier function approach to choosing the parameters of our scheme in order to provide provable guarantees on safety and reachability, and derive bounds on the probability that our control policies deviate from the optimal policy when no attacker is present. Our framework is validated through a numerical study.
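A hedged sketch of the two-estimator idea, illustrative and not the paper's exact policy: one state estimate uses all sensors and one uses only the secure sensors; the action computed from the full estimate is projected onto a ball of radius `delta` around the secure-only action, which is the closed-form solution of the quadratically constrained step for this simple cost. The gain, radius, and estimates below are assumed values.

```python
import numpy as np

K = np.array([[1.2, 0.8]])             # hypothetical stabilizing feedback gain
delta = 0.3                            # allowed deviation from the secure-only action

def constrained_action(xhat_all, xhat_secure):
    u_all = -K @ xhat_all              # nominal action from all sensors (possibly spoofed)
    u_sec = -K @ xhat_secure           # trusted action from secure sensors only
    gap = u_all - u_sec
    norm = np.linalg.norm(gap)
    if norm <= delta:
        return u_all
    return u_sec + delta * gap / norm  # projection onto the trust region around u_sec

xhat_all = np.array([1.0, -0.4])       # estimate using every sensor
xhat_secure = np.array([0.6, -0.5])    # estimate using secure sensors only
print(constrained_action(xhat_all, xhat_secure))
```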
Nonlinear optimal control problems are challenging to solve efficiently due to non-convexity. This paper introduces a trajectory optimization approach that achieves real-time performance by combining machine learning to predict optimal trajectories with refinement by quadratic optimization. First, a library of optimal trajectories is calculated offline and used to train a neural network. Online, the neural network predicts a trajectory for a novel initial state and cost function, and this prediction is further optimized by a sparse quadratic programming solver. We apply this approach to a fly-to-target movement problem for an indoor quadrotor. Experiments demonstrate that the technique calculates near-optimal trajectories in a few milliseconds, and generates agile movement that can be tracked more accurately than existing methods.
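An illustrative sketch of the predict-then-refine structure, not the paper's implementation: a stored library trajectory closest to the query stands in for the neural-network prediction, and the refinement solves a small equality-constrained quadratic program via its KKT system, trading smoothness against staying close to the prediction while pinning the endpoints. Trajectory length, library contents, and weights are assumptions.

```python
import numpy as np

N = 20                                   # trajectory length (1-D positions for brevity)
library = {0.0: np.linspace(0.0, 1.0, N), 0.5: np.linspace(0.5, 1.0, N)}

def predict(x0):
    """Nearest-neighbor stand-in for the trained network."""
    key = min(library, key=lambda k: abs(k - x0))
    return library[key] + (x0 - key)     # crude shift to match the query start

def refine(traj, x0, xgoal, smooth_weight=10.0):
    """Equality-constrained QP: min ||x - traj||^2 + w*||D2 x||^2 s.t. x[0]=x0, x[-1]=xgoal."""
    D2 = np.diff(np.eye(N), 2, axis=0)                # second-difference operator
    H = 2 * (np.eye(N) + smooth_weight * D2.T @ D2)   # QP Hessian
    g = -2 * traj                                     # QP linear term
    A = np.zeros((2, N)); A[0, 0] = 1.0; A[1, -1] = 1.0
    b = np.array([x0, xgoal])
    KKT = np.block([[H, A.T], [A, np.zeros((2, 2))]])
    sol = np.linalg.solve(KKT, np.concatenate([-g, b]))
    return sol[:N]

x0, xgoal = 0.2, 1.0
print(np.round(refine(predict(x0), x0, xgoal), 3))
```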
We present a neural network approach for approximating the value function of high-dimensional stochastic control problems. Our training process simultaneously updates our value function estimate and identifies the part of the state space likely to be visited by optimal trajectories. Our approach leverages insights from optimal control theory and the fundamental relation between semi-linear parabolic partial differential equations and forward-backward stochastic differential equations. To focus the sampling on relevant states during neural network training, we use the stochastic Pontryagin maximum principle (PMP) to obtain the optimal controls for the current value function estimate. By design, our approach coincides with the method of characteristics for the non-viscous Hamilton-Jacobi-Bellman (HJB) equation arising in deterministic control problems. Our training loss consists of a weighted sum of the objective functional of the control problem and penalty terms that enforce the HJB equations along the sampled trajectories. Importantly, training is unsupervised in that it does not require solutions of the control problem. Our numerical experiments highlight our scheme's ability to identify the relevant parts of the state space and produce meaningful value estimates. Using a two-dimensional model problem, we demonstrate the importance of the stochastic PMP to inform the sampling and compare to a finite element approach. With a nonlinear control-affine quadcopter example, we illustrate that our approach can handle complicated dynamics. For a 100-dimensional benchmark problem, we demonstrate that our approach improves accuracy and time-to-solution and, via a modification, we show the wider applicability of our scheme.
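A toy sketch of the value-function-fitting idea: the paper trains a neural network with PMP-guided trajectory sampling, while this example collapses to a quadratic ansatz V(x,t) = p(t) x^2 on a time grid and penalizes only the HJB residual and terminal condition of a scalar deterministic linear-quadratic problem, so the fit can be checked against the Riccati solution. All constants are assumed.

```python
import numpy as np
from scipy.optimize import minimize

a, b, q, r, qT, T = -1.0, 1.0, 1.0, 1.0, 2.0, 1.0   # scalar LQ problem data (assumed)
n = 21
ts = np.linspace(0.0, T, n)
dt = ts[1] - ts[0]

def loss(p):
    # HJB residual for V = p(t) x^2:  p'(t) + q + 2*a*p - (b^2/r)*p^2 = 0,  p(T) = qT.
    dpdt = np.gradient(p, dt)
    hjb_residual = dpdt + q + 2 * a * p - (b ** 2 / r) * p ** 2
    return np.sum(hjb_residual ** 2) + 100.0 * (p[-1] - qT) ** 2

fit = minimize(loss, np.full(n, qT), method="L-BFGS-B").x
print(np.round(fit[:5], 3))   # p(t) near t=0; should match the Riccati ODE solution
```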