skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Data-Driven Hierarchical Predictive Learning in Unknown Environments
We propose a hierarchical learning architecture for predictive control in unknown environments. We consider a constrained nonlinear dynamical system and assume the availability of state-input trajectories solving control tasks in different environments. A parameterized environment model generates state constraints specific to each task, which are satisfied by the stored trajectories. Our goal is to find a feasible trajectory for a new task in an unknown environment. From stored data, we learn strategies in the form of target sets in a reduced-order state space. These strategies are applied to the new task in real-time using a local forecast of the new environment, and the resulting output is used as a terminal region by a low-level receding horizon controller. We show how to i) design the target sets from past data and then ii) incorporate them into a model predictive control scheme with shifting horizon that ensures safety of the closed-loop system when performing the new task. We prove the feasibility of the resulting control policy, and verify the proposed method in a robotic path planning application.  more » « less
Award ID(s):
1931853
PAR ID:
10176532
Author(s) / Creator(s):
;
Date Published:
Journal Name:
IEEE International Conference on Automation Science and Engineering CASE
ISSN:
2161-8070
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A Task Decomposition method for iterative learning Model Predictive Control (TDMPC) for linear time-varying systems is presented. We consider the availability of state- input trajectories which solve an original task T1, and design a feasible MPC policy for a new task, T2, using stored data from T1. Our approach applies to tasks T2 which are composed of subtasks contained in T1. In this paper we formally define the task decomposition problem, and provide a feasibility proof for the resulting policy. The proposed algorithm reduces the computational burden for linear time-varying systems with piecewise convex constraints. Simulation results demonstrate the improved efficiency of the proposed method on a robotic path-planning task. 
    more » « less
  2. n this paper, we focus on the problem of shrinking-horizon Model Predictive Control (MPC) in uncertain dynamic environments. We consider controlling a deterministic autonomous system that interacts with uncontrollable stochastic agents during its mission. Employing tools from conformal prediction, existing works derive high-confidence prediction regions for the unknown agent trajectories, and integrate these regions in the design of suitable safety constraints for MPC. Despite guaranteeing probabilistic safety of the closed-loop trajectories, these constraints do not ensure feasibility of the respective MPC schemes for the entire duration of the mission. We propose a shrinking-horizon MPC that guarantees recursive feasibility via a gradual relaxation of the safety constraints as new prediction regions become available online. This relaxation enforces the safety constraints to hold over the least restrictive prediction region from the set of all available prediction regions. In a comparative case study with the state of the art, we empirically show that our approach results in tighter prediction regions and verify recursive feasibility of our MPC scheme. 
    more » « less
  3. In this study, we address the problem of safe control in systems subject to state and input constraints by integrating the Control Barrier Function (CBF) into the Model Predictive Control (MPC) formulation. While CBF offers a conservative policy and traditional MPC lacks the safety guarantee beyond the finite horizon, the proposed scheme takes advantage of both MPC and CBF approaches to provide a guaranteed safe control policy with reduced conservatism and a shortened horizon. The proposed methodology leverages the sum-of-square (SOS) technique to construct CBFs that make forward invariant safe sets in the state space that are then used as a terminal constraint on the last predicted state. CBF invariant sets cover the state space around system fixed points. These islands of forward invariant CBF sets will be connected to each other using MPC. To do this, we proposed a technique to handle the MPC optimization problem subject to the combination of intersections and union of constraints. Our approach, termed Model Predictive Control Barrier Functions (MPCBF), is validated using numerical examples to demonstrate its efficacy, showing improved performance compared to classical MPC and CBF. 
    more » « less
  4. Intelligent mobile sensors, such as uninhabited aerial or underwater vehicles, are becoming prevalent in environmental sensing and monitoring applications. These active sensing platforms operate in unsteady fluid flows, including windy urban environments, hurricanes and ocean currents. Often constrained in their actuation capabilities, the dynamics of these mobile sensors depend strongly on the background flow, making their deployment and control particularly challenging. Therefore, efficient trajectory planning with partial knowledge about the background flow is essential for teams of mobile sensors to adaptively sense and monitor their environments. In this work, we investigate the use of finite-horizon model predictive control (MPC) for the energy-efficient trajectory planning of an active mobile sensor in an unsteady fluid flow field. We uncover connections between trajectories optimized over a finite-time horizon and finite-time Lyapunov exponents of the background flow, confirming that energy-efficient trajectories exploit invariant coherent structures in the flow. We demonstrate our findings on the unsteady double gyre vector field, which is a canonical model for chaotic mixing in the ocean. We present an exhaustive search through critical MPC parameters including the prediction horizon, maximum sensor actuation, and relative penalty on the accumulated state error and actuation effort. We find that even relatively short prediction horizons can often yield energy-efficient trajectories. We also explore these connections on a three-dimensional flow and ocean flow data from the Gulf of Mexico. These results are promising for the adaptive planning of energy-efficient trajectories for swarms of mobile sensors in distributed sensing and monitoring. 
    more » « less
  5. This article introduces a model-based approach for training feedback controllers for an autonomous agent operating in a highly non-linear (albeit deterministic) environment. We desire the trained policy to ensure that the agent satisfies specific task objectives and safety constraints, both expressed in Discrete-Time Signal Temporal Logic (DT-STL). One advantage for reformulation of a task via formal frameworks, like DT-STL, is that it permits quantitative satisfaction semantics. In other words, given a trajectory and a DT-STL formula, we can compute therobustness, which can be interpreted as an approximate signed distance between the trajectory and the set of trajectories satisfying the formula. We utilize feedback control, and we assume a feed forward neural network for learning the feedback controller. We show how this learning problem is similar to training recurrent neural networks (RNNs), where the number of recurrent units is proportional to the temporal horizon of the agent’s task objectives. This poses a challenge: RNNs are susceptible to vanishing and exploding gradients, and naïve gradient descent-based strategies to solve long-horizon task objectives thus suffer from the same problems. To address this challenge, we introduce a novel gradient approximation algorithm based on the idea of dropout or gradient sampling. One of the main contributions is the notion ofcontroller network dropout, where we approximate the NN controller in several timesteps in the task horizon by the control input obtained using the controller in a previous training step. We show that our control synthesis methodology can be quite helpful for stochastic gradient descent to converge with less numerical issues, enabling scalable back-propagation over longer time horizons and trajectories over higher-dimensional state spaces. We demonstrate the efficacy of our approach on various motion planning applications requiring complex spatio-temporal and sequential tasks ranging over thousands of timesteps. 
    more » « less