
This content will become publicly available on June 1, 2023

Title: Hybrid Reinforcement Learning based controller for autonomous navigation
Safe operation of autonomous mobile robots in close proximity to humans creates a need for enhanced trajectory tracking (with low tracking errors). Linear optimal control techniques such as Linear Quadratic Regulator (LQR) and Model Predictive Control (MPC) have been used successfully for low-speed applications while leveraging their model-based methodology with manageable computational demands. However, model and parameter uncertainties or other unmodeled nonlinearities may cause poor control actions and constraint violations. Nonlinear MPC has emerged as an alternate optimal-control approach but needs to overcome real-time deployment challenges (including fast sampling time, design complexity, and limited computational resources). In recent years, optimal control-based deployments have benefitted enormously from the ability of Deep Neural Networks (DNNs) to serve as universal function approximators. This has led to deployments in a plethora of previously inaccessible applications, but many open questions about generalizability, benchmarking, and systematic verification and validation have emerged. This paper presents a novel approach to fusing Deep Reinforcement Learning-based (DRL) longitudinal control with a traditional PID lateral controller for autonomous navigation. Our approach follows three steps: (i) generating an adequate-fidelity simulation scenario via a Real2Sim approach; (ii) training a DRL agent within this framework; and (iii) testing the performance and generalizability on alternate scenarios. We use an initial tuned set of the lateral PID controller gains for observing the vehicle response over a range of velocities. Then we use a DRL framework to generate policies for an optimal longitudinal controller that successfully complements the lateral PID to give the best tracking performance for the vehicle.
Authors:
; ; ;
Award ID(s):
1925500 1939058
Publication Date:
NSF-PAR ID:
10357358
Journal Name:
2022 IEEE 95th Vehicular Technology Conference, VTC2022-Spring
Page Range or eLocation-ID:
1-6
Sponsoring Org:
National Science Foundation
More Like this
  1. Autonomous vehicle trajectory tracking control is challenged by situations of varying road surface friction, especially in the scenario where there is a sudden decrease in friction in an area with high road curvature. If the situation is unknown to the control law, vehicles with high speed are more likely to lose tracking performance and/or stability, resulting in loss of control or the vehicle departing the lane unexpectedly. However, with connectivity either to other vehicles, infrastructure, or cloud services, vehicles may have access to upcoming roadway information, particularly the friction and curvature in the road path ahead. This paper introduces a model-based predictive trajectory-tracking control structure using the previewed knowledge of path curvature and road friction. In the structure, path following and vehicle stabilization are incorporated through a model predictive controller. Meanwhile, long-range vehicle speed planning and tracking control are integrated to ensure the vehicle can slow down appropriately before encountering hazardous road conditions. This approach has two major advantages. First, the prior knowledge of the desired path is explicitly incorporated into the computation of control inputs. Second, the combined transmission of longitudinal and lateral tire forces is considered in the controller to avoid violation of tire force limits while keeping performance and stability guarantees. The efficacy of the algorithm is demonstrated through an application case where a vehicle navigates a sharply curving road with varying friction conditions, with results showing that the controller can drive a vehicle up to the handling limits and track the desired trajectory accurately.
  2. Model predictive control (MPC) has become more relevant to vehicle dynamics control due to its inherent capacity of treating system constraints. However, online optimization from MPC introduces an extensive computational burden for today’s onboard microprocessors. To alleviate MPC computational load, several methods have been proposed. Among them, online successive system linearization and the resulting linear time-varying model predictive controller (LTVMPC) is one of the most popular options. Nevertheless, such online successive linearization commonly approximates the original (nonlinear) system by a linear one, which inevitably introduces extra modeling errors and therefore reduces MPC performance. Actually, if the controlled system possesses the “differential flatness” property, then it can be exactly linearized and an equivalent linear model will appear. This linear model maintains all the nonlinear features of the original system and can be utilized to design a flatness-based model predictive controller (FMPC). CarSim-Simulink joint simulations demonstrate that the proposed FMPC substantially outperforms a classical LTVMPC in terms of the path-tracking performance for autonomous vehicles.
  3. Current practice of air-fuel ratio control relies on empirical models and traditional PID controllers, which require extensive calibration to maintain the post-catalyst air-fuel ratio close to stoichiometry. In contrast, this work utilizes a physics-based Three-Way Catalyst (TWC) model to develop a model predictive control (MPC) strategy for air-fuel ratio control based on internal TWC oxygen storage dynamics. In this paper, parameters of the physics-based temperature and oxygen storage models of the TWC are identified using vehicle test data for a catalyst aged to 150,000 miles. A linearized oxygen storage model is then developed from the identified nonlinear model, which is shown via simulation to follow the nonlinear model with minimal error during nominal operation. This motivates the development of a Linear MPC (LMPC) framework using the linearized TWC oxygen storage model, reducing the requisite computational effort relative to a nonlinear MPC strategy. In this work, the LMPC utilizing a linearized physics-based TWC model is proven suitable for tracking a desired oxygen storage level by controlling the commanded engine air-fuel ratio, which is also a novel contribution. The offline simulation results show successful tracking performance of the developed LMPC framework.
  4. This paper presents four data-driven system models for a magnetically controlled swimmer. The models were derived directly from experimental data, and the accuracy of the models was experimentally demonstrated. Our previous study successfully implemented two non-model-based control algorithms for 3D path-following using PID and model reference adaptive controller (MRAC). This paper focuses on system identification using only experimental data and a model-based control strategy. Four system models were derived: (1) a physical estimation model, (2, 3) Sparse Identification of Nonlinear Dynamics (SINDY), linear system and nonlinear system, and (4) multilayer perceptron (MLP). All four system models were implemented as an estimator of a multi-step Kalman filter. The maximum required sensing interval was increased from 180 ms to 420 ms and the respective tracking error decreased from 9 mm to 4.6 mm. Finally, a Model Predictive Controller (MPC) implementing the linear SINDY model was tested for 3D path-following and shown to be computationally efficient and to offer performance comparable to other control methods.
  5. Motivated by connected and automated vehicle (CAV) technologies, this paper proposes a data-driven optimization-based Model Predictive Control (MPC) modeling framework for the Cooperative Adaptive Cruise Control (CACC) of a string of CAVs under uncertain traffic conditions. The proposed data-driven optimization-based MPC modeling framework aims to improve the stability, robustness, and safety of longitudinal cooperative automated driving involving a string of CAVs under uncertain traffic conditions using Vehicle-to-Vehicle (V2V) data. Based on an online learning-based driving dynamics prediction model, we predict the uncertain driving states of the vehicles preceding the controlled CAVs. With the predicted driving states of the preceding vehicles, we solve a constrained Finite-Horizon Optimal Control problem to predict the uncertain driving states of the controlled CAVs. To obtain the optimal acceleration or deceleration commands for the CAVs under uncertainties, we formulate a Distributionally Robust Stochastic Optimization (DRSO) model (i.e., a special case of data-driven optimization models under moment bounds) with a Distributionally Robust Chance Constraint (DRCC). The predicted uncertain driving states of the immediately preceding vehicles and the controlled CAVs will be utilized in the safety constraint and the reference driving states of the DRSO-DRCC model. To solve the minimax program of the DRSO-DRCC model, we reformulate the relaxed dual problem as a Semidefinite Program (SDP) of the original DRSO-DRCC model based on the strong duality theory and the Semidefinite Relaxation technique. In addition, we propose two methods for solving the relaxed SDP problem. We use Next Generation Simulation (NGSIM) data to demonstrate the proposed model in numerical experiments. The experimental results and analyses demonstrate that the proposed model can obtain string-stable, robust, and safe longitudinal cooperative automated driving control of CAVs by proper settings, including the driving-dynamics prediction model, prediction horizon lengths, and time headways. Computational analyses are conducted to validate the efficiency of the proposed methods for solving the DRSO-DRCC model for real-time automated driving applications within proper settings.