This paper presents a learning-based methodology for developing an optimal lane-changing control policy for a Remote Controlled (RC) car using real-time sensor data. The RC car is equipped with GPS, IMU, and camera sensors, all interfaced with an Nvidia Jetson AGX Xavier board. Through a novel Adaptive Dynamic Programming (ADP) algorithm, the RC car learns optimal lane-changing strategies from real-time processed sensor measurements. The experimental results show that our learning-based control algorithm can be implemented effectively, adapts to parameter changes, and completes lane-changing tasks within a short learning time with satisfactory performance.
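For context, a common building block behind ADP-based controllers of this kind is the policy-iteration (Kleinman) recursion for the linear-quadratic problem. The notation below is generic and not taken from the paper: given system matrices $A$, $B$, weights $Q \succeq 0$, $R \succ 0$, and an initial stabilizing gain $K_0$, each iteration solves a Lyapunov equation for $P_k$ and then updates the gain.

```latex
% Generic LQR policy iteration (Kleinman); ADP methods replace the
% model-based Lyapunov solve with quantities estimated from measured data.
\begin{aligned}
&(A - B K_k)^\top P_k + P_k (A - B K_k) + Q + K_k^\top R K_k = 0, \\
&K_{k+1} = R^{-1} B^\top P_k, \qquad k = 0, 1, 2, \dots
\end{aligned}
```

Under standard conditions, $P_k$ converges to the stabilizing solution of the algebraic Riccati equation and $K_k$ to the optimal feedback gain.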
Automated lane changing control in mixed traffic: An adaptive dynamic programming approach
The majority of past research on lane-changing controller design for autonomous vehicles (AVs) assumes full knowledge of the model dynamics of the AV and the surrounding vehicles. In the real world, however, this assumption is not realistic, since accurate dynamic models are difficult to obtain and the model parameters may change over time due to various factors. Thus, there is a need for a learning-based lane-change controller design methodology that can learn the optimal control policy in real time using sensor data. In this paper, we address this need by introducing an optimal learning-based control methodology that solves the real-time lane-changing problem of AVs, where the input-state data of the AV are used to generate a near-optimal lane-changing controller via an approximate/adaptive dynamic programming (ADP) technique. In this type of complex lane-changing maneuver, the lateral dynamics depend on the longitudinal velocity of the vehicle. If the longitudinal velocity is assumed constant, a linear parameter-invariant model can be used; however, assuming constant velocity during a lane-changing maneuver is not realistic and may increase the risk of accidents, especially in the case of lane abortion when the surrounding vehicles are not cooperative. Therefore, the dynamics of the AV are modeled as a linear parameter-varying system, which leaves two challenges for the lane-changing controller design: parameter variation and unknown dynamics. By combining gain scheduling and ADP techniques, we propose a learning-based control algorithm that generates a near-optimal lane-changing controller without requiring an accurate dynamic model of the AV. The inclusion of a gain scheduling approach with ADP makes the controller applicable to nonlinear and/or parameter-varying AV dynamics. The stability of the learning-based gain scheduling controller is rigorously proved. Moreover, a data-driven lane-changing decision-making algorithm is introduced that makes the AV perform a lane abortion if safety conditions are violated during a lane change. Finally, the proposed learning-based gain scheduling controller design algorithm and the lane-changing decision-making methodology are numerically validated using MATLAB, SUMO simulations, and the NGSIM dataset.
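As a rough illustration of the gain-scheduling structure described above (not the paper's algorithm), the sketch below computes a feedback gain at a few longitudinal-velocity grid points and interpolates between them online as the velocity varies. The toy lateral model, the velocity grid, and the use of a model-based Riccati solve in place of the data-driven ADP learning step are all assumptions made for brevity.

```python
# Minimal gain-scheduling sketch: one gain per velocity grid point,
# interpolated online. In the paper the per-velocity gains are learned from
# input-state data via ADP; here a model-based LQR solve is used as a
# stand-in so the sketch stays self-contained.
import numpy as np
from scipy.linalg import solve_continuous_are

def lateral_model(vx):
    """Toy velocity-dependent lateral dynamics, x = [lateral error, error rate]."""
    A = np.array([[0.0, 1.0],
                  [0.0, -2.0 / vx]])   # damping term shrinks as speed grows
    B = np.array([[0.0],
                  [1.0]])
    return A, B

Q, R = np.diag([10.0, 1.0]), np.array([[1.0]])
v_grid = np.array([5.0, 10.0, 15.0, 20.0])       # m/s scheduling points

# Stand-in for the ADP learning step: one LQR gain per grid velocity.
K_grid = []
for vx in v_grid:
    A, B = lateral_model(vx)
    P = solve_continuous_are(A, B, Q, R)
    K_grid.append(np.linalg.solve(R, B.T @ P))   # K = R^{-1} B^T P
K_grid = np.array(K_grid)                        # shape (len(v_grid), 1, 2)

def scheduled_gain(vx):
    """Linearly interpolate the gain between neighbouring grid velocities."""
    vx = np.clip(vx, v_grid[0], v_grid[-1])
    return np.array([[np.interp(vx, v_grid, K_grid[:, 0, j]) for j in range(2)]])

x = np.array([[0.5], [0.0]])       # current lateral error and error rate
u = -scheduled_gain(12.3) @ x      # scheduled state feedback at 12.3 m/s
print(u)
```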
- PAR ID: 10626556
- Publisher / Repository: Elsevier
- Date Published:
- Journal Name: Transportation Research Part B: Methodological
- Volume: 187
- Issue: C
- ISSN: 0191-2615
- Page Range / eLocation ID: 103026
- Subject(s) / Keyword(s): Autonomous vehicle; Lane changing; Learning-based optimal control; Gain scheduling
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
This paper proposes a novel learning-based adaptive optimal controller design method for a class of continuous-time linear time-delay systems. A key strategy is to exploit state-of-the-art reinforcement learning (RL) and adaptive dynamic programming (ADP) techniques and propose a data-driven method to learn the near-optimal controller without precise knowledge of the system dynamics. Specifically, a value iteration (VI) algorithm is proposed to solve the infinite-dimensional Riccati equation for the linear quadratic optimal control problem of time-delay systems using finite samples of input-state trajectory data. It is rigorously proved that the proposed VI algorithm converges to the near-optimal solution. Compared with the previous literature, notable features of the proposed VI algorithm are that it is developed directly for continuous-time systems without discretization and that an initial admissible controller is not required for its implementation. The efficacy of the proposed methodology is demonstrated by two practical examples of metal cutting and autonomous driving.
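For reference, one common finite-dimensional form of continuous-time value iteration for the LQR problem is the recursion below, which results like this generalize to the infinite-dimensional time-delay setting. The notation is illustrative and not taken from the paper.

```latex
% Generic continuous-time VI recursion for LQR (illustrative form):
P_{j+1} = P_j + \epsilon_j \left( A^\top P_j + P_j A + Q
          - P_j B R^{-1} B^\top P_j \right),
\qquad P_0 = P_0^\top \succeq 0,
```

with suitably chosen diminishing step sizes $\epsilon_j > 0$. Unlike policy iteration, no initial stabilizing gain is needed, and data-driven variants estimate the bracketed term from input-state trajectories instead of using $A$ and $B$ directly.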
This paper introduces a learning-based optimal control strategy enhanced with non-model-based state estimation to manage the complexities of lane-changing maneuvers in autonomous vehicles. Traditional approaches often depend on comprehensive system state information, which may not always be accessible or accurate due to dynamic traffic environments and sensor limitations. Our methodology adapts to these uncertainties and to sensor noise by iteratively refining its control policy based on real-time sensor data and reconstructed states. We implemented an experimental setup featuring a scaled vehicle equipped with GPS, IMUs, and cameras, all processed through an Nvidia Jetson AGX Xavier board. This approach is pivotal because it addresses the limitations of simulations, which often fail to capture the complexity of dynamic real-world conditions. The results from real-world experiments demonstrate that our learning-based control system achieves smoother and more consistent lane-changing behavior than traditional direct-measurement approaches. This paper underscores the effectiveness of integrating Adaptive Dynamic Programming (ADP) with state estimation techniques, as demonstrated through small-scale experiments. These experiments provide a practical validation platform that captures real-world complexities, representing a significant advancement in the control systems used for autonomous driving.
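The skeleton below only illustrates the general estimate-control-collect loop implied by such a setup; the estimator, plant, gain values, and function names are hypothetical placeholders and do not reproduce the paper's non-model-based estimator or its ADP update.

```python
# Illustrative control-loop skeleton (hypothetical names and toy plant):
# fuse noisy sensor samples into a state estimate, apply the current
# feedback gain, and buffer (state, input) pairs for a periodic learning update.
import numpy as np

def estimate_state(measurements, window=5):
    """Stand-in estimator: moving average over the last few noisy samples."""
    return np.mean(measurements[-window:], axis=0)

def control_step(K, x_hat):
    """State feedback with the most recently learned gain."""
    return -K @ x_hat

rng = np.random.default_rng(0)
K = np.array([[3.0, 1.5]])                 # initial (not yet optimal) gain
buffer, measurements = [], []
x_true = np.array([0.5, 0.0])              # lateral error, error rate

for t in range(200):
    measurements.append(x_true + 0.05 * rng.standard_normal(2))  # sensor noise
    x_hat = estimate_state(np.array(measurements))
    u = control_step(K, x_hat)
    buffer.append((x_hat.copy(), u.copy()))          # data for the learning update
    # toy double-integrator-like plant, Euler step of 0.02 s
    x_true = x_true + 0.02 * np.array([x_true[1], u.item()])
    # a real implementation would refine K from `buffer` every N steps
```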
This paper studies the problem of data-driven combined longitudinal and lateral control of autonomous vehicles (AVs) such that the AV can stay within a safe but minimum distance from its leading vehicle and, at the same time, in the lane. Most existing methods for combined longitudinal and lateral control are either model-based or developed by purely data-driven methods such as reinforcement learning. Traditional model-based control approaches are insufficient to address the adaptive optimal control design issue for AVs operating in dynamically changing environments and subject to model uncertainty. Moreover, conventional reinforcement learning approaches require a large volume of data and cannot guarantee the stability of the vehicle. These limitations are addressed by integrating advanced control theory with reinforcement learning techniques. To be more specific, by utilizing adaptive dynamic programming techniques and using motion data collected from the vehicles, a policy iteration algorithm is proposed such that the control policy is iteratively optimized in the absence of precise knowledge of the AV's dynamical model. Furthermore, the stability of the AV is guaranteed with the control policy generated at each iteration of the algorithm. The efficiency of the proposed approach is validated by SUMO simulation, a microscopic traffic simulation platform, for different traffic scenarios.
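One standard data-driven relation used in continuous-time ADP policy iteration (shown here in generic LQR notation, not necessarily the exact form used in the paper) lets $P_k$ and the improved gain $K_{k+1}$ be solved for jointly from measured input-state trajectories, without knowledge of $A$ or $B$:

```latex
% Off-policy data-driven policy-iteration relation along a measured trajectory:
x(t+\delta t)^\top P_k\, x(t+\delta t) - x(t)^\top P_k\, x(t)
  = -\int_{t}^{t+\delta t} x^\top \left( Q + K_k^\top R K_k \right) x \, d\tau
    + 2 \int_{t}^{t+\delta t} \left( u + K_k x \right)^\top R\, K_{k+1} x \, d\tau .
```

Collecting this identity over sufficiently many time intervals yields a least-squares problem whose solution provides $P_k$ and $K_{k+1}$ at each iteration.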
This paper studies the infinite-horizon adaptive optimal control of continuous-time linear periodic (CTLP) systems. A novel value iteration (VI) based off-policy ADP algorithm is proposed for a general class of CTLP systems, so that approximate optimal solutions can be obtained directly from the collected data, without exact knowledge of the system dynamics. Under mild conditions, proofs of uniform convergence of the proposed algorithm to the optimal solutions are given for both the model-based and model-free cases. The VI-based ADP algorithm is able to find suboptimal controllers without assuming knowledge of an initial stabilizing controller. Application to the optimal control of a triple inverted pendulum subjected to a periodically varying load demonstrates the feasibility and effectiveness of the proposed method.
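For orientation, the model-based benchmark for this problem is the periodic Riccati differential equation below (generic notation, not taken from the paper); a data-driven VI scheme approximates its stabilizing periodic solution directly from trajectory data:

```latex
% Periodic Riccati differential equation for the linear-quadratic problem
% with T-periodic data A(t), B(t), Q(t), R(t):
-\dot{P}(t) = A(t)^\top P(t) + P(t) A(t) + Q(t)
              - P(t) B(t) R(t)^{-1} B(t)^\top P(t),
\qquad P(t+T) = P(t),
```

with the associated optimal feedback $u(t) = -R(t)^{-1} B(t)^\top P(t)\, x(t)$.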

