
Title: Control of Hybrid Electric Vehicle Powertrain Using Offline-Online Hybrid Reinforcement Learning
Hybrid electric vehicles can achieve better fuel economy than conventional vehicles by utilizing multiple power sources. While these power sources have traditionally been controlled by rule-based or optimization-based algorithms, recent studies have shown that machine learning-based control algorithms such as online Deep Reinforcement Learning (DRL) can also control them effectively. However, optimizing and training an online DRL-based powertrain control strategy can be very time- and resource-intensive. In this paper, a new offline–online hybrid DRL strategy is presented in which offline vehicle data are exploited to build an initial model and an online learning algorithm then explores new control policies to further improve fuel economy. In this manner, the agent is expected to learn the environment, which consists of the vehicle dynamics in a given driving condition, more quickly than purely online algorithms, which learn the optimal control policy by interacting with the vehicle model from zero initial knowledge. Simulation results show that, by incorporating a priori offline knowledge, the proposed approach not only accelerates and stabilizes the learning process but also achieves better fuel economy than online-only learning algorithms.
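The offline-then-online idea in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it swaps deep networks for tabular Q-learning and uses a hypothetical five-level battery state-of-charge (SOC) toy model in place of a vehicle simulator. All names, rewards, and parameters below (`step`, `pretrain_offline`, `finetune_online`, the fuel costs, `GAMMA`) are assumptions made for illustration.

```python
import random

# Hypothetical toy powertrain: state = discretized battery SOC level (0..4),
# action 0 = run the engine (burns fuel, charges), action 1 = drive electric.
N_SOC, ACTIONS, GAMMA = 5, (0, 1), 0.9

def step(soc, action):
    """Return (next_soc, reward); reward is negative fuel cost (assumed values)."""
    if action == 0:
        return min(soc + 1, N_SOC - 1), -1.0   # engine: costs fuel, raises SOC
    if soc == 0:
        return 0, -5.0                         # electric on empty battery: penalty
    return soc - 1, 0.0                        # electric: free, lowers SOC

def collect_offline_data(n, rng):
    """Stand-in for logged vehicle data: random (state, action) samples."""
    data = []
    for _ in range(n):
        s, a = rng.randrange(N_SOC), rng.choice(ACTIONS)
        s2, r = step(s, a)
        data.append((s, a, r, s2))
    return data

def pretrain_offline(data, sweeps=100, alpha=0.5):
    """Offline phase: repeated Q-learning sweeps over the fixed dataset."""
    q = [[0.0, 0.0] for _ in range(N_SOC)]
    for _ in range(sweeps):
        for s, a, r, s2 in data:
            q[s][a] += alpha * (r + GAMMA * max(q[s2]) - q[s][a])
    return q

def finetune_online(q, steps, rng, alpha=0.1, eps=0.1):
    """Online phase: epsilon-greedy Q-learning starting from the offline table."""
    q = [row[:] for row in q]                  # do not mutate the offline result
    s = rng.randrange(N_SOC)
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.choice(ACTIONS)            # explore
        else:
            a = max(ACTIONS, key=lambda x: q[s][x])  # exploit
        s2, r = step(s, a)
        q[s][a] += alpha * (r + GAMMA * max(q[s2]) - q[s][a])
        s = s2
    return q

def greedy_policy(q):
    """Best action for each SOC level under the current Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_SOC)]

rng = random.Random(0)
q_offline = pretrain_offline(collect_offline_data(200, rng))
q_hybrid = finetune_online(q_offline, steps=500, rng=rng)
```

In this toy setting the online phase starts from a warm Q-table instead of zeros, which is the intuition behind the faster, more stable learning the abstract reports; the sketch makes no claim about the size of that effect for real DRL agents.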
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hybrid electric vehicles (HEVs) employ a hybrid propulsion system that combines the energy efficiency of an electric motor (EM) with the long driving range of an internal combustion engine (ICE), thereby achieving higher fuel economy as well as convenience compared with conventional ICE vehicles. However, the relatively complicated powertrain structures of HEVs necessitate an effective power management policy to determine the power split between the ICE and the EM. In this work, we propose a deep reinforcement learning (DRL) framework for HEV power management with the aim of improving fuel economy. The DRL technique comprises an offline deep neural network construction phase and an online deep Q-learning phase. Unlike traditional reinforcement learning, DRL can handle the high-dimensional state and action spaces of the actual decision-making process, making it suitable for the HEV power management problem. Enabled by the DRL technique, the derived HEV power management policy is close to optimal, fully model-free, and independent of prior knowledge of driving cycles. Simulation results based on an actual vehicle setup over real-world and testing driving cycles demonstrate the effectiveness of the proposed framework in optimizing HEV fuel economy.
  2. This paper investigates optimal power management of a fuel cell hybrid small unmanned aerial vehicle (sUAV) from the perspective of endurance (time of flight) maximization in a stochastic environment. Stochastic drift counteraction optimal control is exploited to obtain an optimal policy for power management that coordinates the operation of the fuel cell and battery to maximize the expected flight time while accounting for the limits on the rate of change of fuel cell power output and the orientation dependence of fuel cell efficiency. The proposed power management strategy accounts for known statistics in transitions of propeller power and climb angle during the mission, but does not require an exact preview of their time histories. The optimal control policy is generated offline using value iterations implemented in Cython, demonstrating an order of magnitude speedup compared to MATLAB. It is also shown that the value iterations can be further sped up using a discount factor, but at the cost of decreased performance. Simulation results for a 1.5 kg sUAV are reported that illustrate the optimal coordination between the fuel cell and the battery during aircraft maneuvers, including a turnpike in the battery state of charge (SOC) trajectory. As the fuel cell is not able to support fast changes in power output, the optimal policy is shown to charge the battery to the turnpike value if starting from a low initial SOC value. If starting from a high SOC value, the battery energy is used until a turnpike value of the SOC is reached, with further discharge delayed to later in the flight. For the specific scenarios and simulated sUAV parameters considered, the results indicate the capability of up to 2.7 h of flight time.
  3. While Deep Reinforcement Learning (DRL) has emerged as a de facto approach to many complex experience-driven networking problems, it remains challenging to deploy DRL in real systems. Because of random exploration and half-trained deep neural networks during the online training process, the DRL agent may make unexpected decisions, which may lead to system performance degradation or even a system crash. In this paper, we propose PnP-DRL, an offline-trained, plug-and-play DRL solution that leverages the batch reinforcement learning approach to learn the best control policy from pre-collected transition samples without interacting with the system. After being trained without interaction with systems, our plug-and-play DRL agent starts working seamlessly, without additional exploration or possible disruption of the running systems. We implement and evaluate our PnP-DRL solution on a prevalent experience-driven networking problem, Dynamic Adaptive Streaming over HTTP (DASH). Extensive experimental results show that (1) the existing batch reinforcement learning method has its limits; (2) PnP-DRL significantly outperforms classical adaptive bitrate algorithms in average user Quality of Experience (QoE); and (3) PnP-DRL, unlike state-of-the-art online DRL methods, can be up and running without learning gaps while achieving comparable performance.
  4. Vehicle‐to‐Everything (V2X) communication has been proposed as a potential solution to improve the robustness and safety of autonomous vehicles by improving coordination and removing the barrier of non‐line‐of‐sight sensing. Cooperative Vehicle Safety (CVS) applications depend tightly on the reliability of the underlying data system, which can suffer from loss of information due to inherent issues in its components, such as sensor failures or the poor performance of V2X technologies under dense communication channel load. In particular, information loss affects the target classification module and, subsequently, the performance of the safety application. To enable reliable and robust CVS systems that mitigate the effect of information loss, a Context‐Aware Target Classification (CA‐TC) module coupled with a hybrid learning‐based predictive modeling technique for CVS systems is proposed. The CA‐TC consists of two modules: a Context‐Aware Map (CAM) and a Hybrid Gaussian Process (HGP) prediction system. The vehicle safety applications use the information from the CA‐TC, which makes them more robust and reliable. The CAM leverages vehicles' path history, road geometry, tracking, and prediction, and the HGP provides accurate vehicle trajectory predictions to compensate for data loss (due to communication congestion) or inaccurate sensor measurements. Based on offline real‐world data, a finite bank of driver models that represent the joint dynamics of the vehicle and the driver's behavior is learned. Offline training and online model updates are combined with on‐the‐fly forecasting to account for new possible driver behaviors. Finally, the framework is validated using simulation and realistic driving scenarios to confirm its potential in enhancing the robustness and reliability of CVS systems.
  5. Battery electric vehicles (BEVs) have emerged as a promising alternative to traditional internal combustion engine (ICE) vehicles due to benefits in improved fuel economy, lower operating cost, and reduced emissions. BEVs use electric motors rather than fossil fuels for propulsion and typically store electric energy in lithium-ion cells. With rising concerns over fossil fuel depletion and the impact of ICE vehicles on the climate, electric mobility is widely considered the future of sustainable transportation. BEVs promise to drastically reduce greenhouse gas emissions from the transportation sector. However, mass adoption of BEVs faces major barriers due to consumer worries over several important battery-related issues, such as limited range, long charging time, lack of charging stations, and high initial cost. Existing solutions to overcome these barriers, such as building more charging stations, increasing battery capacity, and stationary vehicle-to-vehicle (V2V) charging, often suffer from prohibitive investment costs, incompatibility with existing BEVs, or long travel delays. In this paper, we propose Peer-to-Peer Car Charging (P2C2), a scalable approach for charging BEVs that alleviates the need for elaborate charging infrastructure. The central idea is to enable BEVs to share charge with each other while in motion through coordination with a cloud-based control system. To revitalize a BEV fleet that is continuously in motion, we introduce Mobile Charging Stations (MoCS), which are high-battery-capacity vehicles used to replenish the overall charge in a vehicle network. Unlike existing V2V charging solutions, the charge sharing in P2C2 takes place while the BEVs are in motion, which aims at minimizing travel time loss. To reduce BEV-to-BEV contact time without increasing manufacturing costs, we propose to use multiple batteries of varying sizes and charge transfer rates. The faster but smaller batteries are used for charge transfer between vehicles, while the slower but larger ones are used for prolonged charge storage. We have designed the overall P2C2 framework and formalized the decision-making process of the cloud-based control system. We have evaluated the effectiveness of P2C2 using a well-characterized simulation platform and observed dramatic improvement in BEV mobility. Additionally, through statistical analysis, we show that a significant reduction in carbon emissions is also possible if MoCS can be powered by renewable energy sources.