
Title: Reinforcement Learning-Based Home Energy Management System for Resiliency
With the increase in the frequency of natural disasters such as hurricanes that disrupt the supply from the grid, there is a greater need for resiliency in electric supply. Rooftop solar photovoltaic (PV) panels along with batteries can provide resiliency to a house in a blackout due to a natural disaster. Our previous work showed that intelligence can reduce the size of a PV+battery system for the same level of post-blackout service compared to a conventional system that does not employ intelligent control. The intelligent controller proposed is based on model predictive control (MPC), which has two main challenges. One, it requires simple yet accurate models, as it involves real-time optimization. Two, the discrete actuation for residential loads (on/off) makes the underlying optimization problem a mixed-integer program (MIP), which is challenging to solve. An attractive alternative to MPC is reinforcement learning (RL), as the real-time control computation is both model-free and simple. These advantages come with certain trade-offs: RL requires computationally expensive offline learning, and its performance is sensitive to various design choices. In this work, we propose an RL-based controller. We compare its performance with the MPC controller proposed in our prior work and with a non-intelligent baseline controller. The RL controller is found to provide resiliency performance similar to MPC, by commanding critical loads and batteries, with a significant reduction in computational effort.
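The on/off actuation that makes the MPC problem a mixed-integer program is exactly what tabular RL handles naturally, since the action set is a small discrete list. As a rough illustration only (the state discretization, toy battery dynamics, and reward below are invented and are not the controller from the paper), a Q-learning loop for one critical load and one battery might look like:

```python
import random

# Toy illustration: tabular Q-learning for discrete (on/off) load and battery
# commands during a blackout. All dynamics and rewards are hypothetical.

ACTIONS = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (critical load on?, charge battery?)

def simulate_step(soc, pv, action):
    """Toy battery dynamics: PV charges the battery; the load drains it."""
    load_on, charge = action
    soc = min(10, max(0, soc + (pv if charge else 0) - load_on))
    reward = 2 * load_on - (1 if soc == 0 else 0)  # serve load, avoid empty battery
    return soc, reward

def train(episodes=500, alpha=0.2, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {}  # state (soc, pv) -> list of Q-values, one per action
    for _ in range(episodes):
        soc = 5
        for hour in range(24):
            pv = 1 if 6 <= hour < 18 else 0  # crude day/night PV signal
            q = Q.setdefault((soc, pv), [0.0] * len(ACTIONS))
            # epsilon-greedy action selection
            a = rng.randrange(len(ACTIONS)) if rng.random() < eps else q.index(max(q))
            soc2, r = simulate_step(soc, pv, ACTIONS[a])
            pv2 = 1 if 6 <= (hour + 1) % 24 < 18 else 0
            q2 = Q.setdefault((soc2, pv2), [0.0] * len(ACTIONS))
            q[a] += alpha * (r + gamma * max(q2) - q[a])  # Q-learning update
            soc = soc2
    return Q
```

Once trained, the online computation is just a table lookup and an argmax over four actions, which is the source of the runtime advantage over solving a MIP at every step.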
Award ID(s):
1934322 1646229
Publication Date:
Journal Name:
American Control Conference
Page Range or eLocation-ID:
1358 to 1364
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract An autonomous adaptive model predictive control (MPC) architecture is presented for control of heating, ventilation, and air conditioning (HVAC) systems to maintain indoor temperature while reducing energy use. Although equipment use and occupancy change with time, existing MPC methods are not capable of automatically relearning models and computing control decisions reliably for extended periods without intervention from a human expert. We seek to address this weakness. Two major features are embedded in the proposed architecture to enable autonomy: (i) a system identification algorithm from our prior work that periodically re-learns building dynamics and unmeasured internal heat loads from data without requiring re-tuning by experts; the estimated model is guaranteed to be stable and has desirable physical properties irrespective of the data; (ii) an MPC planner with a convex approximation of the original nonconvex problem. The planner uses a descent and convergent method, with the underlying optimization problem being feasible and convex. A yearlong simulation with a realistic plant shows that both features of the proposed architecture, the periodic model and disturbance update and the convexification of the planning problem, are essential to obtain performance improvement over a commonly used baseline controller. Without these features, long-term energy savings from MPC can be small, while with them, the savings from MPC become substantial.
  2. Model predictive control (MPC) has drawn a considerable amount of attention in automotive applications during the last decade, partially due to its systematic capacity for treating system constraints. Despite this broad acknowledgement, two intrinsic shortcomings of this optimization-based control strategy remain, namely the extensive online calculation burden and the complex tuning process, which hinder MPC from being applied more widely. Different methods have been proposed to tackle these two drawbacks, but the majority of these approaches treat the two issues independently. In fact, parameter tuning has double-sided effects on both the controller performance and the real-time computational burden. Due to the lack of theoretical tools for globally analyzing the complex conflicts among MPC parameter tuning, controller performance optimization, and easing of the computational burden, a look-up table-based online parameter selection method is proposed in this paper to help a vehicle track its reference path under both stability and computational-capacity constraints. MATLAB-CarSim conjoint simulations show the effectiveness of the proposed strategy.
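The look-up table idea can be sketched in a few lines: offline, each operating region is assigned MPC parameters that satisfy the stability and computation-time constraints; online, selection is a cheap table scan per control step. The speed bands and parameter values below are invented for illustration and are not taken from the paper:

```python
# Hypothetical look-up table mapping an operating condition (vehicle speed
# band) to an MPC tuning: prediction horizon N and tracking weight q.

PARAM_TABLE = [
    # (upper speed bound in m/s, horizon N, tracking weight q)
    (10.0, 30, 1.0),           # low speed: a long horizon is affordable
    (20.0, 20, 2.0),           # mid speed: moderate horizon and weight
    (float("inf"), 10, 4.0),   # high speed: short horizon to meet the deadline
]

def select_mpc_params(speed):
    """Return (N, q) for the current speed; O(len(table)) per control step."""
    for max_speed, horizon, weight in PARAM_TABLE:
        if speed <= max_speed:
            return horizon, weight
    raise ValueError("table must cover all speeds")
```

The online cost is a handful of comparisons, so the tuning adaptation itself adds essentially nothing to the per-step computational burden.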
  3. Abstract Flooding in coastal cities is increasing due to climate change and sea-level rise, stressing the traditional stormwater systems these communities rely on. Automated real-time control (RTC) of these systems can improve performance, and creating control policies for smart stormwater systems is an active area of study. This research explores reinforcement learning (RL) to create control policies that mitigate flood risk. RL is trained using a model of hypothetical urban catchments with a tidal boundary and two retention ponds with controllable valves. RL's performance is compared to the passive system, a model predictive control (MPC) strategy, and a rule-based control strategy (RBC). RL learns to proactively manage pond levels using current and forecast conditions and reduces flooding by 32% over the passive system. Compared to the MPC approach, which uses a physics-based model and a genetic algorithm, RL achieves nearly the same flood reduction, just 3% less than MPC, with a significant 88× speedup in runtime. Compared to RBC, RL quickly learns similar control strategies and reduces flooding by an additional 19%. This research demonstrates that RL can effectively control a simple system and offers a computationally efficient method that could scale to RTC of more complex stormwater systems.
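The advantage of the proactive, forecast-aware behavior that RL learns here can be seen in a toy pond mass balance (all capacities, inflows, and thresholds below are invented; the study uses a detailed catchment model with a tidal boundary): a policy that pre-releases water ahead of forecast inflow floods less than a passive outlet rule.

```python
# Toy retention-pond mass balance: compare a passive outlet rule with a
# proactive, forecast-aware rule. All numbers are hypothetical.

CAPACITY = 100.0     # pond capacity (arbitrary volume units)
MAX_RELEASE = 10.0   # release per step at a fully open valve

def step(level, inflow, valve_open):
    """One timestep: add inflow, release through the valve, report overflow."""
    level = max(0.0, level + inflow - MAX_RELEASE * valve_open)
    overflow = max(0.0, level - CAPACITY)
    return min(level, CAPACITY), overflow

def run(policy, inflows):
    """Simulate a storm; return total flooded volume under the given policy."""
    level, flooded = 50.0, 0.0
    for t, inflow in enumerate(inflows):
        level, overflow = step(level, inflow, policy(level, inflows, t))
        flooded += overflow
    return flooded

def passive(level, forecast, t):
    """Open the valve only once the pond is nearly full."""
    return 0.0 if level < 90.0 else 1.0

def proactive(level, forecast, t):
    """Pre-release when heavy inflow is forecast within the next 3 steps."""
    heavy_rain_soon = any(f > 15.0 for f in forecast[t:t + 3])
    return 1.0 if heavy_rain_soon or level > 90.0 else 0.0

storm = [0.0] * 5 + [30.0] * 4 + [0.0] * 5  # calm, burst, calm
```

Running both policies on the same storm, the proactive rule floods strictly less because it has drawn the pond down before the burst arrives; an RL policy conditioned on forecasts can discover this behavior without a hand-written rule.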
  4. Model predictive control (MPC) has become more relevant to vehicle dynamics control due to its inherent capacity for treating system constraints. However, the online optimization in MPC imposes an extensive computational burden on today's onboard microprocessors. Several methods have been proposed to alleviate the MPC computational load. Among them, online successive system linearization and the resulting linear time-varying model predictive controller (LTVMPC) is one of the most popular options. Nevertheless, such online successive linearization approximates the original (nonlinear) system by a linear one, which inevitably introduces extra modeling errors and therefore reduces MPC performance. If, however, the controlled system possesses the "differential flatness" property, it can be exactly linearized and an equivalent linear model obtained. This linear model retains all the nonlinear features of the original system and can be used to design a flatness-based model predictive controller (FMPC). CarSim-Simulink joint simulations demonstrate that the proposed FMPC substantially outperforms a classical LTVMPC in terms of path-tracking performance for autonomous vehicles.
  5. Low-voltage microgrid systems are characterized by high sensitivity to both active and reactive power for voltage support. In addition, the operational conditions of microgrids connected to active distribution systems are time-varying. Thus, the ideal controller for voltage support must be flexible enough to handle technical and operational constraints. This paper proposes a model predictive control (MPC) approach to provide dynamic voltage support using energy storage systems. The approach uses a simplified predictive model of the system along with operational constraints to solve an online finite-horizon optimization problem. Control signals are then computed such that the defined cost function is minimized. By proper selection of the MPC weighting parameters, the quality of service provided can be adjusted to achieve the desired performance. A simulation study in MATLAB/Simulink validates the proposed approach for a simplified version of a 100 kVA, 208 V microgrid using typical parameters. Results show that the performance of the voltage support can be adjusted depending on the choice of weights and constraints in the controller.