Building heating, ventilation, and air conditioning (HVAC) systems account for nearly half of building energy consumption and 20% of total energy consumption in the US. Their operation is also crucial for ensuring the physical and mental health of building occupants. Compared with traditional model-based HVAC control methods, recent model-free deep reinforcement learning (DRL) based methods have shown good performance while not requiring the development of detailed and costly physical models. However, these model-free DRL approaches often suffer from long training times to reach good performance, which is a major obstacle for their practical deployment. In this work, we present a systematic approach to accelerate online reinforcement learning for HVAC control by taking full advantage of knowledge from domain experts in various forms. Specifically, the algorithm stages include learning expert functions from existing abstract physical models and from historical data via offline reinforcement learning, integrating the expert functions with rule-based guidelines, conducting training guided by the integrated expert function, and performing policy initialization from the distilled expert function. Moreover, to ensure that the learned DRL-based HVAC controller can effectively keep room temperature within the comfortable range for occupants, we design a runtime shielding framework that reduces the temperature violation rate and incorporate the learned controller into it. Experimental results demonstrate up to 8.8X speedup in DRL training from our approach over previous methods, with a low temperature violation rate.
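The runtime shielding idea in the abstract lends itself to a compact illustration: a wrapper that overrides the learned policy whenever room temperature approaches the comfort bounds. The sketch below is a minimal, hypothetical rendering; the comfort band, safety margin, and action convention are placeholder assumptions, not the paper's actual shield design.

```python
# Minimal sketch of a runtime temperature shield wrapped around a learned
# HVAC policy. All thresholds and the action convention (0 = no cooling,
# 1 = max cooling) are hypothetical; the paper's shield may differ.

def shielded_action(policy_action: float, room_temp: float,
                    comfort_low: float = 20.0, comfort_high: float = 24.0,
                    margin: float = 0.5) -> float:
    """Return the DRL action unless temperature nears a comfort bound."""
    if room_temp >= comfort_high - margin:
        return 1.0  # force maximum cooling before the upper bound is violated
    if room_temp <= comfort_low + margin:
        return 0.0  # cut cooling before the lower bound is violated
    return policy_action  # safe region: trust the learned controller
```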
- Award ID(s):
- 2243931
- PAR ID:
- 10536472
- Publisher / Repository:
- 2024 ASHRAE Winter Conference
- Date Published:
- Format(s): Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
In recent years, the focus has been on enhancing user comfort in commercial buildings while cutting energy costs. Efforts have mainly centered on improving HVAC systems, the central building control system. However, HVAC alone cannot ensure occupant comfort: lighting, blinds, and windows, often overlooked, also affect energy use and comfort. This paper introduces a holistic approach to managing the delicate balance between energy efficiency and occupant comfort in commercial buildings. We present OCTOPUS, a system employing a deep reinforcement learning (DRL) framework with data-driven techniques to optimize control sequences for all building subsystems, including HVAC, lighting, blinds, and windows. OCTOPUS's DRL architecture features a unique reward function facilitating the exploration of tradeoffs between energy usage and user comfort, effectively addressing the high-dimensional control problem arising from interactions among these four building subsystems. To meet data training requirements, we emphasize the importance of calibrated simulations that closely replicate target-building operational conditions. We train OCTOPUS using 10-year weather data and a calibrated building model in the EnergyPlus simulator. Extensive simulations demonstrate that OCTOPUS achieves substantial energy savings, outperforming state-of-the-art rule-based and DRL-based methods by 14.26% and 8.1%, respectively, in a LEED Gold Certified building while maintaining desired human comfort levels.
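The reward described for OCTOPUS suggests a weighted tradeoff between energy use and several comfort terms. The following sketch shows one plausible shape for such a function; the weights and the three comfort deviation terms (thermal, visual, air quality) are illustrative assumptions, not the reward OCTOPUS actually uses.

```python
# Hypothetical reward of the energy-vs-comfort form described for OCTOPUS.
# Weights and comfort terms are placeholders, not the paper's definition.

def reward(energy_kwh: float, thermal_dev: float, visual_dev: float,
           iaq_dev: float, w_energy: float = 0.5, w_comfort: float = 0.5) -> float:
    """Negative cost: penalize energy use and squared comfort deviations."""
    comfort_penalty = thermal_dev**2 + visual_dev**2 + iaq_dev**2
    return -(w_energy * energy_kwh + w_comfort * comfort_penalty)
```

A single scalar reward of this shape lets the agent trade energy against comfort across all four subsystems by adjusting the two weights.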
Safe operation of autonomous mobile robots in close proximity to humans creates a need for enhanced trajectory tracking (with low tracking errors). Linear optimal control techniques such as the Linear Quadratic Regulator (LQR) and Model Predictive Control (MPC) have been used successfully for low-speed applications, leveraging their model-based methodology with manageable computational demands. However, model and parameter uncertainties or other unmodeled nonlinearities may cause poor control actions and constraint violations. Nonlinear MPC has emerged as an alternative optimal-control approach but must overcome real-time deployment challenges (including fast sampling times, design complexity, and limited computational resources). In recent years, optimal control-based deployments have benefited enormously from the ability of Deep Neural Networks (DNNs) to serve as universal function approximators, enabling deployments in a plethora of previously inaccessible applications; however, open questions around generalizability, benchmarking, and systematic verification and validation have emerged. This paper presents a novel approach to fusing Deep Reinforcement Learning (DRL)-based longitudinal control with a traditional PID lateral controller for autonomous navigation. Our approach follows: (i) generation of an adequate-fidelity simulation scenario via a Real2Sim approach; (ii) training a DRL agent within this framework; (iii) testing performance and generalizability on alternate scenarios. We use an initially tuned set of lateral PID controller gains to observe the vehicle response over a range of velocities. We then use a DRL framework to generate policies for an optimal longitudinal controller that successfully complements the lateral PID controller to give the best tracking performance for the vehicle.
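The fusion of a learned longitudinal controller with a classical lateral controller can be sketched as below. The `PID` class, the `drl_policy` interface, and all gains are hypothetical stand-ins for the paper's tuned components.

```python
# Sketch of DRL longitudinal control fused with a PID lateral controller.
# Gains, timestep, and the drl_policy interface are hypothetical.

class PID:
    def __init__(self, kp: float, ki: float, kd: float, dt: float = 0.05):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error: float) -> float:
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def control_step(drl_policy, obs, cross_track_error: float, pid: PID):
    """One control cycle: the DRL agent picks throttle, the PID picks steering."""
    throttle = drl_policy(obs)               # learned longitudinal command
    steering = pid.step(cross_track_error)   # classical lateral correction
    return throttle, steering
```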
Reinforcement learning (RL) methods can be used to develop a controller for heating, ventilation, and air conditioning (HVAC) systems that both saves energy and ensures high occupant thermal comfort. However, existing works typically require on-policy data to train an RL agent and do not consider occupants' personalized thermal preferences, which limits their applicability in real-world scenarios. This paper designs a high-performance model-based offline RL algorithm for personalized HVAC systems. The proposed algorithm can quickly adapt to different occupants' thermal preferences with a few thermal feedback samples, efficiently guaranteeing high personalized thermal comfort. First, we use a meta-supervised learning algorithm to train an occupant's thermal preference model. Then, we train an ensemble of neural networks to predict the thermal states of the considered zone. In addition, the obtained ensemble networks can indicate the regions of the state and action spaces covered by the offline dataset. With the personalized thermal preference model updated via meta-testing, model-based RL is used to derive the optimal HVAC controller. Since the proposed algorithm only requires offline datasets and a few online thermal feedback samples for training, it contributes to a more practical deployment of RL algorithms in HVAC systems. We use the ASHRAE database II to verify the effectiveness and advantage of the meta-learning algorithm for modeling different occupants' thermal preferences. Numerical simulations in the EnergyPlus environment demonstrate that the proposed algorithm can guarantee personalized thermal preferences with only a slight (1.91%) increase in power consumption compared with a model-based RL algorithm using on-policy data aggregation.
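Two ingredients named in this abstract, an ensemble dynamics model and a coverage check based on ensemble disagreement, can be sketched as follows. The model interface and the disagreement threshold are assumptions for illustration, not the paper's implementation.

```python
# Sketch: an ensemble of learned dynamics models whose disagreement flags
# state-action pairs outside the offline dataset's coverage.
# The model interface and threshold are hypothetical.

import numpy as np

def ensemble_predict(models, state, action):
    """Mean next-state prediction and worst-case per-dimension disagreement."""
    preds = np.stack([m(state, action) for m in models])  # (n_models, state_dim)
    return preds.mean(axis=0), float(preds.std(axis=0).max())

def in_offline_support(models, state, action, max_std: float = 0.1) -> bool:
    """Treat high ensemble disagreement as 'outside the offline data'."""
    _, disagreement = ensemble_predict(models, state, action)
    return disagreement <= max_std
```

Restricting model-based rollouts to state-action pairs that pass such a check is a common way to keep offline RL from exploiting model error.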
Stop-and-go traffic poses significant challenges to the efficiency and safety of traffic operations, and its impacts and working mechanism have attracted much attention. Recent studies have shown that Connected and Automated Vehicles (CAVs) with carefully designed longitudinal control have the potential to dampen the stop-and-go wave, based on simulated vehicle trajectories. In this study, Deep Reinforcement Learning (DRL) is adopted to control the longitudinal behavior of CAVs, and real-world vehicle trajectory data is utilized to train the DRL controller. The setup considers a Human-Driven (HD) vehicle followed by a CAV, which is in turn followed by a platoon of HD vehicles. This experimental design tests how the CAV can help dampen the stop-and-go wave generated by the lead HD vehicle and contribute to smoothing the following HD vehicles' speed profiles. The DRL controller is trained using real-world vehicle trajectories and evaluated using SUMO simulation. The results show that the DRL control decreases the speed oscillation of the CAV by 54% and that of the following HD vehicles by 8%-28%. Significant fuel consumption savings are also observed. Additionally, the results suggest that CAVs may act as traffic stabilizers if they choose to behave slightly altruistically.
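A reward of the kind such a controller might be trained with, penalizing the CAV's own oscillation, its gap error, and (slightly altruistically) the speed variance of the vehicles behind it, is sketched below. All weights and terms are hypothetical, not the study's actual reward.

```python
# Hypothetical reward for a wave-damping CAV follower: penalize harsh
# acceleration (oscillation), gap error, and followers' speed variance.

def cav_reward(accel: float, gap: float, desired_gap: float,
               follower_speed_var: float = 0.0,
               w_smooth: float = 1.0, w_gap: float = 0.1,
               w_altruism: float = 0.05) -> float:
    """Negative cost; the small altruism weight nudges the CAV to smooth
    the platoon behind it, not just its own trajectory."""
    return -(w_smooth * accel**2
             + w_gap * (gap - desired_gap)**2
             + w_altruism * follower_speed_var)
```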