Title: A Human-Centered Power Conservation Framework Based on Reverse Auction Theory and Machine Learning
Extreme outside temperatures resulting from heat waves, winter storms, and similar weather-related events trigger Heating, Ventilation, and Air Conditioning (HVAC) systems, resulting in challenging, and potentially catastrophic, peak loads. As a consequence, such extreme temperatures strain power grids and may lead to blackouts. To avoid the financial and personal repercussions of peak loads, demand response and power conservation represent promising solutions. Despite numerous efforts, it has been shown that the current state of the art fails to consider (1) the complexity of human behavior when interacting with power conservation systems and (2) realistic home-level power dynamics. This leads to approaches that are (1) ineffective due to poor long-term user engagement and (2) too abstract to be used in real-world settings. In this article, we propose an auction theory-based power conservation framework for HVAC designed to address this individual human component through a three-fold approach: personalized preferences of power conservation, models of realistic user behavior, and realistic home-level power dynamics. In our framework, the System Operator sends Load Serving Entities (LSEs) the required power saving to tackle peak loads at the residential distribution feeder. Each LSE then prompts its users to provide bids, i.e., personalized preferences of thermostat temperature adjustments, along with corresponding financial compensations. We employ models of realistic user behavior by means of online surveys to gather user bids and evaluate user interaction with such a system. Realistic home-level power dynamics are implemented by our machine learning-based Power Saving Predictions (PSP) algorithm, which calculates the individual power savings in each user’s home resulting from such bids. The PSP algorithm is executed by the users’ Smart Energy Management System (SEMS).
PSP translates temperature adjustments into the corresponding power savings. Then, the SEMS sends bids back to the LSE, which selects the auction winners through an optimization problem called POwer Conservation Optimization (POCO). We prove that POCO is NP-hard, and thus provide two approaches to solve this problem. The first is an optimal pseudo-polynomial algorithm called DYnamic programming Power Saving (DYPS), while the second is a heuristic polynomial-time algorithm called Greedy Ranking AllocatioN (GRAN). EnergyPlus, the high-fidelity and gold-standard energy simulator funded by the U.S. Department of Energy, was used to validate our experiments, as well as to collect data to train PSP. We further evaluate the results of the auctions across several scenarios, showing that, as expected, DYPS finds the optimal solution, while GRAN outperforms recent state-of-the-art approaches.
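The abstract does not give POCO's exact formulation, so the following is only a rough sketch under a loudly stated assumption: winner selection is treated as a 0/1 covering knapsack, where the LSE picks the cheapest set of bids whose predicted savings meet a required target. The names `Bid`, `min_cost_cover_dp`, and `greedy_by_unit_price` are hypothetical, and the two functions are not the paper's DYPS or GRAN; they merely illustrate how a pseudo-polynomial dynamic program and a polynomial-time greedy ranking can differ on the same bids.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    user: str       # hypothetical user identifier
    saving_w: int   # predicted power saving in whole watts (assumed integer)
    price: float    # compensation requested by the user


def min_cost_cover_dp(bids, target_w):
    """Exact minimum-cost selection with total saving >= target_w.

    Pseudo-polynomial: O(len(bids) * target_w) table updates.
    Returns (total_price, users) or None if the target is unreachable.
    """
    inf = float("inf")
    # cost[s] = cheapest price found so far for savings of at least s watts
    # (savings above target_w are capped at target_w).
    cost = [inf] * (target_w + 1)
    chosen = [None] * (target_w + 1)
    cost[0], chosen[0] = 0.0, ()
    for bid in bids:
        # Walk s downward so each bid is used at most once.
        for s in range(target_w, -1, -1):
            if cost[s] == inf:
                continue
            t = min(target_w, s + bid.saving_w)
            c = cost[s] + bid.price
            if c < cost[t]:
                cost[t] = c
                chosen[t] = chosen[s] + (bid.user,)
    if cost[target_w] == inf:
        return None
    return cost[target_w], chosen[target_w]


def greedy_by_unit_price(bids, target_w):
    """Heuristic: accept bids in order of price per watt until the target is met."""
    ranked = sorted((b for b in bids if b.saving_w > 0),
                    key=lambda b: b.price / b.saving_w)
    total_saving, total_price, users = 0, 0.0, []
    for bid in ranked:
        if total_saving >= target_w:
            break
        total_saving += bid.saving_w
        total_price += bid.price
        users.append(bid.user)
    if total_saving < target_w:
        return None
    return total_price, tuple(users)


# Made-up bids for illustration only; not data from the article.
example_bids = [
    Bid("A", 300, 3.0),
    Bid("B", 500, 4.0),
    Bid("C", 400, 6.0),
    Bid("D", 1000, 9.0),
]
```

With a 900 W target on these made-up bids, the dynamic program selects only D at a price of 9.0, whereas the greedy ranking accepts B and then D at a price of 13.0. This is the usual trade-off between an exact pseudo-polynomial method and a faster heuristic, shown here on a toy instance rather than on the article's experiments.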
Award ID(s):
1936131 1943035 2438581
PAR ID:
10545343
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM Digital Library
Date Published:
Journal Name:
ACM Transactions on Cyber-Physical Systems
Volume:
8
Issue:
3
ISSN:
2378-962X
Page Range / eLocation ID:
1 to 26
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper proposes Addax, a fast, verifiable, and private online ad exchange. When a user visits an ad-supported site, Addax runs an auction similar to those of leading exchanges; Addax requests bids, selects the winner, collects payment, and displays the ad to the user. A key distinction is that bids in Addax’s auctions are kept private and the outcome of the auction is publicly verifiable. Addax achieves these properties by adding public verifiability to the affine aggregatable encodings in Prio (NSDI’17) and by building an auction protocol out of them. Our implementation of Addax over WAN with hundreds of bidders can run roughly half the auctions per second as a non-private and non-verifiable exchange, while delivering ads to users in under 600 ms with little additional bandwidth requirements. This efficiency makes Addax the first architecture capable of bringing transparency to this otherwise opaque ecosystem. 
  2. We consider the problem of a single seller repeatedly selling a single item to a single buyer (specifically, the buyer has a value drawn fresh from known distribution $$D$$ in every round). Prior work assumes that the buyer is fully rational and will perfectly reason about how their bids today affect the seller's decisions tomorrow. In this work we initiate a different direction: the buyer simply runs a no-regret learning algorithm over possible bids. We provide a fairly complete characterization of optimal auctions for the seller in this domain. Specifically: 1) If the buyer bids according to EXP3 (or any "mean-based" learning algorithm), then the seller can extract expected revenue arbitrarily close to the expected welfare. This auction is independent of the buyer's valuation $$D$$, but somewhat unnatural as it is sometimes in the buyer's interest to overbid. 2) There exists a learning algorithm $$\mathcal{A}$$ such that if the buyer bids according to $$\mathcal{A}$$ then the optimal strategy for the seller is simply to post the Myerson reserve for $$D$$ every round. 3) If the buyer bids according to EXP3 (or any "mean-based" learning algorithm), but the seller is restricted to "natural" auction formats where overbidding is dominated (e.g. Generalized First-Price or Generalized Second-Price), then the optimal strategy for the seller is a pay-your-bid format with decreasing reserves over time. Moreover, the seller's optimal achievable revenue is characterized by a linear program, and can be unboundedly better than the best truthful auction yet simultaneously unboundedly worse than the expected welfare.
  3. Reinforcement learning (RL) methods can be used to develop a controller for heating, ventilation, and air conditioning (HVAC) systems that both saves energy and ensures high levels of occupant thermal comfort. However, existing works typically require on-policy data to train an RL agent and do not consider occupants’ personalized thermal preferences, which limits their use in real-world scenarios. This paper designs a high-performance model-based offline RL algorithm for personalized HVAC systems. The proposed algorithm can quickly adapt to different occupants’ thermal preferences with a few pieces of thermal feedback, efficiently guaranteeing high personalized thermal comfort levels. First, we use a meta-supervised learning algorithm to train an occupant's thermal preference model. Then, we train an ensemble neural network to predict the thermal states of the considered zone. In addition, the obtained ensemble networks can indicate the regions in the state and action spaces covered by the offline dataset. With the personalized thermal preference model updated via meta-testing, model-based RL is used to derive the optimal HVAC controller. Since the proposed algorithm only requires offline datasets and a small amount of online thermal feedback for training, it contributes to a more practical deployment of RL algorithms in HVAC systems. We use the ASHRAE database II to verify the effectiveness and advantage of the meta-learning algorithm for modeling different occupants’ thermal preferences. Numerical simulations in the EnergyPlus environment demonstrate that the proposed algorithm can guarantee personalized thermal preferences with a slight increase in power consumption of 1.91% compared with the model-based RL algorithm with on-policy data aggregation.
  4. We identify the first static credible mechanism for multi-item additive auctions that achieves a constant factor of the optimal revenue. This is one instance of a more general framework for designing two-part tariff auctions, adapting the duality framework of Cai et al. [CDW16]. Given a (not necessarily incentive compatible) auction format A satisfying certain technical conditions, our framework augments the auction with a personalized entry fee for each bidder, which must be paid before the auction can be accessed. These entry fees depend only on the prior distribution of bidder types, and in particular are independent of realized bids. Our framework can be used with many common auction formats, such as simultaneous first-price, simultaneous second-price, and simultaneous all-pay auctions. If all-pay auctions are used, we prove that the resulting mechanism is credible in the sense that the auctioneer cannot benefit by deviating from the stated mechanism after observing agent bids. If second-price auctions are used, we obtain a truthful O(1)-approximate mechanism with fixed entry fees that are amenable to tuning via online learning techniques. Our results for first-price and all-pay are the first revenue guarantees of non-truthful mechanisms in multi-dimensional environments, an open question in the literature [RST17].
  5. An autonomous adaptive model predictive control (MPC) architecture is presented for control of heating, ventilation, and air conditioning (HVAC) systems to maintain indoor temperature while reducing energy use. Although equipment use and occupancy change with time, existing MPC methods are not capable of automatically relearning models and computing control decisions reliably for extended periods without intervention from a human expert. We seek to address this weakness. Two major features are embedded in the proposed architecture to enable autonomy: (i) a system identification algorithm from our prior work that periodically re-learns building dynamics and unmeasured internal heat loads from data without requiring re-tuning by experts, where the estimated model is guaranteed to be stable and has desirable physical properties irrespective of the data; and (ii) an MPC planner with a convex approximation of the original nonconvex problem, which uses a descent and convergent method, with the underlying optimization problem being feasible and convex. A yearlong simulation with a realistic plant shows that both features of the proposed architecture (periodic model and disturbance update, and convexification of the planning problem) are essential to get performance improvement over a commonly used baseline controller. Without these features, long-term energy savings from MPC can be small, while with them, the savings from MPC become substantial.