
Title: Uncertainty Matters: Bayesian Probabilistic Forecasting for Residential Smart Meter Prediction, Segmentation, and Behavioral Measurement and Verification
As new grid edge technologies emerge—such as rooftop solar panels, battery storage, and controllable water heaters—quantifying the uncertainties of building load forecasts is becoming more critical. The recent adoption of smart meter infrastructures provided new granular data streams, largely unavailable just ten years ago, that can be utilized to better forecast building-level demand. This paper uses Bayesian Structural Time Series for probabilistic load forecasting at the residential building level to capture uncertainties in forecasting. We use sub-hourly electrical submeter data from 120 residential apartments in Singapore that were part of a behavioral intervention study. The proposed model addresses several fundamental limitations through its flexibility to handle univariate and multivariate scenarios, perform feature selection, and include either static or dynamic effects, as well as its inherent applicability for measurement and verification. We highlight the benefits of this process in three main application areas: (1) Probabilistic Load Forecasting for Apartment-Level Hourly Loads; (2) Submeter Load Forecasting and Segmentation; (3) Measurement and Verification for Behavioral Demand Response. Results show the model achieves a similar performance to ARIMA, another popular time series model, when predicting individual apartment loads, and superior performance when predicting aggregate loads. Furthermore, we show that the model robustly captures uncertainties in the forecasts while providing interpretable results, indicating the importance of, for example, temperature data in its predictions. Finally, our estimates for a behavioral demand response program indicate that it achieved energy savings; however, the confidence interval provided by the probabilistic model is wide. 
Overall, this probabilistic forecasting model accurately measures uncertainties in forecasts and provides interpretable results that can support building managers and policymakers with the goal of reducing energy use.
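The measurement and verification use case in the abstract can be illustrated with a minimal sketch. It assumes that counterfactual (no-intervention) load trajectories have already been sampled from some probabilistic forecaster. The function name `savings_interval`, the nearest-rank quantile rule, and the synthetic numbers in the demo are hypothetical and are not taken from the paper. The sketch only shows how an interval on savings follows from the spread of the forecast draws.

```python
import random
import statistics


def savings_interval(observed, counterfactual_draws, level=0.9):
    """Summarize energy savings as (counterfactual total - observed total).

    observed: metered loads (kWh) during the intervention period.
    counterfactual_draws: sampled baseline trajectories, each the same
        length as `observed` (assumed to come from a probabilistic model).
    Returns (median, lower, upper) of total savings in kWh, where the
    bounds are nearest-rank quantiles of the per-draw savings.
    """
    total_observed = sum(observed)
    # One savings value per sampled counterfactual trajectory.
    totals = sorted(sum(draw) - total_observed for draw in counterfactual_draws)
    tail = (1.0 - level) / 2.0
    last = len(totals) - 1
    lower = totals[int(tail * last)]
    upper = totals[int(round((1.0 - tail) * last))]
    return statistics.median(totals), lower, upper


if __name__ == "__main__":
    # Synthetic illustration only: 24 hourly readings and 500 noisy
    # baseline draws centred slightly above the observed load.
    rng = random.Random(0)
    observed = [1.0 + 0.3 * rng.random() for _ in range(24)]
    draws = [[x + rng.gauss(0.05, 0.2) for x in observed] for _ in range(500)]
    median, lower, upper = savings_interval(observed, draws)
    print(f"savings median={median:.2f} kWh, 90% interval=({lower:.2f}, {upper:.2f})")
```

A wide interval that still sits mostly above zero corresponds to the situation the abstract describes: savings are indicated, but their size is uncertain.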
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Many coastal cities are facing frequent flooding from storm events that are made worse by sea level rise and climate change. The groundwater table level in these low relief coastal cities is an important, but often overlooked, factor in the recurrent flooding these locations face. Infiltration of stormwater and water intrusion due to tidal forcing can cause already shallow groundwater tables to quickly rise toward the land surface. This decreases available storage which increases runoff, stormwater system loads, and flooding. Groundwater table forecasts, which could help inform the modeling and management of coastal flooding, are generally unavailable. This study explores two machine learning models, Long Short-term Memory (LSTM) networks and Recurrent Neural Networks (RNN), to model and forecast groundwater table response to storm events in the flood prone coastal city of Norfolk, Virginia. To determine the effect of training data type on model accuracy, two types of datasets (i) the continuous time series and (ii) a dataset of only storm events, created from observed groundwater table, rainfall, and sea level data from 2010–2018 are used to train and test the models. Additionally, a real-time groundwater table forecasting scenario was carried out to compare the models’ abilities to predict groundwater table levels given forecast rainfall and sea level as input data. When modeling the groundwater table with observed data, LSTM networks were found to have more predictive skill than RNNs (root mean squared error (RMSE) of 0.09 m versus 0.14 m, respectively). The real-time forecast scenario showed that models trained only on storm event data outperformed models trained on the continuous time series data (RMSE of 0.07 m versus 0.66 m, respectively) and that LSTM outperformed RNN models. 
Because models trained with the continuous time series data had much higher RMSE values, they were not suitable for predicting the groundwater table in the real-time scenario when using forecast input data. These results demonstrate the first use of LSTM networks to create hourly forecasts of groundwater table in a coastal city and show they are well suited for creating operational forecasts in real-time. As groundwater table levels increase due to sea level rise, forecasts of groundwater table will become an increasingly valuable part of coastal flood modeling and management. 
  2. Hot temperatures drive excessive energy use for space-cooling in built environments. In a building, a system operator could save costs by making better decisions under the uncertainties associated with urban temperature and future energy demands. In this paper, we assess the impact of urban weather modeling on energy cost, using a value of information (VoI) analysis, in a day-ahead (DA) electricity market. To do so, we combine two probabilistic models: (a) a model for forecasting urban temperature and (b) a model for forecasting the hourly net electric load of a building given ambient urban temperature. We then quantify the impact of better urban weather modeling by propagating the uncertainty from the temperature model to the load forecasting model. We perform a numerical case study on residential building prototypes located in the city of Pittsburgh. The results indicate that using a better weather model could save 4.34–8.22% of the electricity costs for space-cooling. 
  3. Short-term probabilistic forecasts of the trajectory of the COVID-19 pandemic in the United States have served as a visible and important communication channel between the scientific modeling community and both the general public and decision-makers. Forecasting models provide specific, quantitative, and evaluable predictions that inform short-term decisions such as healthcare staffing needs, school closures, and allocation of medical supplies. Starting in April 2020, the US COVID-19 Forecast Hub collected, disseminated, and synthesized tens of millions of specific predictions from more than 90 different academic, industry, and independent research groups. A multimodel ensemble forecast that combined predictions from dozens of groups every week provided the most consistently accurate probabilistic forecasts of incident deaths due to COVID-19 at the state and national level from April 2020 through October 2021. The performance of 27 individual models that submitted complete forecasts of COVID-19 deaths consistently throughout this year showed high variability in forecast skill across time, geospatial units, and forecast horizons. Two-thirds of the models evaluated showed better accuracy than a naïve baseline model. Forecast accuracy degraded as models made predictions further into the future, with probabilistic error at a 20-wk horizon three to five times larger than when predicting at a 1-wk horizon. This project underscores the role that collaboration and active coordination between governmental public-health agencies, academic modeling teams, and industry partners can play in developing modern modeling capabilities to support local, state, and federal response to outbreaks. 
  4. This study evaluates simulated radiance forecasts from a series of controlled experiments consisting of FV3‐LAM forecasts with different configurations of model physics and vertical resolution. The forecasts were produced during the 2020 Hazardous Weather Testbed Spring Forecasting Experiments on the same forecast cases. The evaluation includes grid‐point, neighborhood‐based and object‐based verification. The experiments include forecasts that were identical except for the physics (EMC‐LAM vs. EMC‐LAMx), vertical resolution (EMC‐LAMx vs. NSSL‐LAM), or combined initial conditions, physics and vertical resolution (GSL‐LAM). It is found that the EMC‐LAM generally provided better simulated radiance forecasts than the other three configurations at most forecast lead times, due to its unique physics configuration. All configurations generally over‐forecasted high level clouds. EMC‐LAM reduced the over‐forecasting of high clouds, but also under‐forecasted the coverage of mid‐level clouds. In contrast, at early lead times the EMC‐LAM had relatively poor performance relative to the other forecasts. Furthermore, EMC‐LAM was an outlier in terms of the vertical structure of clouds. It is also found that the NSSL‐LAM consistently improved upon the EMC‐LAMx, which had fewer vertical levels than NSSL‐LAM. Compared to EMC‐LAMx, NSSL‐LAM had less cloud over‐forecasting bias, especially with small cloud objects, and less overall error. The differences between EMC‐LAMx and GSL‐LAM were generally much smaller than the differences between EMC‐LAMx and EMC‐LAM/NSSL‐LAM. Finally, it is found that a non‐linear bias correction conditioned on symmetric brightness temperature reduced the overall root‐mean‐square error by about a factor of 2 while improving the unrealistic vertical structure of clouds in the EMC‐LAM.

  5. Antarctic sea ice prediction has garnered increasing attention in recent years, particularly in the context of the recent record lows of February 2022 and 2023. As Antarctica becomes a climate change hotspot, as polar tourism booms, and as scientific expeditions continue to explore this remote continent, the capacity to anticipate sea ice conditions weeks to months in advance is in increasing demand. Spurred by recent studies that uncovered physical mechanisms of Antarctic sea ice predictability and by the intriguing large variations of the observed sea ice extent in recent years, the Sea Ice Prediction Network South (SIPN South) project was initiated in 2017, building upon the Arctic Sea Ice Prediction Network. The SIPN South project annually coordinates spring-to-summer predictions of Antarctic sea ice conditions, to allow robust evaluation and intercomparison, and to guide future development in polar prediction systems. In this paper, we present and discuss the initial SIPN South results collected over six summer seasons (December-February 2017-2018 to 2022-2023). We use data from 22 unique contributors spanning five continents that have together delivered more than 3000 individual forecasts of sea ice area and concentration. The SIPN South median forecast of the circumpolar sea ice area captures the sign of the recent negative anomalies, and the verifying observations are systematically included in the 10-90% range of the forecast distribution. These statements also hold at the regional level except in the Ross Sea where the systematic biases and the ensemble spread are the largest. A notable finding is that the group forecast, constructed by aggregating the data provided by each contributor, outperforms most of the individual forecasts, both at the circumpolar and regional levels. This indicates the value of combining predictions to average out model-specific errors. 
Finally, we find that dynamical model predictions (i.e., based on process-based general circulation models) generally perform worse than statistical model predictions (i.e., data-driven empirical models including machine learning) in representing the regional variability of sea ice concentration in summer. SIPN South is a collaborative community project that is hosted on a shared public repository. The forecast and verification data used in SIPN South are publicly available in near-real time for further use by the polar research community, and eventually, policymakers. 