
Title: Smart-meter big data for load forecasting: An alternative approach to clustering
Accurate forecasting of electricity demand is vital to the resilient management of energy systems. Recent efforts to harness smart-meter data for improved forecasting accuracy have centered primarily on cluster-based approaches (CBAs), in which smart-meter data are grouped into a small number of clusters, a separate prediction model is developed for each cluster, and the cluster-level predictions are aggregated to compute the total demand. CBAs have provided promising results compared to conventional approaches, which are generally not conducive to integrating smart-meter data. However, CBAs are computationally costly and suffer from the curse of dimensionality, especially in scenarios involving smart-meter data from millions of customers. In this work, we propose an efficient reduced model approach (RMA) that leverages a novel hierarchical dimension reduction algorithm to enable the integration of fine-resolution, high-dimensional smart-meter data from millions of customers into load prediction. We demonstrate the applicability of the proposed approach using data from a utility company based in Illinois, United States, with more than 3.7 million customers, and present model performance in terms of forecast accuracy. The proposed hierarchical dimension reduction approach enables utilizing high-resolution smart-meter data in a scalable manner that is not otherwise exploitable. The results show significant improvements in forecast accuracy compared to available approaches that either do not harness fine-resolution data or do not scale to large-scale smart-meter big data.
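The core idea of a reduced model approach, compressing many customer-level load series into a few latent series before forecasting, can be sketched with a plain truncated SVD standing in for the paper's hierarchical dimension reduction algorithm. All data, dimensions, and the rank r below are synthetic and illustrative, not the paper's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_customers, n_hours, r = 500, 168, 5      # one week of hourly smart-meter data

# Synthetic panel: each customer mixes a few shared daily patterns, plus noise.
t = np.arange(n_hours)
patterns = np.stack([np.sin(2 * np.pi * t / 24 + phase) for phase in (0.0, 1.0, 2.0)])
X = rng.random((n_customers, 3)) @ patterns \
    + 0.1 * rng.standard_normal((n_customers, n_hours))

# Reduced model: a truncated SVD replaces n_customers series with r latent ones.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Z = np.diag(s[:r]) @ Vt[:r]                # r latent load series
loading = U[:, :r].sum(axis=0)             # aggregate loadings across customers

agg_reduced = loading @ Z                  # aggregate demand from reduced space
agg_full = X.sum(axis=0)                   # ground-truth aggregate
rel_err = np.linalg.norm(agg_full - agg_reduced) / np.linalg.norm(agg_full)
```

Forecasting models would then be fit to the r latent series in Z rather than to hundreds of thousands of individual meters, which is what makes this family of approaches scalable.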
Award ID(s):
1826161 1826155
Publication Date:
Journal Name:
IEEE Access
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Forecasting the El Niño-Southern Oscillation (ENSO) has been a subject of vigorous research due to the important role of the phenomenon in climate dynamics and its worldwide socioeconomic impacts. Over the past decades, numerous models for ENSO prediction have been developed, among which statistical models approximating ENSO evolution by linear dynamics have received significant attention owing to their simplicity and comparable forecast skill to first-principles models at short lead times. Yet, due to highly nonlinear and chaotic dynamics (particularly during ENSO initiation), such models have limited skill for longer-term forecasts beyond half a year. To resolve this limitation, here we employ a new nonparametric statistical approach based on analog forecasting, called kernel analog forecasting (KAF), which avoids assumptions on the underlying dynamics through the use of nonlinear kernel methods for machine learning and dimension reduction of high-dimensional datasets. Through a rigorous connection with Koopman operator theory for dynamical systems, KAF yields statistically optimal predictions of future ENSO states as conditional expectations, given noisy and potentially incomplete data at forecast initialization. Here, using industrial-era Indo-Pacific sea surface temperature (SST) as training data, the method is shown to successfully predict the Niño 3.4 index in a 1998–2017 verification period out to a 10-month lead, which corresponds to an increase of 3–8 months (depending on the decade) over a benchmark linear inverse model (LIM), while significantly improving upon the ENSO predictability “spring barrier”. In particular, KAF successfully predicts the historic 2015/16 El Niño at initialization times as early as June 2015, which is comparable to the skill of current dynamical models.
An analysis of a 1300-yr control integration of a comprehensive climate model (CCSM4) further demonstrates that the enhanced predictability afforded by KAF holds over potentially much longer leads, extending to 24 months versus 18 months in the benchmark LIM. Probabilistic forecasts for the occurrence of El Niño/La Niña events are also performed and assessed via information-theoretic metrics, showing an improvement of skill over LIM approaches, thus opening an avenue for environmental risk assessment relevant in a variety of contexts.
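At its simplest, analog forecasting predicts the future of the current state as a weighted average of the futures of similar past states. The sketch below is a minimal kernel analog forecast on a toy periodic series; it omits the Koopman-operator machinery of KAF proper, and the window length, bandwidth eps, and the series itself are illustrative assumptions.

```python
import numpy as np

def kaf_forecast(history, query_state, lead, delay=4, eps=0.05):
    """Minimal kernel analog forecast: a Gaussian-kernel-weighted average
    of the lead-time futures of past delay-embedded states."""
    n = len(history)
    idx = np.arange(delay - 1, n - lead)
    states = np.stack([history[i - delay + 1:i + 1] for i in idx])
    futures = history[idx + lead]
    d2 = ((states - query_state) ** 2).sum(axis=1)   # distance to each analog
    w = np.exp(-d2 / eps)
    w /= w.sum()
    return w @ futures            # conditional-expectation-style estimate

# Toy periodic "climate index" with period 50; train on the first 300 steps.
t = np.arange(400)
series = np.sin(2 * np.pi * t / 50)
query = series[297:301]           # delay-embedded state at "now" (t = 300)
pred = kaf_forecast(series[:300], query, lead=10)
true = series[310]
```

Because exact analogs of the query recur every period, the kernel concentrates weight on them and the forecast tracks the true future closely on this toy series.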

  2. As new grid edge technologies such as rooftop solar panels, battery storage, and controllable water heaters emerge, quantifying the uncertainties of building load forecasts is becoming more critical. The recent adoption of smart meter infrastructures provided new granular data streams, largely unavailable just ten years ago, that can be utilized to better forecast building-level demand. This paper uses Bayesian Structural Time Series for probabilistic load forecasting at the residential building level to capture uncertainties in forecasting. We use sub-hourly electrical submeter data from 120 residential apartments in Singapore that were part of a behavioral intervention study. The proposed model addresses several fundamental limitations through its flexibility to handle univariate and multivariate scenarios, perform feature selection, and include either static or dynamic effects, as well as its inherent applicability to measurement and verification. We highlight the benefits of this process in three main application areas: (1) probabilistic load forecasting for apartment-level hourly loads; (2) submeter load forecasting and segmentation; (3) measurement and verification for behavioral demand response. Results show the model achieves performance similar to ARIMA, another popular time series model, when predicting individual apartment loads, and superior performance when predicting aggregate loads. Furthermore, we show that the model robustly captures uncertainties in the forecasts while providing interpretable results, indicating the importance of, for example, temperature data in its predictions. Finally, our estimates for a behavioral demand response program indicate that it achieved energy savings; however, the confidence interval provided by the probabilistic model is wide.
Overall, this probabilistic forecasting model accurately measures uncertainties in forecasts and provides interpretable results that can support building managers and policymakers with the goal of reducing energy use.
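Structural time series models decompose a series into interpretable components (level, trend, seasonality, regression effects) whose state-space form yields forecast distributions rather than point estimates. A minimal numpy Kalman filter for the simplest such component, a local level (random walk) observed in noise, shows where the probabilistic forecast interval comes from; the noise variances and data below are synthetic assumptions, not the paper's model.

```python
import numpy as np

def local_level_filter(y, q, r):
    """Kalman filter for the local-level structural model:
    y_t = mu_t + eps_t (obs. variance r),  mu_t = mu_{t-1} + eta_t (level variance q)."""
    m, p = y[0], 1.0                 # rough diffuse prior on the initial level
    for obs in y[1:]:
        p = p + q                    # predict: level uncertainty grows
        k = p / (p + r)              # Kalman gain
        m = m + k * (obs - m)        # update the level estimate
        p = (1.0 - k) * p
    return m, p

# Synthetic apartment load: slowly drifting level observed with noise.
rng = np.random.default_rng(1)
level = 10.0 + np.cumsum(0.05 * rng.standard_normal(200))
y = level + 0.5 * rng.standard_normal(200)

m, p = local_level_filter(y, q=0.05**2, r=0.5**2)
# One-step-ahead forecast distribution: N(m, p + q + r) -> 95% interval.
var = p + 0.05**2 + 0.5**2
lo, hi = m - 1.96 * var**0.5, m + 1.96 * var**0.5
```

The interval (lo, hi) is the probabilistic forecast; full BSTS adds more components and Bayesian priors, but the filtering logic is the same.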
  3. With the acceleration of ICT technologies and the Internet of Things (IoT) paradigm, smart residential environments, also known as smart homes, are becoming increasingly common. These environments have significant potential for the development of intelligent energy management systems and have therefore attracted significant attention from both academia and industry. An enabling building block for these systems is the ability to obtain energy consumption at the appliance level. This information is usually inferred from electric signal data (e.g., current) collected by a smart meter or a smart outlet, a problem known as appliance recognition. Several previous approaches to appliance recognition have proposed load disaggregation techniques for smart meter data. However, these approaches are often very inaccurate for low-consumption and multi-state appliances. Recently, Machine Learning (ML) techniques have been proposed for appliance recognition. These approaches are mainly based on passive ML and thus require pre-labeled data for training. This makes such approaches unable to rapidly adapt to the constantly changing availability and heterogeneity of appliances on the market. In a home setting, it is natural to consider involving users in the labeling process as appliances' electric signatures are collected. This type of learning falls into the category of Stream-based Active Learning (SAL). SAL has mainly been investigated assuming the presence of an expert, always available and willing to label the collected samples. Nevertheless, a home user may lack such availability and, in general, presents more erratic and user-dependent behavior. In this paper, we develop a SAL algorithm, called K-Active-Neighbors (KAN), for the problem of household appliance recognition. Differently from previous approaches, KAN jointly learns the user behavior and the appliance signatures.
KAN dynamically adjusts the querying strategy to increase accuracy by considering the user availability as well as the quality of the collected signatures. Such quality is defined as a combination of informativeness, representativeness, and a confidence score of the signature compared to the current knowledge. To test KAN against state-of-the-art approaches, we use real appliance data collected by a low-cost Arduino-based smart outlet as well as the ECO smart home dataset. Furthermore, we use a real dataset to model user behavior. Results show that KAN is able to achieve high accuracy with minimal data, i.e., signatures of short length collected at low frequency.
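A stream-based active learner must decide, sample by sample, whether to spend a user query. A toy version of such a querying rule, which (like KAN) gates queries on both sample ambiguity and user availability, might look as follows; the margin rule, nearest-centroid classifier, and appliance values are all illustrative assumptions, not the actual KAN algorithm.

```python
import numpy as np

def should_query(signature, centroids, user_available, margin=0.5):
    """Simplified stream-based active-learning rule in the spirit of KAN:
    query the user only when the sample is ambiguous (the two closest
    known signatures are at similar distances) AND the user is available."""
    if not centroids:
        return user_available                    # nothing labeled yet: must ask
    d = sorted(float(np.linalg.norm(signature - c)) for c in centroids.values())
    ambiguous = len(d) < 2 or (d[1] - d[0]) < margin * d[0]
    return user_available and ambiguous

# Hypothetical appliance "signatures": mean power draw (W) over a window.
centroids = {"kettle": np.array([2000.0]), "fridge": np.array([150.0])}
clear_case = should_query(np.array([1990.0]), centroids, user_available=True)
unclear_case = should_query(np.array([1100.0]), centroids, user_available=True)
busy_user = should_query(np.array([1100.0]), centroids, user_available=False)
```

The clear kettle-like sample is classified silently, while the ambiguous one triggers a query only when the user is available, which is the trade-off KAN tunes adaptively.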
  4. According to the National Academies, a week-long forecast of the velocity, vertical structure, and duration of the Loop Current (LC) and its eddies at a given location is a critical step toward understanding their effects on the gulf ecosystems, as well as toward anticipating and mitigating the outcomes of anthropogenic and natural disasters in the Gulf of Mexico (GoM). However, creating such a forecast has remained a challenging problem, since LC behavior is dominated by dynamic processes across multiple time and spatial scales that are not resolved at once by conventional numerical models. In this paper, building on the foundation of spatiotemporal predictive learning in video prediction, we develop a physics-informed, deep-learning-based prediction model called Physics-informed Tensor-train ConvLSTM (PITT-ConvLSTM) for forecasting 3D geo-spatiotemporal sequences. Specifically, we propose (1) a novel 4D higher-order recurrent neural network with empirical orthogonal function analysis to capture the hidden uncorrelated patterns of each hierarchy, (2) a convolutional tensor-train decomposition to capture higher-order space-time correlations, and (3) a mechanism that incorporates prior physics from domain experts by informing the learning in latent space. The advantage of our proposed approach is clear: constrained by the laws of physics, the prediction model simultaneously learns good representations for frame dependencies (both short-term and long-term high-level dependencies) and inter-hierarchical relations within each time frame. Experiments on geo-spatiotemporal data collected from the GoM demonstrate that the PITT-ConvLSTM model can successfully forecast the volumetric velocity of the LC and its eddies for a period of greater than one week.
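The tensor-train idea underlying such models is that a high-order tensor can be stored as a chain of small 3-way cores. A minimal TT-SVD in numpy (applied to a small dense tensor, not the convolutional tensor-train used in PITT-ConvLSTM) illustrates the compression; shapes and ranks below are arbitrary toy choices.

```python
import numpy as np

def tt_decompose(tensor, max_rank):
    """TT-SVD: factor a d-way tensor into a 'train' of 3-way cores
    via successive truncated SVDs of its unfoldings."""
    shape = tensor.shape
    cores, r, mat = [], 1, np.asarray(tensor, dtype=float)
    for n in shape[:-1]:
        U, s, Vt = np.linalg.svd(mat.reshape(r * n, -1), full_matrices=False)
        r_new = min(max_rank, len(s))
        cores.append(U[:, :r_new].reshape(r, n, r_new))
        mat = np.diag(s[:r_new]) @ Vt[:r_new]
        r = r_new
    cores.append(mat.reshape(r, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the chain of cores back into the full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])

# A 4x5x6 tensor with TT-rank 2 is stored with 40 numbers instead of 120.
rng = np.random.default_rng(2)
T = tt_reconstruct([rng.standard_normal((1, 4, 2)),
                    rng.standard_normal((2, 5, 2)),
                    rng.standard_normal((2, 6, 1))])
cores = tt_decompose(T, max_rank=2)
err = np.linalg.norm(tt_reconstruct(cores) - T) / np.linalg.norm(T)
```

In a recurrent network the same factorization is applied to weight tensors, which is what keeps higher-order space-time models tractable.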
  5. The ensemble Kalman filter (EnKF) is a popular technique for data assimilation in high-dimensional nonlinear state-space models. The EnKF represents distributions of interest by an ensemble, which is a form of dimension reduction that enables straightforward forecasting even for complicated and expensive evolution operators. However, the EnKF update step involves estimation of the forecast covariance matrix based on the (often small) ensemble, which requires regularization. Many existing regularization techniques rely on spatial localization, which may ignore long-range dependence. Instead, our proposed approach assumes a sparse Cholesky factor of the inverse covariance matrix, and the nonzero Cholesky entries are further regularized. The resulting method is highly flexible and computationally scalable. In our numerical experiments, our approach was more accurate and less sensitive to misspecification of tuning parameters than tapering-based localization.
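The update step this abstract refers to can be sketched as a stochastic EnKF analysis using the plain sample covariance; the paper's contribution is to regularize that covariance through a sparse Cholesky factor of its inverse, which the toy code below deliberately does not do. Dimensions, noise levels, and the biased forecast ensemble are illustrative.

```python
import numpy as np

def enkf_update(ensemble, H, R, y, rng):
    """Stochastic EnKF analysis step: nudge each forecast member toward
    a perturbed copy of the observation, weighted by the Kalman gain."""
    n_obs, n_ens = len(y), ensemble.shape[1]
    Pf = np.cov(ensemble)            # sample forecast covariance: the matrix
                                     # that needs regularization for small n_ens
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    perturbed = y[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, n_ens).T
    return ensemble + K @ (perturbed - H @ ensemble)

# Toy example: 3-dim state, observe only the first component.
rng = np.random.default_rng(3)
truth = np.array([1.0, 0.0, -1.0])
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.01]])
ens = truth[:, None] + rng.standard_normal((3, 50)) + 2.0   # biased forecast ensemble
y = H @ truth
post = enkf_update(ens, H, R, y, rng)
prior_err = abs(ens[0].mean() - truth[0])
post_err = abs(post[0].mean() - truth[0])
```

Even this unregularized version pulls the biased observed component back toward the truth; with few members and high-dimensional states, the raw np.cov estimate degrades, which is what localization or the sparse-Cholesky approach addresses.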