Title: 2025 PG&E Energy Analytics Challenge: Electric Load Forecasting Datasets
Summary: This repository contains the datasets used in the 2025 PG&E Energy Analytics Challenge on Electricity Load Forecasting, co-organized by the IISE Energy Systems Division and the INFORMS Quality, Statistics, and Reliability Section, and graciously sponsored by the Pacific Gas & Electric Company (PG&E).

The repository contains two files:
Train.xlsx: Contains three years of hourly electric loads from San Diego, California (Years 2020-2022), as well as exogenous weather information at five neighboring sites within the San Diego area.
Test.xlsx: Contains one year of hourly electric loads from San Diego, California (Year 2023), as well as exogenous weather information at five neighboring sites within the San Diego area.

Background: This competition aimed to predict electricity loads for a specific location within the CAISO system. Accurate load forecasting is critical for managing electricity distribution within California’s diverse and dynamic energy market. Load patterns can vary significantly due to factors such as weather conditions, local supply and demand, and the mix of nearby energy generation sources.

During the competition, the specific location and the actual years were not disclosed to the participants. Participants were asked to generate a year’s worth of load forecasts using historical load values and exogenous weather information. Predictor data could not be taken from future time periods: when predicting the load at a particular time, the model was allowed to use only predictor variable information from that time or before. For example, if a team is predicting the load for Day 5 Hour 3 in Year 3, the model can only take in predictor values from Day 5 Hour 3 in Year 3, or before. Furthermore, during the competition, the load information for the test set (highlighted in yellow in the test data file) was withheld from all participants, who were asked to submit their predictions for the full year.

This is the second offering of the PG&E Energy Analytics Challenge, complementing the first offering in 2024, which focused on electricity price forecasting [Results of the 2024 challenge: Aziz Ezzat, A., Mansouri, M., Yildirim, M., & Fang, X. (2025). IISE PG&E Energy Analytics Challenge 2024: Forecasting day-ahead electricity prices. IISE Transactions, 1–13. https://doi.org/10.1080/24725854.2024.2447049].
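The rule that predictors may come only from the forecast hour or earlier can be illustrated with a small sketch. This is not the organizers' code: the function name `causal_features`, the lag choices, and the use of a single hourly temperature series are assumptions made for illustration, and the actual column layout of Train.xlsx and Test.xlsx is not described here.

```python
from statistics import mean


def causal_features(temps, t, lags=(1, 24), window=24):
    """Build predictors for hour index t using only temps[0..t].

    `temps` is a hypothetical list of hourly temperature readings; the
    names and lag choices are illustrative assumptions.
    """
    if not 0 <= t < len(temps):
        raise IndexError("t is outside the series")
    # Everything after hour t is cut off here, so no later value can leak in.
    history = temps[: t + 1]
    feats = {"temp_now": history[-1]}
    for lag in lags:
        # Lagged value, or None when the series does not reach back that far.
        feats[f"temp_lag_{lag}"] = history[-1 - lag] if lag <= t else None
    # Trailing mean over at most `window` hours, ending at hour t.
    feats[f"temp_mean_{window}h"] = mean(history[-window:])
    return feats
```

Slicing the series at `t + 1` before computing anything is a simple way to make the constraint hard to violate by accident: changing any value after hour `t` leaves the returned features unchanged.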
Award ID(s):
2114425
PAR ID:
10646817
Publisher / Repository:
Zenodo
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hot temperatures drive excessive energy use for space-cooling in built environments. In a building, a system operator could save costs by making better decisions under the uncertainties associated with urban temperature and future energy demands. In this paper, we assess the impact of urban weather modeling on energy cost, using a value of information (VoI) analysis, in a day-ahead (DA) electricity market. To do that, we combine two probabilistic models: (a) a model for forecasting urban temperature and (b) a model for forecasting hourly net electric load of a building given ambient urban temperature. We then quantify the impact of better urban weather modeling by propagating the uncertainty from the temperature model to the load forecasting model. We perform a numerical case study on residential building prototypes located in the city of Pittsburgh. The result indicates that using a better weather model could save 4.34-8.22% of the electricity costs for space-cooling. 
  2. Electric load forecasting refers to forecasting the electricity demand at aggregated levels. Utilities use these predictions to keep electricity generation and consumption balanced at all times and to make accurate decisions for power system planning, operations, and maintenance. Based on the prediction time horizon, electric load forecasting is classified as very short-term, short-term, medium-term, or long-term. In this paper, a multiple-output Gaussian process with multiple kernel learning is proposed for short-term electric load forecasting (predicting 24 load values for the next day) based on load, temperature, and dew point values of previous days. Mean absolute percentage error (MAPE) is used as a measure of prediction accuracy. By comparing MAPE values of the proposed method with the persistence method, it can be seen that the proposed method improves the persistence method MAPE by up to 4%.
  3. Load forecasting plays a crucial role in many aspects of electric power systems, including their economic and social benefits. Many previous studies have approached load forecasting with time series methods, including weather-load relationships. In one such approach to predicting load, this paper investigates different structures that aim to relate various daily parameters. These parameters include the temperature, humidity, and solar radiation that make up the weather data. Along with natural phenomena such as weather, physical aspects such as traffic flow are also considered. Electricity consumption data is collected from the City of Tallahassee utilities. Traffic count is provided by the Florida Department of Transportation. The weather data is obtained from the Tallahassee Regional Airport weather station. This paper aims to study and establish a cause-and-effect relationship between the mentioned variables using different causality models and to forecast load based on the external variables. Based on the relationship, a prediction algorithm is applied to check whether prediction error decreases when such external factors are considered.
  4. Many Alaska communities rely on heating oil for heat and diesel fuel for electricity. For remote communities, fuel must be barged or flown in, leading to high costs. While renewable energy resources may be available, the variability of wind and solar energy limits the amount that can be used coincidentally without adequate storage. This study developed a decision-making method to evaluate beneficial matches between excess renewable generation and non-electric dispatchable loads, specifically heat loads such as space heating, water heating and treatment, and clothes drying in three partner communities. Hybrid Optimization Model for Multiple Electric Renewables (HOMER) Pro was used to model potential excess renewable generation based on current generation infrastructure, renewable resource data, and community load. The method then used these excess generation profiles to quantify how closely they align with modeled or actual heat loads, which have inherent thermal storage capacity. Of 236 possible combinations of solar and wind capacity investigated in the three communities, the best matches were seen between excess electricity from high-penetration wind generation and heat loads for clothes drying and space heating. The worst matches from this study were from low penetrations of solar (25% of peak load) with all heat loads. 
  5. Electricity markets are cleared by a two-stage, sequential process consisting of a forward (day-ahead) market and a spot (real-time) market. While their design goal is to achieve efficiency, the lack of sufficient competition introduces many opportunities for price manipulation. To discourage this phenomenon, some Independent System Operators (ISOs) mandate generators to submit (approximately) truthful bids in the day-ahead market. However, without fully accounting for all participants' incentives (generators and loads), the application of such a mandate may lead to unintended consequences. In this paper, we model and study the interactions of generators and inelastic loads in a two-stage settlement where generators are required to bid truthfully in the day-ahead market. We show that such a mandate, when accounting for generator and load incentives, leads to a generalized Stackelberg-Nash game where load decisions (leaders) are made in the day-ahead market and generator decisions (followers) are relegated to the real-time market. Furthermore, the use of conventional supply function bidding for generators in real time does not guarantee the existence of a Nash equilibrium. This motivates the use of intercept bidding as an alternative bidding mechanism for generators in the real-time market. An equilibrium analysis in this setting leads to a closed-form solution that unveils several insights. In particular, it shows that, unlike in standard two-stage markets, loads are the winners of the competition in the sense that their aggregate payments are less than those of the competitive equilibrium. Moreover, heterogeneity in generators' costs has the unintended effect of mitigating loads' market power. Numerical studies validate and further illustrate these insights.