Title: See the Light: Modeling Solar Performance using Multispectral Satellite Data
Developing accurate solar performance models, which infer solar output from widely available external data sources, is increasingly important as the grid's solar capacity rises. These models are important for a wide range of solar analytics, including solar forecasting, resource estimation, and fault detection. The most significant error in existing models is inaccurate estimates of clouds' effect on solar output, as cloud formations and their impact on solar radiation are highly complex. In 2018 and 2019, respectively, the National Oceanic and Atmospheric Administration (NOAA) in the U.S. began releasing multispectral data comprising 16 different light wavelengths (or channels) from the GOES-16 and GOES-17 satellites every 5 minutes. Enough channel data is now available to learn solar performance models using machine learning (ML). In this paper, we show how to develop both local and global solar performance models using ML on multispectral data, and compare their accuracy to existing physical models based on ground-level weather readings and on NOAA's estimates of downward shortwave radiation (DSR), which also derive from multispectral data but using a physical model. We show that ML-based solar performance models based on multispectral data are much more accurate than weather- or DSR-based models, improving the average MAPE across 29 solar sites by over 50% for local models and 25% for global models.
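The core modeling idea above, learning a mapping from the 16 multispectral channel readings at a site to its solar output, can be illustrated with a minimal sketch. A linear least-squares fit stands in for the paper's ML models, and the data below is synthetic and purely illustrative; the feature count (16 channels) is the only detail taken from the abstract.

```python
import numpy as np

# Minimal sketch: fit a local solar performance model mapping the 16
# GOES-16/17 multispectral channel readings to solar output. A linear
# least-squares fit stands in for the paper's ML models; all data here
# is synthetic and illustrative only.
rng = np.random.default_rng(0)
n_samples, n_channels = 500, 16

X = rng.uniform(0.0, 1.0, size=(n_samples, n_channels))     # channel radiances
true_w = np.abs(rng.normal(size=n_channels)) + 0.1          # unknown mapping (positive)
y = X @ true_w + rng.normal(scale=0.01, size=n_samples)     # solar output (kW)

# Fit channel weights by ordinary least squares.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Evaluate with mean absolute percentage error (MAPE), the paper's metric.
pred = X @ w
mape = float(np.mean(np.abs((y - pred) / y)) * 100)
```

A real local model would be trained per site on observed generation, and a global model across many sites, typically with a nonlinear learner rather than this linear stand-in.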
Award ID(s):
1645952
NSF-PAR ID:
10298239
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (BuildSys)
Page Range / eLocation ID:
1 to 10
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Developing accurate solar performance models, which infer solar power output in real time based on the current environmental conditions, is an important prerequisite for many advanced energy analytics. Recent work has developed sophisticated data-driven techniques that generate customized models for complex rooftop solar sites by combining well-known physical models with both system and public weather station data. However, inferring solar generation from public weather station data has two drawbacks: not all solar sites are near a public weather station, and public weather data generally quantifies cloud cover, the most significant weather metric affecting solar output, using highly coarse and imprecise measurements. In this paper, we develop and evaluate solar performance models that use satellite-based estimates of downward shortwave (solar) radiation (DSR) at the Earth's surface, which NOAA began publicly releasing after the launch of the GOES-R geostationary satellites in 2017. Unlike public weather data, DSR estimates are available for every 0.5 km² area. As we show, the accuracy of solar performance modeling using satellite data and public weather station data depends on the cloud conditions, with DSR-based modeling being more accurate under clear skies and station-based modeling being more accurate under overcast skies. Surprisingly, our results show that, overall, pure satellite-based modeling yields accuracy similar to pure station-based modeling, although the relationship is a function of conditions and the local climate. We also show that a hybrid approach that combines the best of both can modestly improve accuracy.
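The hybrid idea described above, preferring the DSR-based estimate under clear skies and the station-based estimate under overcast skies, can be sketched as a simple selection rule. The cloud-fraction threshold and function name below are assumptions for illustration, not the paper's actual combination method.

```python
def hybrid_solar_estimate(dsr_estimate, station_estimate, cloud_fraction,
                          clear_threshold=0.3):
    """Hypothetical sketch of the hybrid approach: use the satellite
    DSR-based model under clear skies and the weather-station-based
    model under overcast skies. The 0.3 threshold is an assumption."""
    if cloud_fraction <= clear_threshold:
        return dsr_estimate       # clear sky: DSR-based modeling is more accurate
    return station_estimate       # overcast: station-based modeling is more accurate
```

A weighted blend of the two estimates, rather than a hard switch, is an equally plausible reading of "combines the best of both approaches."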
  2. Solar energy is now the cheapest form of electricity in history. Unfortunately, significantly increasing the electric grid's fraction of solar energy remains challenging due to its variability, which makes balancing electricity's supply and demand more difficult. While thermal generators' ramp rate---the maximum rate at which they can change their energy generation---is finite, solar energy's ramp rate is essentially infinite. Thus, accurate near-term solar forecasting, or nowcasting, is important to provide advance warnings to adjust thermal generator output in response to variations in solar generation to ensure a balanced supply and demand. To address the problem, this paper develops a general model for solar nowcasting from abundant and readily available multispectral satellite data using self-supervised learning. Specifically, we develop deep auto-regressive models using convolutional neural networks (CNN) and long short-term memory networks (LSTM) that are globally trained across multiple locations to predict raw future observations of the spatio-temporal spectral data collected by the recently launched GOES-R series of satellites. Our model estimates a location's near-term future solar irradiance based on satellite observations, which we feed to a regression model trained on smaller site-specific solar data to provide near-term solar photovoltaic (PV) forecasts that account for site-specific characteristics. We evaluate our approach for different coverage areas and forecast horizons across 25 solar sites and show that it yields errors close to those of a model using ground-truth observations.
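The auto-regressive setup above, predicting the next observation from a window of past observations, can be illustrated in miniature. A linear AR(k) model fit by least squares stands in for the paper's CNN-LSTM, and the sine-plus-noise series below is a synthetic stand-in for a diurnal irradiance signal; the window size k is an assumption.

```python
import numpy as np

# Sketch of the auto-regressive nowcasting idea: predict the next
# observation from the previous k observations. A linear AR(k) model
# stands in for the paper's CNN-LSTM; the series is synthetic.
rng = np.random.default_rng(1)
t = np.arange(600)
series = np.sin(2 * np.pi * t / 144) + 0.05 * rng.normal(size=t.size)

k = 8  # look-back window (assumed)
X = np.stack([series[i:i + k] for i in range(len(series) - k)])
y = series[k:]

# Fit AR coefficients by least squares, then nowcast one step ahead.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
next_obs = float(X[-1] @ coef)
```

The paper's model instead predicts full spatio-temporal spectral frames and feeds the resulting irradiance estimate to a site-specific regression model; this sketch only shows the one-step auto-regressive structure.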
  3. Abstract

    Machine learning (ML) has been applied to space weather problems with increasing frequency in recent years, driven by an influx of in-situ measurements and a desire to improve modeling and forecasting capabilities throughout the field. Space weather originates from solar perturbations and comprises the complex variations these perturbations cause within the numerous systems between the Sun and Earth. These systems are often tightly coupled and not well understood. This creates a need for skillful models with knowledge about the confidence of their predictions. One example of such a dynamical system highly impacted by space weather is the thermosphere, the neutral region of Earth's upper atmosphere. Our inability to forecast it has severe repercussions in the context of satellite drag and computation of probability of collision between two space objects in low Earth orbit (LEO) for decision making in space operations. Even with (assumed) perfect forecasts of model drivers, our incomplete knowledge of the system results in often inaccurate thermospheric neutral mass density predictions. Continuing efforts are being made to improve model accuracy, but density models rarely provide estimates of confidence in predictions. In this work, we propose two techniques to develop nonlinear ML regression models to predict thermospheric density while providing robust and reliable uncertainty estimates: Monte Carlo (MC) dropout and direct prediction of the probability distribution, both using the negative logarithm of predictive density (NLPD) loss function. We show the performance capabilities for models trained on both local and global datasets. We show that the NLPD loss provides similar results for both techniques but that the direct probability distribution prediction method has a much lower computational cost.
For the global model regressed on the Space Environment Technologies High Accuracy Satellite Drag Model (HASDM) density database, we achieve errors of approximately 11% on independent test data with well-calibrated uncertainty estimates. Using an in-situ CHAllenging Minisatellite Payload (CHAMP) density dataset, models developed using both techniques provide test error on the order of 13%. The CHAMP models—on validation and test data—are within 2% of perfect calibration for the twenty prediction intervals tested. We show that this model can also be used to obtain global density predictions with uncertainties at a given epoch.
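The NLPD loss named above can be written out for the common case of a Gaussian predictive distribution, where the model outputs a mean and a standard deviation per sample. The Gaussian assumption and the function below are illustrative; the abstract does not specify the predictive distribution's form.

```python
import numpy as np

def gaussian_nlpd(y, mu, sigma):
    """Negative logarithm of predictive density (NLPD) loss, sketched
    under the assumption of a Gaussian predictive distribution: the
    model emits a mean mu and standard deviation sigma per sample.
    Overconfident wrong predictions (small sigma, large error) are
    penalized heavily, which encourages calibrated uncertainty."""
    y, mu, sigma = np.asarray(y, float), np.asarray(mu, float), np.asarray(sigma, float)
    var = sigma ** 2
    return float(np.mean(0.5 * np.log(2 * np.pi * var) + (y - mu) ** 2 / (2 * var)))
```

Minimizing this loss trains the network to predict both an accurate mean and a variance that matches its actual error, which is what makes the calibration results above (within 2% of perfect across twenty prediction intervals) possible to assess.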

     
  4.
    Abstract Measuring soil health indicators (SHIs), particularly soil total nitrogen (TN), is an important and challenging task that affects farmers' decisions on the timing, placement, and quantity of fertilizers applied in the farms. Most existing methods to measure SHIs are in-lab wet chemistry or spectroscopy-based methods, which require significant human input and effort, are time-consuming and costly, and are low-throughput in nature. To address this challenge, we develop an artificial intelligence (AI)-driven near real-time unmanned aerial vehicle (UAV)-based multispectral sensing solution (UMS) to estimate soil TN in an agricultural farm. TN is an important macro-nutrient and SHI that directly affects crop health. Accurate prediction of soil TN can significantly increase crop yield through informed decision making on the timing of seed planting, and fertilizer quantity and timing. The ground-truth data required to train the AI approaches is generated via laser-induced breakdown spectroscopy (LIBS), which can be readily used to characterize soil samples, providing rapid chemical analysis of the samples and their constituents (e.g., nitrogen, potassium, phosphorus, calcium). Although LIBS was previously applied for soil nutrient detection, there is no existing study on the integration of LIBS with UAV multispectral imaging and AI. We train two machine learning (ML) models, multi-layer perceptron regression and support vector regression, to predict soil nitrogen using a suite of data classes including multispectral characteristics of the soil and crops in the red (R), near-infrared, and green (G) spectral bands, computed vegetation indices (e.g., NDVI), and environmental variables including air temperature and relative humidity (RH).
    To generate the ground-truth data, or the training data for the machine learning models, we determine the N spectrum of the soil samples (collected from a farm) using LIBS and develop a calibration model using the correlation between the actual TN of the soil samples and the maximum intensity of the N spectrum. In addition, we extract the features from the multispectral images captured while the UAV follows an autonomous flight plan, at different growth stages of the crops. The ML models' performance is tested on a fixed configuration space for the hyper-parameters using various hyper-parameter optimization techniques at three different wavelengths of the N spectrum.
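One of the computed features named above, the normalized difference vegetation index (NDVI), has a standard closed form from the near-infrared and red band reflectances, and can be sketched directly. The `eps` guard against division by zero is an addition here, not something stated in the abstract.

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index, one of the vegetation
    indices computed from the UAV's multispectral bands and fed to the
    ML models: NDVI = (NIR - R) / (NIR + R). The eps term guarding
    against division by zero is an assumption added for robustness."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so higher NDVI values (toward 1) indicate denser, healthier canopy, which is why it is a useful input alongside the raw band reflectances.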
  5. Supervised Machine Learning (ML) models for solar flare prediction rely on accurate labels for a given input data set, commonly obtained from the GOES/XRS X-ray flare catalog. With increasing interest in utilizing ultraviolet (UV) and extreme ultraviolet (EUV) image data as input to these models, we seek to understand whether flaring activity can be defined and quantified using EUV data alone. This would allow us to move away from the GOES single-pixel measurement definition of flares and use the same data we use for flare prediction for label creation. In this work, we present a Solar Dynamics Observatory (SDO) Atmospheric Imaging Assembly (AIA)-based flare catalog covering flares of GOES X-ray magnitudes C, M, and X from 2010 to 2017. We use active region (AR) cutouts of full-disk AIA images to match the corresponding SDO/Helioseismic and Magnetic Imager (HMI) SHARPs (Space weather HMI Active Region Patches) that have been extensively used in ML flare prediction studies, thus allowing for labeling of AR number as well as flare magnitude and timing. Flare start, peak, and end times are defined using a peak-finding algorithm on AIA time series data obtained by summing the intensity across the AIA cutouts. An extremely randomized trees (ERT) regression model is used to map SDO/AIA flare magnitudes to GOES X-ray magnitude, achieving a low-variance regression. We find an accurate overlap on 85% of M/X flares between our resulting AIA catalog and the GOES flare catalog. However, we also discover a number of large flares unrecorded or mislabeled in the GOES catalog.
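The peak-finding step described above, applied to the time series of summed AIA cutout intensity, can be sketched as a simple local-maximum search. The threshold parameter and function name are assumptions; the paper's algorithm also derives flare start and end times around each peak, which this sketch omits.

```python
import numpy as np

def find_flare_peaks(intensity, threshold):
    """Hypothetical sketch of the catalog's peak-finding step: given a
    time series of intensity summed across an AIA cutout (computed
    upstream), flag local maxima above a threshold as flare peaks.
    The threshold is an assumption; the paper's algorithm additionally
    defines flare start and end times."""
    intensity = np.asarray(intensity, dtype=float)
    peaks = []
    for i in range(1, len(intensity) - 1):
        if (intensity[i] > threshold
                and intensity[i] >= intensity[i - 1]
                and intensity[i] > intensity[i + 1]):
            peaks.append(i)
    return peaks
```

Each detected peak index would then be matched by time and active region to a SHARP cutout, giving the AR-level flare labels used for ML flare prediction.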

     