Title: Data‐Driven Forecasting of Low‐Latitude Ionospheric Total Electron Content Using the Random Forest and LSTM Machine Learning Methods
Abstract In this research, we present data‐driven forecasting of ionospheric total electron content (TEC) using the Long Short‐Term Memory (LSTM) deep recurrent neural network method. The random forest machine learning method was used to perform a regression analysis and estimate the variable importance of the input parameters. The input data are obtained from satellite and ground‐based measurements characterizing the solar‐terrestrial environment. We estimate the relative importance of 34 different parameters, including the solar flux, solar wind density and speed, the three components of the interplanetary magnetic field, Lyman‐alpha, and the Kp, Dst, and Polar Cap (PC) indices. The TEC measurements are taken with 15‐s cadence from an equatorial GPS station located at Bogotá, Colombia (4.7110° N, 74.0721° W). The 2008–2017 data set, including the top five parameters estimated using the random forest, is used for training the machine learning models, and the 2018 data set is used for independent testing of the LSTM forecasting. The LSTM method was applied to forecast the TEC up to 5 h ahead, with 30‐min cadence. The results indicate that good forecasts with low root mean square (RMS) error (high correlation) can be made at short lead times, and that the RMS errors increase as we forecast further into the future.
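The evaluation described in the abstract (forecasts up to 5 h ahead at 30‐min cadence, scored by RMS error at each lead time) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 12‐sample input window, the synthetic sinusoidal series, and the persistence baseline (used here in place of a trained LSTM) are all assumptions made for illustration.

```python
import math

STEPS_AHEAD = 10  # 5 h ahead at 30-min cadence = 10 forecast steps


def make_windows(series, n_inputs, n_outputs=STEPS_AHEAD):
    """Split a 1-D series into (input window, target window) pairs."""
    pairs = []
    last_start = len(series) - n_inputs - n_outputs
    for start in range(last_start + 1):
        inputs = series[start:start + n_inputs]
        targets = series[start + n_inputs:start + n_inputs + n_outputs]
        pairs.append((inputs, targets))
    return pairs


def persistence_forecast(inputs, n_outputs=STEPS_AHEAD):
    """Baseline stand-in for a trained model: repeat the last observed value."""
    return [inputs[-1]] * n_outputs


def rmse_per_horizon(pairs, forecast_fn, n_outputs=STEPS_AHEAD):
    """Root mean square error at each lead time, averaged over all windows."""
    sq_err = [0.0] * n_outputs
    for inputs, targets in pairs:
        preds = forecast_fn(inputs)
        for h in range(n_outputs):
            sq_err[h] += (preds[h] - targets[h]) ** 2
    return [math.sqrt(s / len(pairs)) for s in sq_err]


# Synthetic stand-in for a TEC series: a 24 h sinusoid sampled every 30 min.
# Real TEC values would come from the GPS station data, not from this formula.
synthetic_tec = [20.0 + 10.0 * math.sin(2 * math.pi * i / 48) for i in range(480)]

pairs = make_windows(synthetic_tec, n_inputs=12)  # window length is an assumption
errors = rmse_per_horizon(pairs, persistence_forecast)
```

Any model that maps an input window to ten future values can be passed as `forecast_fn`, so the same scoring loop would apply to a trained network. For this synthetic series the persistence error grows with lead time, the same qualitative trend the abstract reports for the LSTM forecasts.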
Award ID(s):
1933056
PAR ID:
10450805
Publisher / Repository:
DOI PREFIX: 10.1029
Journal Name:
Space Weather
Volume:
19
Issue:
6
ISSN:
1542-7390
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Photospheric magnetic field parameters are frequently used to analyze and predict solar events. Observation of these parameters over time, i.e., representing solar events by multivariate time-series (MVTS) data, can determine relationships between magnetic field states in active regions and extreme solar events, e.g., solar flares. We can improve our understanding of these events by selecting the most relevant parameters that give the highest predictive performance. In this study, we propose a two-step incremental feature selection method for MVTS data using a deep-learning model based on long short-term memory (LSTM) networks. First, each MVTS feature (magnetic field parameter) is evaluated individually by a univariate sequence classifier utilizing an LSTM network. Then, the top performing features are combined to produce input for an LSTM-based multivariate sequence classifier. Finally, we tested the discrimination ability of the selected features by training downstream classifiers, e.g., Minimally Random Convolutional Kernel Transform and support vector machine. We performed our experiments using a benchmark data set for flare prediction known as Space Weather Analytics for Solar Flares. We compared our proposed method with three other baseline feature selection methods and demonstrated that our method selects more discriminatory features compared to other methods. Due to the imbalanced nature of the data, primarily caused by the rarity of minority flare classes (e.g., the X and M classes), we used the true skill statistic as the evaluation metric. Finally, we reported the set of photospheric magnetic field parameters that give the highest discrimination performance in predicting flare classes. 
  2. Power grid operators rely on solar irradiance forecasts to manage uncertainty and variability associated with solar power. Meteorological factors such as cloud cover, wind direction, and wind speed affect irradiance and are associated with a high degree of variability and uncertainty. Statistical models fail to accurately capture the dependence between these factors and irradiance. In this paper, we introduce the idea of applying multivariate Gated Recurrent Units (GRU) to forecast Direct Normal Irradiance (DNI) hourly. The proposed GRU-based forecasting method is evaluated against traditional Long Short-Term Memory (LSTM) using historical irradiance data (i.e., weather variables that include cloud cover, wind direction, and wind speed) to forecast irradiance over intra-hour and inter-hour intervals. Our evaluation on one of the sites from the Measurement and Instrumentation Data Center indicates that both GRU and LSTM improved DNI forecasting performance when evaluated under different conditions. Moreover, including wind direction and wind speed can substantially improve the accuracy of DNI forecasts. In addition, the forecasting model can accurately forecast irradiance values over multiple forecasting horizons. 
  3. Abstract. Machine learning is quickly becoming a commonly used technique for wind speed and power forecasting. Many machine learning methods utilize exogenous variables as input features, but there remains the question of which atmospheric variables are most beneficial for forecasting, especially in handling non-linearities that lead to forecasting error. This question is addressed via creation of a hybrid model that utilizes an autoregressive integrated moving-average (ARIMA) model to make an initial wind speed forecast followed by a random forest model that attempts to predict the ARIMA forecasting error using knowledge of exogenous atmospheric variables. Variables conveying information about atmospheric stability and turbulence as well as inertial forcing are found to be useful in dealing with non-linear error prediction. Streamwise wind speed, time of day, turbulence intensity, turbulent heat flux, vertical velocity, and wind direction are found to be particularly useful when used in unison for hourly and 3 h timescales. The prediction accuracy of the developed ARIMA–random forest hybrid model is compared to that of the persistence and bias-corrected ARIMA models. The ARIMA–random forest model is shown to improve upon the latter commonly employed modeling methods, reducing hourly forecasting error by up to 5 % below that of the bias-corrected ARIMA model and achieving an R² value of 0.84 with true wind speed. 
  4. Abstract This study evaluates the performance of a deep learning approach in the prediction of the ionospheric total electron content (TEC) during magnetically quiet periods. Two deep learning techniques, long short‐term memory (LSTM) and convolutional LSTM (ConvLSTM), are employed to predict TEC values 24 hr ahead in the vicinity of the Korean Peninsula (26.5°–40°N, 121°–134.5°E). The LSTM method predicts TEC at a single point based on the time series of data at that point, whereas the ConvLSTM method simultaneously predicts TEC values at multiple points using the spatiotemporal distribution of TEC. Both the LSTM and ConvLSTM models are trained using the complete regional TEC maps reconstructed by applying the Deep Convolutional Generative Adversarial Network–Poisson Blending (DCGAN‐PB) method to observed TEC data. The training period spans from 2002 to 2018, and the model performance is evaluated using 2019 data. Our results show that the ConvLSTM method outperforms the LSTM method, generating more reliable TEC maps with smaller root mean square errors when compared to the ground truth (DCGAN‐PB TEC maps). This outcome indicates that deep learning models can improve the prediction accuracy of TEC at a specific point by taking into account spatial information of TEC. We conclude that ConvLSTM is a reliable and efficient approach for prompt ionospheric prediction. 
  5. Abstract A novel method for real-time solar generation forecasting using weather data, while exploiting both spatial and temporal structural dependencies, is proposed. The network observed over time is projected to a lower-dimensional representation where a variety of weather measurements are used to train a structured regression model, while weather forecasts are used at the inference stage. Experiments were conducted at 288 locations in the San Antonio, TX area on data obtained from the National Solar Radiation Database. The model predicts solar irradiance with good accuracy (R² of 0.91 for the summer, 0.85 for the winter, and 0.89 for the global model). The best accuracy was obtained by the Random Forest Regressor. Multiple experiments were conducted to characterize the influence of missing data and different time horizons, providing evidence that the new algorithm is robust for data missing not only completely at random but also when the mechanism is spatial and temporal. 
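The true skill statistic used as the evaluation metric in item 1 above is the difference between the recall and the false alarm rate, which makes it insensitive to class imbalance. A minimal sketch follows; the function name and the confusion-matrix counts are hypothetical and are not taken from any of the listed studies.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = recall - false alarm rate, ranging from -1 to 1."""
    recall = tp / (tp + fn)            # fraction of events correctly predicted
    false_alarm_rate = fp / (fp + tn)  # fraction of non-events flagged as events
    return recall - false_alarm_rate


# Hypothetical counts for a rare-event classifier (illustrative only).
tss = true_skill_statistic(tp=40, fn=10, fp=50, tn=900)
```

A perfect classifier scores 1, and a classifier that flags events and non-events at the same rate scores 0 regardless of how rare the events are.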