skip to main content


Title: Towards coupling full-disk and active region-based flare prediction for operational space weather forecasting
Solar flare prediction is a central problem in space weather forecasting and has captivated the attention of a wide spectrum of researchers due to recent advances in both remote sensing as well as machine learning and deep learning approaches. The experimental findings based on both machine and deep learning models reveal significant performance improvements for task specific datasets. Along with building models, the practice of deploying such models to production environments under operational settings is a more complex and often time-consuming process which is often not addressed directly in research settings. We present a set of new heuristic approaches to train and deploy an operational solar flare prediction system for ≥M1.0-class flares with two prediction modes: full-disk and active region-based. In full-disk mode, predictions are performed on full-disk line-of-sight magnetograms using deep learning models whereas in active region-based models, predictions are issued for each active region individually using multivariate time series data instances. The outputs from individual active region forecasts and full-disk predictors are combined to a final full-disk prediction result with a meta-model. We utilized an equal weighted average ensemble of two base learners’ flare probabilities as our baseline meta learner and improved the capabilities of our two base learners by training a logistic regression model. The major findings of this study are: 1) We successfully coupled two heterogeneous flare prediction models trained with different datasets and model architecture to predict a full-disk flare probability for next 24 h, 2) Our proposed ensembling model, i.e., logistic regression, improves on the predictive performance of two base learners and the baseline meta learner measured in terms of two widely used metrics True Skill Statistic (TSS) and Heidke Skill Score (HSS), and 3) Our result analysis suggests that the logistic regression-based ensemble (Meta-FP) improves on the full-disk model (base learner) by ∼9% in terms TSS and ∼10% in terms of HSS. Similarly, it improves on the AR-based model (base learner) by ∼17% and ∼20% in terms of TSS and HSS respectively. Finally, when compared to the baseline meta model, it improves on TSS by ∼10% and HSS by ∼15%.  more » « less
Award ID(s):
1931555 2104004
NSF-PAR ID:
10401764
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Frontiers in Astronomy and Space Sciences
Volume:
9
ISSN:
2296-987X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Solar flares are transient space weather events that pose a significant threat to space and ground-based technological systems, making their precise and reliable prediction crucial for mitigating potential impacts. This paper contributes to the growing body of research on deep learning methods for solar flare prediction, primarily focusing on highly overlooked near-limb flares and utilizing the attribution methods to provide a post hoc qualitative explanation of the model’s predictions. We present a solar flare prediction model, which is trained using hourly full-disk line-of-sight magnetogram images and employs a binary prediction mode to forecast ≥M-class flares that may occur within the following 24-hour period. To address the class imbalance, we employ a fusion of data augmentation and class weighting techniques; and evaluate the overall performance of our model using the true skill statistic (TSS) and Heidke skill score (HSS). Moreover, we applied three attribution methods, namely Guided Gradient-weighted Class Activation Mapping, Integrated Gradients, and Deep Shapley Additive Explanations, to interpret and cross-validate our model’s predictions with the explanations. Our analysis revealed that full-disk prediction of solar flares aligns with characteristics related to active regions (ARs). In particular, the key findings of this study are: (1) our deep learning models achieved an average TSS∼0.51 and HSS∼0.35, and the results further demonstrate a competent capability to predict near-limb solar flares and (2) the qualitative analysis of the model’s explanation indicates that our model identifies and uses features associated with ARs in central and near-limb locations from full-disk magnetograms to make corresponding predictions. In other words, our models learn the shape and texture-based characteristics of flaring ARs even when they are at near-limb areas, which is a novel and critical capability that has significant implications for operational forecasting. 
    more » « less
  2. Lossio-Ventura J.A. ; Valverde-Rebaza J. ; Diaz E. ; Muñante D. ; Gavidia-Calderon C. ; Baria Valejo A.D. ; Alatrista-Salas H. (Ed.)
    The efforts in solar flare prediction have been engendered by the advancements in machine learning and deep learning methods. We present a new approach to flare prediction using full-disk compressed magnetogram images with Convolutional Neural Networks. We selected three prediction modes, among which two are binary for predicting the occurrence of ≥M1.0 and ≥C4.0 class flares and one is a multi-class mode for predicting the occurrence of more » « less
  3. null (Ed.)
    Propensity score methods account for selection bias in observational studies. However, the consistency of the propensity score estimators strongly depends on a correct specification of the propensity score model. Logistic regression and, with increasing popularity, machine learning tools are used to estimate propensity scores. We introduce a stacked generalization ensemble learning approach to improve propensity score estimation by fitting a meta learner on the predictions of a suitable set of diverse base learners. We perform a comprehensive Monte Carlo simulation study, implementing a broad range of scenarios that mimic characteristics of typical data sets in educational studies. The population average treatment effect is estimated using the propensity score in Inverse Probability of Treatment Weighting. Our proposed stacked ensembles, especially using gradient boosting machines as a meta learner trained on a set of 12 base learner predictions, led to superior reduction of bias compared to the current state-of-the-art in propensity score estimation. Further, our simulations imply that commonly used balance measures (averaged standardized absolute mean differences) might be misleading as propensity score model selection criteria. We apply our proposed model - which we call GBM-Stack - to assess the population average treatment effect of a Supplemental Instruction (SI) program in an introductory psychology (PSY 101) course at San Diego State University. Our analysis provides evidence that moving the whole population to SI attendance would on average lead to 1.69 times higher odds to pass the PSY 101 class compared to not offering SI, with a 95% bootstrap confidence interval of (1.31, 2.20). 
    more » « less
  4. Abstract

    In this paper we present several methods to identify precursors that show great promise for early predictions of solar flare events. A data preprocessing pipeline is built to extract useful data from multiple sources, Geostationary Operational Environmental Satellites and Solar Dynamics Observatory (SDO)/Helioseismic and Magnetic Imager (HMI), to prepare inputs for machine learning algorithms. Two classification models are presented: classification of flares from quiet times for active regions and classification of strong versus weak flare events. We adopt deep learning algorithms to capture both spatial and temporal information from HMI magnetogram data. Effective feature extraction and feature selection with raw magnetogram data using deep learning and statistical algorithms enable us to train classification models to achieve almost as good performance as using active region parameters provided in HMI/Space‐Weather HMI‐Active Region Patch (SHARP) data files. Case studies show a significant increase in the prediction score around 20 hr before strong solar flare events.

     
    more » « less
  5. Abstract

    We develop a mixed long short‐term memory (LSTM) regression model to predict the maximum solar flare intensity within a 24‐hr time window 0–24, 6–30, 12–36, and 24–48 hr ahead of time using 6, 12, 24, and 48 hr of data (predictors) for each Helioseismic and Magnetic Imager (HMI) Active Region Patch (HARP). The model makes use of (1) the Space‐Weather HMI Active Region Patch (SHARP) parameters as predictors and (2) the exact flare intensities instead of class labels recorded in the Geostationary Operational Environmental Satellites (GOES) data set, which serves as the source of the response variables. Compared to solar flare classification, the model offers us more detailed information about the exact maximum flux level, that is, intensity, for each occurrence of a flare. We also consider classification models built on top of the regression model and obtain better results in solar flare classifications as compared to Chen et al. (2019,https://doi.org/10.1029/2019SW002214). Our results suggest that the most efficient time period for predicting the solar activity is within 24 hr before the prediction time using the SHARP parameters and the LSTM model.

     
    more » « less