


Title: Bitmap Filtered Line of Sight HMI Active Region Patches with Augmentations
Dataset Description

This dataset consists of processed line-of-sight (LoS) magnetogram images of active regions (ARs) from the Helioseismic and Magnetic Imager (HMI) onboard the Solar Dynamics Observatory (SDO). The images are derived from the definitive series of the Space-Weather HMI Active Region Patches (SHARP) data product and cover the period from May 2010 to 2018, sampled hourly.

Dataset Contents:

Processed Magnetogram Images: Each image represents a cropped and standardized view of an AR patch, extracted and adjusted from the original magnetograms. These images have been filtered and standardized to a size of 512×512 pixels.

Processing Steps:

  1. Cropping: Magnetograms are cropped using bitmaps that define the region of interest within the AR patches. Regions smaller than 70 pixels in width are excluded.
  2. Flux Adjustment: Magnetic flux values are capped at ±256 G, and values within ±25 G are set to 0 to minimize noise.
  3. Standardization: Patches are resized to 512×512 pixels. Smaller patches are zero-padded; for larger patches, a 512×512 kernel selects the region with the maximum total unsigned flux (USFLUX).
  4. Normalization: Final images are scaled to fit within the range of 0-255.

Data Dictionary:

  harp_N1_N2: These tar files contain folders that hold the AR patches with HARP numbers N1 to N2.
  complete_hourly_dataset.csv: This file lists the hourly sampled magnetograms along with their associated GOES flare class, assuming a 24-hour forecast horizon.
  augmentations: Five different augmentations of the AR patches corresponding to GOES flare classes greater than C, assuming a 24-hour forecast horizon, are provided as five separate tar files. Look for: horizontal flip, vertical_flip, add noise, polarity change, and gaussian blur.
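The processing steps in the dataset description can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the dataset authors' code: the function name `preprocess_patch`, the use of the bitmap's bounding box for cropping, the centered zero-padding, the inclusive ±25 G threshold, and the linear mapping of [-256, 256] G onto [0, 255] are all assumptions made here for illustration.

```python
import numpy as np


def preprocess_patch(mag, bitmap, size=512, cap=256.0, noise=25.0, min_width=70):
    """Sketch of the described pipeline for one AR patch.

    mag    : 2D array of LoS field values in gauss
    bitmap : 2D boolean mask marking the region of interest (assumed format)
    Returns a (size, size) uint8 image, or None if the region is excluded.
    """
    mag = np.asarray(mag, dtype=float)
    mask = np.asarray(bitmap, dtype=bool)

    # Cropping: keep the bounding box of the mask (assumption) and zero
    # everything outside the mask itself.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    r0, r1 = rows[0], rows[-1] + 1
    c0, c1 = cols[0], cols[-1] + 1
    if c1 - c0 < min_width:  # regions narrower than min_width are excluded
        return None
    patch = np.where(mask, mag, 0.0)[r0:r1, c0:c1]

    # Flux adjustment: cap at +/-cap and zero out values within +/-noise.
    patch = np.clip(patch, -cap, cap)
    patch[np.abs(patch) <= noise] = 0.0

    # Standardization, part 1: zero-pad any dimension shorter than `size`.
    h, w = patch.shape
    ph, pw = max(size - h, 0), max(size - w, 0)
    patch = np.pad(patch, ((ph // 2, ph - ph // 2), (pw // 2, pw - pw // 2)))

    # Standardization, part 2: slide a size x size window and keep the
    # position with the largest total unsigned flux (via an integral image).
    s = np.zeros((patch.shape[0] + 1, patch.shape[1] + 1))
    s[1:, 1:] = np.abs(patch).cumsum(axis=0).cumsum(axis=1)
    win = s[size:, size:] - s[:-size, size:] - s[size:, :-size] + s[:-size, :-size]
    i, j = np.unravel_index(np.argmax(win), win.shape)
    patch = patch[i:i + size, j:j + size]

    # Normalization: map [-cap, cap] linearly onto [0, 255] (assumption).
    return np.rint((patch + cap) / (2.0 * cap) * 255.0).astype(np.uint8)
```

Padding first and then searching for the best window handles patches that exceed 512 pixels in one dimension but not the other; whether the original pipeline treats that mixed case the same way is not stated in the description.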
Publisher / Repository:
Harvard Dataverse
Date Published:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper we present several methods to identify precursors that show great promise for early predictions of solar flare events. A data preprocessing pipeline is built to extract useful data from multiple sources, Geostationary Operational Environmental Satellites (GOES) and Solar Dynamics Observatory (SDO)/Helioseismic and Magnetic Imager (HMI), to prepare inputs for machine learning algorithms. Two classification models are presented: classification of flares from quiet times for active regions and classification of strong versus weak flare events. We adopt deep learning algorithms to capture both spatial and temporal information from HMI magnetogram data. Effective feature extraction and feature selection with raw magnetogram data using deep learning and statistical algorithms enable us to train classification models whose performance is almost as good as that obtained using the active region parameters provided in HMI/Space‐Weather HMI‐Active Region Patch (SHARP) data files. Case studies show a significant increase in the prediction score around 20 hr before strong solar flare events.

  2. Supervised Machine Learning (ML) models for solar flare prediction rely on accurate labels for a given input data set, commonly obtained from the GOES/XRS X-ray flare catalog. With increasing interest in utilizing ultraviolet (UV) and extreme ultraviolet (EUV) image data as input to these models, we seek to understand whether flaring activity can be defined and quantified using EUV data alone. This would allow us to move away from the GOES single-pixel measurement definition of flares and use the same data for label creation that we use for flare prediction. In this work, we present a Solar Dynamics Observatory (SDO) Atmospheric Imaging Assembly (AIA)-based flare catalog covering flares of GOES X-ray magnitudes C, M and X from 2010 to 2017. We use active region (AR) cutouts of full-disk AIA images to match the corresponding SDO/Helioseismic and Magnetic Imager (HMI) SHARPs (Space-weather HMI Active Region Patches) that have been extensively used in ML flare prediction studies, thus allowing for labeling of AR number as well as flare magnitude and timing. Flare start, peak, and end times are defined using a peak-finding algorithm on AIA time series data obtained by summing the intensity across the AIA cutouts. An extremely randomized trees (ERT) regression model is used to map SDO/AIA flare magnitudes to GOES X-ray magnitude, achieving a low-variance regression. We find an accurate overlap on 85% of M/X flares between our resulting AIA catalog and the GOES flare catalog. However, we also discover a number of large flares unrecorded or mislabeled in the GOES catalog.

  3. Magnetic polarity inversion lines (PILs) in solar active regions are key to triggering flares and eruptions. Recently, engineered PIL features have been used for predicting solar eruptions. Derived from the original PIL dataset, using line-of-sight (LoS) magnetograms provided by the Solar Dynamics Observatory's (SDO) Helioseismic and Magnetic Imager (HMI) Active Region Patches (HARPs), we provide a publicly available comprehensive dataset in a supervised format, where each instance includes a raster of the PILs, a raster of the polarity convex hull, and a multivariate time series of properties related to PILs. Using historical SDO-GOES integrated flare data covering May 2010 to January 2019, we have assigned each instance its corresponding flare class: FQ, C, M, or X. By integrating these diverse data modalities, our approach aims to improve the accuracy of solar flare predictions. Initial findings suggest that the multimodal approach can uncover new patterns and relationships, potentially leading to breakthroughs in predictive accuracy and more effective mitigation strategies against the impacts of solar activity.
  4. Using non-linear force-free field (NLFFF) extrapolation, 3D magnetic fields were modeled from 12-min cadence Solar Dynamics Observatory Helioseismic and Magnetic Imager (HMI) photospheric vector magnetograms, spanning the period from 1 hour before to 1 hour after the start of 18 X-class and 12 M-class solar flares. Several magnetic field parameters were calculated directly from the modeled fields, as well as from the power spectrum of surface maps generated by summing the fields along the vertical axis, for two different regions: areas with photospheric |Bz| ≥ 300 G (active region, AR) and areas above the photosphere where the magnitude of the non-potential field (BNP) is more than three standard deviations above the mean |BNP| of the AR field and either the unsigned twist number |Tw| ≥ 1 turn or the shear angle Ψ ≥ 80° (non-potential region, NPR). Superposed epoch (SPE) plots of the magnetic field parameters were analyzed to investigate the evolution of the 3D solar field during the solar flare events and to discern consistent trends across all solar flare events in the dataset, as well as across subsets of flare events categorized by their magnetic and sunspot classifications. The relationship between different flare properties and the magnetic field parameters was quantitatively described by the Spearman rank correlation coefficient, rs. The parameters that showed the most consistent and discernible trends among the flare events, particularly for the hour leading up to the eruption, were the total unsigned flux (ϕ), free magnetic energy (EFree), total unsigned magnetic twist (τTot), and total unsigned free magnetic twist (ρTot). Strong (|rs| ∈ [0.6, 0.8)) to very strong (|rs| ∈ [0.8, 1.0]) correlations were found between the magnetic field parameters and the following flare properties: peak X-ray flux, duration, rise time, decay time, impulsiveness, and integrated flux; the strongest correlation coefficients calculated for these flare properties were 0.62, 0.85, 0.73, 0.82, −0.81, and 0.82, respectively.
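For readers unfamiliar with the Spearman rank correlation and the strength bands quoted above, a small pure-Python sketch may help. It computes rs as the Pearson correlation of average ranks (the standard tie-aware definition) and maps |rs| onto the two bands named in the abstract. The function names are hypothetical, and the abstract does not say which implementation the authors used.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rs is undefined when one input is constant")
    return cov / (sx * sy)


def strength_label(rs):
    """Bands from the abstract; weaker bands are not defined there."""
    a = abs(rs)
    if 0.8 <= a <= 1.0:
        return "very strong"
    if 0.6 <= a < 0.8:
        return "strong"
    return None
```

Because the bands apply to |rs|, the reported impulsiveness coefficient of −0.81 falls in the "very strong" band despite its negative sign.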

  5. Solar flares are transient space weather events that pose a significant threat to space- and ground-based technological systems, making their precise and reliable prediction crucial for mitigating potential impacts. This paper contributes to the growing body of research on deep learning methods for solar flare prediction, focusing primarily on often-overlooked near-limb flares and using attribution methods to provide post hoc qualitative explanations of the model's predictions. We present a solar flare prediction model that is trained on hourly full-disk line-of-sight magnetogram images and operates in a binary prediction mode to forecast ≥M-class flares that may occur within the following 24-hour period. To address the class imbalance, we combine data augmentation and class weighting techniques, and we evaluate the overall performance of our model using the true skill statistic (TSS) and Heidke skill score (HSS). Moreover, we apply three attribution methods, namely Guided Gradient-weighted Class Activation Mapping, Integrated Gradients, and Deep Shapley Additive Explanations, to interpret our model's predictions and cross-validate them against the explanations. Our analysis reveals that full-disk prediction of solar flares aligns with characteristics related to active regions (ARs). In particular, the key findings of this study are: (1) our deep learning models achieved an average TSS∼0.51 and HSS∼0.35, and the results further demonstrate a competent capability to predict near-limb solar flares; and (2) the qualitative analysis of the model's explanations indicates that our model identifies and uses features associated with ARs in central and near-limb locations from full-disk magnetograms to make its predictions. In other words, our models learn the shape- and texture-based characteristics of flaring ARs even when they are in near-limb areas, a novel and critical capability with significant implications for operational forecasting.
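The two skill scores named in this abstract can be computed from a binary confusion matrix. The sketch below uses the commonly cited definitions: TSS as recall minus false alarm rate, and HSS in the form often labeled HSS2 in the flare forecasting literature. The abstract does not state which HSS variant the authors used, so that choice is an assumption, as are the function names.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = TP / (TP + FN) - FP / (FP + TN); ranges from -1 to 1."""
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("TSS needs at least one positive and one negative sample")
    return tp / (tp + fn) - fp / (fp + tn)


def heidke_skill_score(tp, fn, fp, tn):
    """HSS (HSS2 form, assumed): skill relative to a random forecast.

    HSS = 2 (TP*TN - FN*FP) / ((TP+FN)(FN+TN) + (TP+FP)(FP+TN))
    """
    denom = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    if denom == 0:
        raise ValueError("HSS is undefined for this confusion matrix")
    return 2.0 * (tp * tn - fn * fp) / denom


# Hypothetical counts for illustration only (not results from the paper).
tp, fn, fp, tn = 40, 10, 20, 130
tss = true_skill_statistic(tp, fn, fp, tn)
hss = heidke_skill_score(tp, fn, fp, tn)
```

TSS does not depend on the ratio of flaring to non-flaring samples, which is one reason it is often preferred for class-imbalanced flare data, while HSS does depend on that ratio.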