Creators/Authors contains: "Aydin, Berkay"


  1. Solar flares are transient space weather events that pose a significant threat to space and ground-based technological systems, making their precise and reliable prediction crucial for mitigating potential impacts. This paper contributes to the growing body of research on deep learning methods for solar flare prediction, focusing primarily on the often overlooked near-limb flares and using attribution methods to provide a post hoc qualitative explanation of the model’s predictions. We present a solar flare prediction model that is trained on hourly full-disk line-of-sight magnetogram images and employs a binary prediction mode to forecast ≥M-class flares that may occur within the following 24-hour period. To address the class imbalance, we combine data augmentation with class weighting, and we evaluate the overall performance of our model using the true skill statistic (TSS) and Heidke skill score (HSS). Moreover, we applied three attribution methods, namely Guided Gradient-weighted Class Activation Mapping, Integrated Gradients, and Deep Shapley Additive Explanations, to interpret and cross-validate our model’s predictions with the explanations. Our analysis revealed that full-disk prediction of solar flares aligns with characteristics related to active regions (ARs). In particular, the key findings of this study are: (1) our deep learning models achieved an average TSS∼0.51 and HSS∼0.35, and the results further demonstrate a competent capability to predict near-limb solar flares, and (2) the qualitative analysis of the model’s explanations indicates that our model identifies and uses features associated with ARs in central and near-limb locations from full-disk magnetograms to make the corresponding predictions. In other words, our models learn the shape- and texture-based characteristics of flaring ARs even when they are in near-limb areas, which is a novel and critical capability with significant implications for operational forecasting. 
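The two skill scores named in this abstract can be computed from a binary confusion matrix. The following is a minimal sketch, assuming the commonly used definitions (TSS as recall minus false alarm rate, and the two-class HSS measured against random chance). The function names and the example counts are illustrative and are not taken from the paper.

```python
def tss(tp, fn, fp, tn):
    """True skill statistic: recall minus false alarm rate."""
    return tp / (tp + fn) - fp / (fp + tn)


def hss(tp, fn, fp, tn):
    """Heidke skill score: improvement over a random-chance forecast."""
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for an imbalanced test set (50 flaring, 950 quiet).
    counts = dict(tp=40, fn=10, fp=100, tn=850)
    print(round(tss(**counts), 3))  # 0.695
    print(round(hss(**counts), 3))  # 0.375
```

Both scores equal 1 for a perfect forecast; in this example the many false alarms lower HSS more than TSS.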
  2. Solar flare prediction is a central problem in space weather forecasting. Existing solar flare prediction tools are mainly dependent on the GOES classification system, and models commonly use a proxy of maximum (peak) X-ray flux measurement over a particular prediction window to label instances. However, the background X-ray flux dramatically fluctuates over a solar cycle and often misleads both flare detection and flare prediction models during solar minimum, leading to an increase in false alarms. We aim to enhance the accuracy of flare prediction methods by introducing novel labeling regimes that integrate relative increases and cumulative measurements over prediction windows. Our results show that the data-driven labels can offer more precise prediction capabilities and complement the existing efforts. 
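To make the contrast between labeling regimes concrete, the sketch below compares a conventional peak-flux label with two alternatives computed over the same prediction window. It is only an illustration: the abstract does not give the exact label definitions, so the `Flare` record, its `background_flux` field, and the two helper functions are assumptions. The only external fact used is the GOES M1.0 threshold of 1e-5 W/m^2.

```python
from dataclasses import dataclass

M_CLASS_FLUX = 1e-5  # W/m^2, GOES M1.0 peak X-ray flux threshold


@dataclass
class Flare:
    peak_flux: float        # W/m^2
    background_flux: float  # W/m^2, pre-flare background (hypothetical field)


def peak_label(flares, threshold=M_CLASS_FLUX):
    """Conventional label: does the largest peak flux in the window reach the threshold?"""
    return max((f.peak_flux for f in flares), default=0.0) >= threshold


def max_relative_increase(flares):
    """Assumed relative measure: largest peak-to-background ratio in the window."""
    return max((f.peak_flux / f.background_flux for f in flares), default=0.0)


def cumulative_flux(flares):
    """Assumed cumulative measure: sum of peak fluxes in the window."""
    return sum(f.peak_flux for f in flares)


if __name__ == "__main__":
    # Two hypothetical C-class flares in one prediction window.
    window = [Flare(4e-6, 1e-7), Flare(7e-6, 5e-7)]
    print(peak_label(window))
    print(max_relative_increase(window))
    print(cumulative_flux(window) >= M_CLASS_FLUX)
```

In this toy window no single flare reaches M1.0, so the peak-flux label is negative, while the summed flux exceeds the same threshold. That difference is the kind of information a cumulative label could add.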
  3. Over the past two decades, machine learning and deep learning techniques for forecasting solar flares have generated great impact due to their ability to learn from a high-dimensional data space. However, the lack of high-quality data from flaring phenomena is a constraining factor for such tasks. One way to tackle this complex problem is to train classifiers on multivariate time series of magnetic field parameters. In this work, we compare the exceedingly popular multivariate time series classifiers that apply deep learning techniques with commonly used machine learning classifiers (i.e., SVM). We intend to explore the role of data augmentation in time series oriented flare prediction techniques, specifically the deep learning-based ones. We utilize four time series data augmentation techniques and couple them with selected multivariate time series classifiers to understand how each of them affects the outcome. In the end, we show that the deep learning algorithms as well as the augmentation techniques improve our classifiers’ performance. After augmentation, the resulting classifiers outperformed the traditional flare forecasting techniques. 
  4. Abstract

    Magnetic polarity inversion lines (PILs) detected in solar active regions have long been recognized as arguably the most essential feature for triggering instabilities such as flares and eruptive events (i.e., eruptive flares and coronal mass ejections). In recent years, efforts have been focused on using features engineered from PILs for solar eruption prediction. However, PIL rasters and metadata are often generated as by-products and are not accessible for public use, which limits their utilization in data-intensive space weather analytics applications. We introduce a large-scale publicly available PIL data set covering practically the entire solar cycle 24 for applying to various space weather forecasting and analytics tasks. The data set is created using both radial magnetic field (B_r) and line-of-sight (B_LoS) magnetograms from the Solar Dynamics Observatory’s Helioseismic and Magnetic Imager Active Region Patches (HARP) that involve 4090 HARP series ranging from 2010 May to 2019 March. This data set includes three PIL-related binary masks of rasters: the actual PILs as per the spatial analysis of the magnetograms, the region of polarity inversion, and the convex hull of PILs, along with time-series-structured metadata extracted from these masks. We also provide a preliminary exploratory analysis of selected features aiming to correlate time series of feature metadata and eruptive activity originating from active regions. We envision that this comprehensive PIL data set will complement existing data sets used for space weather forecasting and benefit research in related areas, specifically in better understanding the PIL structure, evolution, and role in eruptions.
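One simple way to picture a PIL mask is as the set of strong-field pixels that touch a strong-field pixel of opposite polarity. The sketch below illustrates only that idea. The threshold value, the 8-neighbour rule, and the function name `pil_mask` are hypothetical and do not describe the detection procedure used to build the data set.

```python
def pil_mask(magnetogram, threshold=100.0):
    """Return a 0/1 mask marking strong pixels adjacent to strong opposite polarity.

    `magnetogram` is a list of equal-length rows of signed field values.
    """
    rows, cols = len(magnetogram), len(magnetogram[0])
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            b = magnetogram[r][c]
            if abs(b) < threshold:
                continue  # weak-field pixels are ignored
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                        nb = magnetogram[rr][cc]
                        # Mark the pixel if a strong neighbour has the opposite sign.
                        if abs(nb) >= threshold and nb * b < 0:
                            mask[r][c] = 1
    return mask


if __name__ == "__main__":
    # Toy patch: positive polarity on the left, negative on the right.
    patch = [[300.0, 300.0, -300.0, -300.0] for _ in range(3)]
    for row in pil_mask(patch):
        print(row)
```

On this toy patch only the two middle columns are marked, since they are the only pixels touching the opposite polarity.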

     
  5. Solar flare prediction is a central problem in space weather forecasting and has captivated the attention of a wide spectrum of researchers due to recent advances in remote sensing as well as in machine learning and deep learning approaches. The experimental findings based on both machine and deep learning models reveal significant performance improvements for task-specific datasets. Along with building models, deploying such models to production environments under operational settings is a more complex and time-consuming process that is often not addressed directly in research settings. We present a set of new heuristic approaches to train and deploy an operational solar flare prediction system for ≥M1.0-class flares with two prediction modes: full-disk and active region-based. In full-disk mode, predictions are performed on full-disk line-of-sight magnetograms using deep learning models, whereas in active region-based mode, predictions are issued for each active region individually using multivariate time series data instances. The outputs from individual active region forecasts and full-disk predictors are combined into a final full-disk prediction result with a meta-model. We utilized an equal weighted average ensemble of two base learners’ flare probabilities as our baseline meta learner and improved the capabilities of our two base learners by training a logistic regression model. The major findings of this study are: 1) We successfully coupled two heterogeneous flare prediction models trained with different datasets and model architectures to predict a full-disk flare probability for the next 24 h, 2) Our proposed ensembling model, i.e., logistic regression, improves on the predictive performance of the two base learners and the baseline meta learner measured in terms of two widely used metrics, True Skill Statistic (TSS) and Heidke Skill Score (HSS), and 3) Our result analysis suggests that the logistic regression-based ensemble (Meta-FP) improves on the full-disk model (base learner) by ∼9% in terms of TSS and ∼10% in terms of HSS. Similarly, it improves on the AR-based model (base learner) by ∼17% and ∼20% in terms of TSS and HSS, respectively. Finally, when compared to the baseline meta model, it improves on TSS by ∼10% and HSS by ∼15%. 
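The two meta learners described in this abstract, an equal weighted average and a logistic regression over the base learners' probabilities, can be sketched in a few lines. This is a minimal illustration, not the Meta-FP implementation. It assumes each instance has already been reduced to one full-disk probability and one aggregated AR-based probability, and the plain gradient descent fit, its hyperparameters, and the toy data are all hypothetical.

```python
import math


def average_ensemble(p_full_disk, p_ar):
    """Baseline meta learner: equal weighted average of the two probabilities."""
    return 0.5 * (p_full_disk + p_ar)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias for two input probabilities by batch gradient descent."""
    w, b, n = [0.0, 0.0], 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for (x1, x2), target in zip(X, y):
            err = sigmoid(w[0] * x1 + w[1] * x2 + b) - target
            gw[0] += err * x1
            gw[1] += err * x2
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def predict(w, b, p_full_disk, p_ar):
    """Meta-model flare probability from the two base learners' outputs."""
    return sigmoid(w[0] * p_full_disk + w[1] * p_ar + b)


if __name__ == "__main__":
    # Hypothetical (full-disk, AR-based) probabilities with flare / no-flare labels.
    X = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9), (0.2, 0.3), (0.1, 0.4), (0.3, 0.1)]
    y = [1, 1, 1, 0, 0, 0]
    w, b = fit_logistic(X, y)
    print(average_ensemble(0.8, 0.6))
    print(predict(w, b, 0.8, 0.6))
```

Unlike the fixed average, the fitted model can weight one base learner more heavily and shift the decision threshold through its bias term.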
  6. To integrate CME data with their solar sources, we perform a confidence-based scoring process that involves spatial and temporal data integration. For each GOES >C1.0 flare, we identify the likely CME candidate(s) with a temporal search using the start and peak times of flares and the first detection time of CMEs. In this step, we check whether the CME’s first detection time is between 30 minutes before the flare’s start time and 60 minutes after the flare’s peak time. Then, for each potential CME candidate, if any, we generate a confidence score from 1 to 5 (lowest to highest) by checking four additional criteria, where each criterion gives an extra confidence point for the association: (1) Determine a potential one-to-one mapping between a given flare and a CME, in the case of only a single CME that satisfies the temporal search. (2) Check if the flare’s principal angle is in the same solar-disk quadrant as the CME’s principal angle, with an 8 degree margin assumed for boundary conditions. (3) Check if the difference between the flare’s principal angle and the CME’s principal angle (i.e., the difference angle) is less than the CME’s observed width. (4) Check if the difference angle is less than a 60 degree threshold; this threshold is designated for wide CMEs (i.e., Halo and partial Halo), whose width almost always fulfills the width-based criterion in the previous check, and aims to provide a stricter level of confidence. At the end of the integration procedure, CMEs are associated with flares using the maximum confidence score and are therefore connected to their most likely solar source. If there are multiple CMEs with the same high score, only those with the lowest difference angle between the flare and the CME will be connected. Because this process is not manually verified, we expect it to generate certain ‘noisy’ data points. However, for overall model building, these data points should have a minimal statistical significance and hence a minimal impact. 
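The scoring procedure described in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Flare` and `CME` record layouts and all function names are hypothetical, angles are assumed to be in degrees on a 0 to 360 scale, and the 8 degree quadrant margin is interpreted here as shifting the flare angle by up to 8 degrees in either direction.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Flare:
    start: datetime
    peak: datetime
    angle: float  # principal angle in degrees


@dataclass
class CME:
    first_seen: datetime
    angle: float  # principal angle in degrees
    width: float  # observed angular width in degrees


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def same_quadrant(flare_angle, cme_angle, margin=8.0):
    """Quadrant match, letting the flare angle shift by the margin (assumed reading)."""
    cme_q = int((cme_angle % 360.0) // 90)
    return any(int(((flare_angle + s) % 360.0) // 90) == cme_q
               for s in (-margin, 0.0, margin))


def candidates(flare, cmes):
    """Temporal search: 30 min before flare start to 60 min after flare peak."""
    lo = flare.start - timedelta(minutes=30)
    hi = flare.peak + timedelta(minutes=60)
    return [c for c in cmes if lo <= c.first_seen <= hi]


def score(flare, cme, n_candidates):
    """Confidence from 1 to 5: one base point plus one point per satisfied criterion."""
    d = angle_diff(flare.angle, cme.angle)
    return 1 + sum([n_candidates == 1,
                    same_quadrant(flare.angle, cme.angle),
                    d < cme.width,
                    d < 60.0])


def associate(flare, cmes):
    """Return (best CME, score), breaking score ties by smallest difference angle."""
    cands = candidates(flare, cmes)
    if not cands:
        return None
    best = max(cands, key=lambda c: (score(flare, c, len(cands)),
                                     -angle_diff(flare.angle, c.angle)))
    return best, score(flare, best, len(cands))


if __name__ == "__main__":
    flare = Flare(datetime(2014, 1, 1, 12, 0), datetime(2014, 1, 1, 12, 20), 100.0)
    narrow = CME(datetime(2014, 1, 1, 12, 30), 110.0, 40.0)
    halo = CME(datetime(2014, 1, 1, 13, 10), 250.0, 360.0)
    print(associate(flare, [narrow, halo]))
```

In the example, both CMEs pass the temporal search, but only the narrow one satisfies the quadrant and 60 degree checks, so it receives the higher score and is linked to the flare.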
  7. Magnetic polarity inversion lines (PILs) detected in solar active regions have long been recognized as arguably the most essential feature for triggering instabilities such as flares and eruptive events (i.e., eruptive flares and coronal mass ejections). In recent years, efforts have been focused on using features engineered from PILs for solar eruption prediction. However, PIL rasters and metadata are often generated as byproducts and are not accessible for public use, which limits their utilization in data-intensive space weather analytics applications. We introduce a large-scale publicly available PIL dataset covering practically the entire solar cycle 24 for use in various space weather forecasting and analytics tasks. The dataset is created using line-of-sight (LoS) magnetograms from the Solar Dynamics Observatory's (SDO) Helioseismic and Magnetic Imager (HMI) Active Region Patches (HARPs) that involve 4,090 HARP series ranging from May 2010 to March 2019. This dataset includes three PIL-related binary masks of rasters: the actual PILs as per the spatial analysis of the magnetograms, the region of polarity inversion (RoPI), and the convex hull of PILs (convex closure of the set of detected PILs), along with time series structured metadata extracted from these masks. 
  8. Abstract We present a case study of solar flare forecasting by means of metadata feature time series, by treating it as a prominent class-imbalance and temporally coherent problem. Taking full advantage of pre-flare time series in solar active regions is made possible via the Space Weather Analytics for Solar Flares (SWAN-SF) benchmark data set, a partitioned collection of multivariate time series of active region properties comprising 4075 regions and spanning over 9 yr of the Solar Dynamics Observatory period of operations. We showcase the general concept of temporal coherence triggered by the demand of continuity in time series forecasting and show that lack of proper understanding of this effect may spuriously enhance models’ performance. We further address another well-known challenge in rare-event prediction, namely, the class-imbalance issue. The SWAN-SF is an appropriate data set for this, with a 60:1 imbalance ratio for GOES M- and X-class flares and an 800:1 imbalance ratio for X-class flares against flare-quiet instances. We revisit the main remedies for these challenges and present several experiments to illustrate the exact impact that each of these remedies may have on performance. Moreover, we acknowledge that some basic data manipulation tasks such as data normalization and cross validation may also impact the performance; we discuss these problems as well. In this framework we also review the primary advantages and disadvantages of using true skill statistic and Heidke skill score, two widely used performance verification metrics for the flare-forecasting task. In conclusion, we show and advocate for the benefits of time series versus point-in-time forecasting, provided that the above challenges are measurably and quantitatively addressed. 