Search for: All records

Award ID contains: 2001670


  1. Context. Machine-learning methods for predicting solar flares typically employ physics-based features that have been carefully chosen by experts in order to capture the salient features of the photospheric magnetic fields of the Sun. Aims. Though the sophistication and complexity of these models have grown over time, there has been little evolution in the choice of feature sets, or any systematic study of whether the additional model complexity leads to higher predictive skill. Methods. This study compares the relative prediction performance of four different machine-learning based flare prediction models with increasing degrees of complexity. It evaluates three different feature sets as input to each model: a “traditional” physics-based feature set, a novel “shape-based” feature set derived from topological data analysis (TDA) of the solar magnetic field, and a combination of these two sets. A systematic hyperparameter tuning framework is employed in order to assure fair comparisons of the models across different feature sets. Finally, principal component analysis is used to study the effects of dimensionality reduction on these feature sets. Results. It is shown that simpler models with fewer free parameters perform better than the more complicated models on the canonical 24-h flare forecasting problem. In other words, more complex machine-learning architectures do not necessarily guarantee better prediction performance. In addition, it is found that shape-based feature sets contain just as much useful information as physics-based feature sets for the purpose of flare prediction, and that the dimension of these feature sets – particularly the shape-based one – can be greatly reduced without impacting predictive accuracy. 
  2. Supervised machine learning (ML) models for solar flare prediction rely on accurate labels for a given input data set, commonly obtained from the GOES/XRS X-ray flare catalog. With increasing interest in utilizing ultraviolet (UV) and extreme ultraviolet (EUV) image data as input to these models, we seek to understand whether flaring activity can be defined and quantified using EUV data alone. This would allow us to move away from the GOES single-pixel measurement definition of flares and use the same data for label creation that we use for flare prediction. In this work, we present a Solar Dynamics Observatory (SDO) Atmospheric Imaging Assembly (AIA)-based flare catalog covering flares of GOES X-ray magnitudes C, M, and X from 2010 to 2017. We use active region (AR) cutouts of full-disk AIA images to match the corresponding SDO/Helioseismic and Magnetic Imager (HMI) SHARPs (Space weather HMI Active Region Patches) that have been extensively used in ML flare prediction studies, thus allowing for labeling of AR number as well as flare magnitude and timing. Flare start, peak, and end times are defined using a peak-finding algorithm on AIA time series data obtained by summing the intensity across the AIA cutouts. An extremely randomized trees (ERT) regression model is used to map SDO/AIA flare magnitudes to GOES X-ray magnitude, achieving a low-variance regression. We find an accurate overlap on 85% of M/X flares between our resulting AIA catalog and the GOES flare catalog. However, we also discover a number of large flares unrecorded or mislabeled in the GOES catalog. 
  3. Scaling regions—intervals on a graph where the dependent variable depends linearly on the independent variable—abound in dynamical systems, notably in calculations of invariants like the correlation dimension or a Lyapunov exponent. In these applications, scaling regions are generally selected by hand, a process that is subjective and often challenging due to noise, algorithmic effects, and confirmation bias. In this paper, we propose an automated technique for extracting and characterizing such regions. Starting with a two-dimensional plot—e.g., the values of the correlation integral, calculated using the Grassberger–Procaccia algorithm over a range of scales—we create an ensemble of intervals by considering all possible combinations of end points, generating a distribution of slopes from least squares fits weighted by the length of the fitting line and the inverse square of the fit error. The mode of this distribution gives an estimate of the slope of the scaling region (if it exists). The end points of the intervals that correspond to the mode provide an estimate for the extent of that region. When there is no scaling region, the distributions will be wide and the resulting error estimates for the slope will be large. We demonstrate this method for computations of dimension and Lyapunov exponent for several dynamical systems and show that it can be useful in selecting values for the parameters in time-delay reconstructions. 
  4. null (Ed.)
  5. Current operational forecasts of solar eruptions are made by human experts using a combination of qualitative shape-based classification systems and historical data about flaring frequencies. In the past decade, there has been a great deal of interest in crafting machine-learning (ML) flare-prediction methods to extract underlying patterns from a training set – e.g. a set of solar magnetogram images, each characterized by features derived from the magnetic field and labeled as to whether it was an eruption precursor. These patterns, captured by various methods (neural nets, support vector machines, etc.), can then be used to classify new images. A major challenge with any ML method is the featurization of the data: pre-processing the raw images to extract higher-level properties, such as characteristics of the magnetic field, that can streamline the training and use of these methods. It is key to choose features that are informative for the task at hand. To date, the majority of ML-based solar eruption prediction methods have used physics-based magnetic and electric field features such as the total unsigned magnetic flux, the gradients of the fields, the vertical current density, etc. In this paper, we extend the relevant feature set to include characteristics of the magnetic field that are based purely on the geometry and topology of 2D magnetogram images and show that this improves the prediction accuracy of a neural-net-based flare-prediction method. 
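
The ensemble procedure described in item 3 (fit every interval formed by a pair of end points, weight each slope by the length of the fitted line and the inverse square of the fit error, then take the mode of the weighted slope distribution) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `estimate_scaling_region`, the `min_points`, `bins`, and `eps` parameters, the use of a weighted histogram to locate the mode, and the choice of the single highest-weight interval in the modal bin as the extent estimate are all hypothetical choices made for this example.

```python
import numpy as np


def estimate_scaling_region(x, y, min_points=5, bins=50, eps=1e-12):
    """Estimate the slope and extent of a scaling region in the curve (x, y).

    Assumed simplifications: the mode is taken from a weighted histogram,
    and `eps` guards against division by zero when a fit is exact.
    Returns (slope_estimate, (x_start, x_end)).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    slopes, weights, spans = [], [], []
    # Consider every interval [i, j) that contains at least `min_points` points.
    for i in range(n - min_points + 1):
        for j in range(i + min_points, n + 1):
            xs, ys = x[i:j], y[i:j]
            slope, intercept = np.polyfit(xs, ys, 1)  # least-squares line
            resid = ys - (slope * xs + intercept)
            err = np.sqrt(np.mean(resid ** 2))        # RMS fit error
            dx = xs[-1] - xs[0]
            length = np.hypot(dx, slope * dx)         # length of fitted segment
            slopes.append(slope)
            weights.append(length / (err ** 2 + eps))
            spans.append((float(xs[0]), float(xs[-1])))

    slopes = np.array(slopes)
    weights = np.array(weights)

    # Weighted distribution of slopes; its mode estimates the scaling slope.
    hist, edges = np.histogram(slopes, bins=bins, weights=weights)
    k = int(np.argmax(hist))
    in_bin = (slopes >= edges[k]) & (slopes <= edges[k + 1])
    slope_est = float(np.average(slopes[in_bin], weights=weights[in_bin]))

    # Extent: end points of the highest-weight interval within the modal bin.
    best = np.flatnonzero(in_bin)[np.argmax(weights[in_bin])]
    return slope_est, spans[best]


if __name__ == "__main__":
    # Synthetic curve: linear with slope 2 on [3, 7], curved outside it.
    x = np.linspace(0.0, 10.0, 41)
    y = np.where(x < 3, 2 * x + (3 - x) ** 2,
                 np.where(x > 7, 2 * x - (x - 7) ** 2, 2 * x))
    print(estimate_scaling_region(x, y))
```

On the noise-free synthetic curve in the demo, the estimate should land close to a slope of 2 over the linear stretch. With real, noisy data a smoother density estimate than a fixed-bin histogram may be preferable, and a wide or multi-peaked distribution would suggest, as the abstract notes, that no clear scaling region exists.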