Creators/Authors contains: "Ahmadzadeh, Azim"

  1. MAGFiLO is a dataset of manually annotated solar filaments from H-Alpha observations captured by the Global Oscillation Network Group (GONG). This dataset includes over ten thousand annotated filaments, spanning the years 2011 through 2022. Each annotation details one filament's segmentation, minimum bounding box, spine, and magnetic field chirality. MAGFiLO is the first dataset of its size, enabling advanced deep learning models to identify filaments and their features with unprecedented precision. It also provides a testbed for solar physicists interested in large-scale analysis of filaments. 
  2. The class-imbalance issue is intrinsic to many real-world machine learning tasks, particularly to rare-event classification problems. Although the impact and treatment of imbalanced data are widely known, the magnitude of a metric’s sensitivity to class imbalance has attracted little attention. As a result, sensitive metrics are often dismissed even though their sensitivity may be only marginal. In this paper, we introduce an intuitive evaluation framework that quantifies metrics’ sensitivity to class imbalance. Moreover, we reveal that metrics’ sensitivity behaves logarithmically, meaning that higher imbalance ratios are associated with lower sensitivity of metrics. Our framework builds an intuitive understanding of the impact of class imbalance on metrics. We believe this can help avoid many common mistakes, especially the less-emphasized and incorrect assumption that all metrics’ values are comparable under different class-imbalance ratios. 
  3. Aiming to assess the progress and current challenges on the formidable problem of the prediction of solar energetic events since the COSPAR/International Living With a Star (ILWS) Roadmap paper of Schrijver et al. (2015), we attempt an overview of the current status of global research efforts. By solar energetic events we refer to flares, coronal mass ejections (CMEs), and solar energetic particle (SEP) events. The emphasis, therefore, is on the prediction methods of solar flares and eruptions, as well as their associated SEP manifestations. This work complements the COSPAR International Space Weather Action Teams (ISWAT) review paper on the understanding of solar eruptions by Linton et al. (2023) (hereafter, ISWAT review papers are conventionally referred to as ‘Cluster’ papers, given the ISWAT structure). Understanding solar flares and eruptions as instabilities occurring above the nominal background of solar activity is a core solar physics problem. We show that effectively predicting them stands on two pillars: physics and statistics. With statistical methods appearing at an increasing pace over the last 40 years, the last two decades have brought the critical realization that data science needs to be involved as well, as volumes of diverse ground- and space-based data give rise to a Big Data landscape that cannot be handled, let alone processed, with conventional statistics. Dimensionality reduction in immense parameter spaces, with the dual aim of both interpreting and forecasting solar energetic events, has brought in artificial intelligence (AI) methodologies, in variants of machine and deep learning, developed particularly for tackling Big Data problems. With interdisciplinarity firmly present, we outline an envisioned framework on which statistical and AI methodologies should be verified in terms of performance and validated against each other. We emphasize that a homogenized and streamlined method validation is another open challenge. 
     The performance of the plethora of methods is typically far from perfect, with physical reasons to blame, besides practical shortcomings: imperfect data, data gaps, and a lack of multiple, meaningful vantage points of solar observations. We also briefly discuss these issues, which shape our desired short- and long-term objectives for an efficient future predictive capability. A central aim of this article is to trigger meaningful, targeted discussions that will compel the community to adopt standards for performance verification and validation, which could be maintained and enriched by institutions such as NASA’s Community Coordinated Modeling Center (CCMC) and the community-driven COSPAR/ISWAT initiative. 
  4. We present a case study of solar flare forecasting by means of metadata feature time series, by treating it as a prominent class-imbalance and temporally coherent problem. Taking full advantage of pre-flare time series in solar active regions is made possible via the Space Weather Analytics for Solar Flares (SWAN-SF) benchmark data set, a partitioned collection of multivariate time series of active region properties comprising 4075 regions and spanning over 9 yr of the Solar Dynamics Observatory period of operations. We showcase the general concept of temporal coherence triggered by the demand of continuity in time series forecasting and show that lack of proper understanding of this effect may spuriously enhance models’ performance. We further address another well-known challenge in rare-event prediction, namely, the class-imbalance issue. The SWAN-SF is an appropriate data set for this, with a 60:1 imbalance ratio for GOES M- and X-class flares and an 800:1 imbalance ratio for X-class flares against flare-quiet instances. We revisit the main remedies for these challenges and present several experiments to illustrate the exact impact that each of these remedies may have on performance. Moreover, we acknowledge that some basic data manipulation tasks such as data normalization and cross validation may also impact the performance; we discuss these problems as well. In this framework we also review the primary advantages and disadvantages of using the true skill statistic and the Heidke skill score, two widely used performance verification metrics for the flare-forecasting task. In conclusion, we show and advocate for the benefits of time series versus point-in-time forecasting, provided that the above challenges are measurably and quantitatively addressed. 
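
As a rough illustration of the class-imbalance theme in items 2 and 4, the sketch below computes the true skill statistic (TSS) and the Heidke skill score (HSS) from confusion-matrix counts and evaluates them at several imbalance ratios. The 60:1 and 800:1 ratios are the ones quoted for SWAN-SF; the hit rate of 0.8, the true-negative rate of 0.9, the sample size, and all function names are assumptions chosen for illustration and are not taken from the listed papers.

```python
"""Illustrative only: how TSS and HSS respond to class imbalance."""


def confusion_counts(tpr, tnr, n_pos, n_neg):
    """Expected confusion-matrix counts for a classifier with fixed rates."""
    tp = tpr * n_pos
    fn = n_pos - tp
    tn = tnr * n_neg
    fp = n_neg - tn
    return tp, fn, fp, tn


def tss(tp, fn, fp, tn):
    """True skill statistic: hit rate minus false-alarm rate."""
    return tp / (tp + fn) - fp / (fp + tn)


def hss(tp, fn, fp, tn):
    """Heidke skill score: improvement over random-chance agreement."""
    numerator = 2.0 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


def scores_at_ratio(ratio, tpr=0.8, tnr=0.9, n_pos=100):
    """Return (TSS, HSS) when negatives outnumber positives ratio:1.

    The default rates and sample size are hypothetical placeholders.
    """
    counts = confusion_counts(tpr, tnr, n_pos, n_pos * ratio)
    return tss(*counts), hss(*counts)


if __name__ == "__main__":
    for ratio in (1, 60, 800):
        t, h = scores_at_ratio(ratio)
        print(f"{ratio:>4}:1  TSS={t:.3f}  HSS={h:.3f}")
```

With the per-class rates held fixed, TSS stays the same at every ratio because it depends only on those rates, while HSS falls as the negative class dominates. This is one concrete reason why values of an imbalance-sensitive metric should not be compared across data sets with different imbalance ratios.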