NSF PAR Search | NSF Public Access Repository

An End-to-end Ensemble Machine Learning Approach for Predicting High-impact Solar Energetic Particle Events Using Multimodal Data

https://doi.org/10.3847/1538-4365/adb1c4

Hosseinzadeh, Pouya; Filali_Boubrahimi, Soukaina; Hamdi, Shah Muhammad (March 2025, The Astrophysical Journal Supplement Series)

Solar energetic particle (SEP) events, in particular high-energy-range SEP events, pose significant risks to space missions, astronauts, and technological infrastructure. Accurate prediction of these high-impact events is crucial for mitigating potential hazards. In this study, we present an end-to-end ensemble machine learning (ML) framework for the prediction of high-impact ∼100 MeV SEP events. Our approach leverages diverse data modalities sourced from the Solar and Heliospheric Observatory and the Geostationary Operational Environmental Satellite, integrating extracted active region polygons from solar extreme ultraviolet (EUV) imagery, time-series proton flux measurements, sunspot activity data, and detailed active region characteristics. To quantify the predictive contribution of each data modality (e.g., EUV or time series), we evaluate each one independently using a range of ML models to assess its performance in forecasting SEP events. Finally, to enhance the SEP predictive performance, we train an ensemble learning model that combines all the models trained on individual data modalities, leveraging the strengths of each data modality. Our proposed ensemble approach shows promising performance, achieving a recall of 0.80 and 0.75 in balanced and imbalanced settings, respectively, underscoring the effectiveness of multimodal data integration for robust SEP event prediction and enhanced forecasting capabilities.
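The idea of combining per-modality models and scoring the result by recall can be sketched as follows. This is only an illustration: the abstract does not say how the ensemble is built, so simple soft voting (averaging probabilities) is used here as a stand-in, and the modality names, probabilities, and labels are hypothetical.

```python
def soft_vote(prob_by_modality, weights=None):
    """Average per-modality SEP probabilities into one ensemble probability per sample.

    Assumption: soft voting stands in for the (unspecified) ensemble model.
    """
    names = list(prob_by_modality)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    n_samples = len(prob_by_modality[names[0]])
    return [
        sum(weights[name] * prob_by_modality[name][i] for name in names) / total
        for i in range(n_samples)
    ]


def recall(y_true, y_pred):
    """Recall = TP / (TP + FN): the fraction of real SEP events that were caught."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical per-modality probabilities for five samples (not from the paper).
probs = {
    "euv_polygons": [0.9, 0.2, 0.6, 0.1, 0.4],
    "proton_flux": [0.8, 0.4, 0.3, 0.2, 0.7],
    "sunspots": [0.7, 0.1, 0.5, 0.3, 0.6],
}
y_true = [1, 0, 1, 0, 1]

ensemble = soft_vote(probs)
y_pred = [1 if p >= 0.5 else 0 for p in ensemble]
print("ensemble recall:", recall(y_true, y_pred))
```

Recall is a natural headline metric here because a missed SEP event (a false negative) is usually costlier than a false alarm.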
Free, publicly-accessible full text available March 17, 2026
Time-Series Feature Selection for Solar Flare Forecasting

https://doi.org/10.3390/universe10090373

Velanki, Yagnashree; Hosseinzadeh, Pouya; Boubrahimi, Soukaina Filali; Hamdi, Shah Muhammad (September 2024, Universe)

Solar flares are significant occurrences in solar physics, impacting space weather and terrestrial technologies. Accurate classification of solar flares is essential for predicting space weather and minimizing potential disruptions to communication, navigation, and power systems. This study addresses the challenge of selecting the most relevant features from multivariate time-series data, specifically focusing on solar flares. We employ methods such as Mutual Information (MI), Minimum Redundancy Maximum Relevance (mRMR), and Euclidean Distance to identify key features for classification. Recognizing the performance variability of different feature selection techniques, we introduce an ensemble approach to compute feature weights. By combining outputs from multiple methods, our ensemble method provides a more comprehensive understanding of the importance of features. Our results show that the ensemble approach significantly improves classification performance, achieving True Skill Statistic (TSS) values 0.15 higher than those of individual feature selection methods. Additionally, our method offers valuable insights into the underlying physical processes of solar flares, leading to more effective space weather forecasting and enhanced mitigation strategies for communication, navigation, and power system disruptions.
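A minimal sketch of how scores from several feature selection methods might be merged into ensemble feature weights, together with the TSS metric used for evaluation. The combination rule (min-max normalization followed by averaging), the feature names, and the scores are assumptions made for illustration; the abstract does not specify the weighting scheme. The sketch also assumes every method's score is oriented so that higher means more relevant.

```python
def min_max(scores):
    """Rescale one method's feature scores to [0, 1] so methods are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {f: ((s - lo) / span if span else 0.0) for f, s in scores.items()}


def ensemble_weights(method_scores):
    """Average the normalized scores of every method (assumed combination rule)."""
    normalized = [min_max(scores) for scores in method_scores.values()]
    features = list(normalized[0])
    return {f: sum(n[f] for n in normalized) / len(normalized) for f in features}


def tss(tp, fn, fp, tn):
    """True Skill Statistic = recall - false alarm rate; ranges from -1 to 1."""
    return tp / (tp + fn) - fp / (fp + tn)


# Hypothetical scores for three placeholder features (higher = more relevant).
method_scores = {
    "mutual_information": {"f1": 0.30, "f2": 0.10, "f3": 0.20},
    "mrmr": {"f1": 5.0, "f2": 1.0, "f3": 4.0},
    "euclidean_distance": {"f1": 2.0, "f2": 2.5, "f3": 1.0},
}

weights = ensemble_weights(method_scores)
ranking = sorted(weights, key=weights.get, reverse=True)
print("ranking:", ranking)
print("TSS example:", tss(tp=40, fn=10, fp=20, tn=80))
```

TSS is insensitive to the class ratio, which is one reason it is commonly preferred over accuracy for rare events such as strong flares.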
Full Text Available
Enhancing Monthly Streamflow Prediction Using Meteorological Factors and Machine Learning Models in the Upper Colorado River Basin

https://doi.org/10.3390/hydrology11050066

Thota, Saichand; Nassar, Ayman; Filali_Boubrahimi, Soukaina; Hamdi, Shah Muhammad; Hosseinzadeh, Pouya (May 2024, Hydrology)

Streamflow prediction is crucial for planning future developments and safety measures along river basins, especially in the face of changing climate patterns. In this study, we utilized monthly streamflow data from the United States Bureau of Reclamation and meteorological data (snow water equivalent, temperature, and precipitation) from the various weather monitoring stations of the Snow Telemetry Network within the Upper Colorado River Basin to forecast monthly streamflow at Lees Ferry, a specific location along the Colorado River in the basin. Four machine learning models (Random Forest Regression, Long Short-Term Memory, Gated Recurrent Unit, and Seasonal AutoRegressive Integrated Moving Average) were trained using 30 years of monthly data (1991–2020), split into 80% for training (1991–2014) and 20% for testing (2015–2020). Initially, only historical streamflow data were used for predictions, followed by including meteorological factors to assess their impact on streamflow. Subsequently, sequence analysis was conducted to explore various input-output sequence window combinations. We then evaluated the influence of each factor on streamflow by testing all possible combinations to identify the optimal feature combination for prediction. Our results indicate that the Random Forest Regression model consistently outperformed others, especially after integrating all meteorological factors with historical streamflow data. The best performance was achieved with a 24-month look-back period to predict 12 months of streamflow, yielding a Root Mean Square Error of 2.25 and R-squared (R²) of 0.80. Finally, to assess model generalizability, we tested the best model at other locations in the basin: Greenwood Springs (Colorado River), Maybell (Yampa River), and Archuleta (San Juan River).
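The input-output window setup described above (24 months of history to predict the next 12) and the chronological 80/20 split can be sketched as follows. The function name and the placeholder series are hypothetical; real inputs would also carry the meteorological features alongside streamflow.

```python
def make_windows(series, look_back=24, horizon=12):
    """Slide over a monthly series, pairing each look-back window with the
    horizon that immediately follows it."""
    pairs = []
    last_start = len(series) - look_back - horizon
    for start in range(last_start + 1):
        x = series[start:start + look_back]
        y = series[start + look_back:start + look_back + horizon]
        pairs.append((x, y))
    return pairs


# Placeholder for 30 years of monthly values (1991-2020); not real streamflow.
monthly_flow = list(range(360))

# Chronological split: first 24 years (288 months) train, last 6 years test.
train, test = monthly_flow[:288], monthly_flow[288:]

pairs = make_windows(monthly_flow)
train_pairs = make_windows(train)
print(len(pairs), "windows over the full record;", len(train_pairs), "in training")
```

Splitting by time rather than at random keeps future months out of the training windows, which matters for any forecasting evaluation.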
Full Text Available
Spatiotemporal Data Augmentation of MODIS‐Landsat Water Bodies Using Adversarial Networks

https://doi.org/10.1029/2023WR036342

Filali_Boubrahimi, Soukaina; Neema, Ashit; Nassar, Ayman; Hosseinzadeh, Pouya; Hamdi, Shah Muhammad (March 2024, Water Resources Research)

With increasing demands for precise water resource management, there is a growing need for advanced techniques in mapping water bodies. The currently deployed satellites provide complementary data that are either of high spatial or high temporal resolutions. As a result, there is a clear trade‐off between space and time when considering a single data source. For the efficient monitoring of multiple environmental resources, various Earth science applications need data at high spatial and temporal resolutions. To address this need, many data fusion methods that rely on combining data snapshots from multiple sources have been described in the literature. Traditional methods face limitations due to sensitivity to atmospheric disturbances and other environmental factors, resulting in noise, outliers, and missing data. This paper introduces Hydrological Generative Adversarial Network (Hydro‐GAN), a novel machine learning‐based method that utilizes modified GANs to enhance boundary accuracy when mapping low‐resolution MODIS data to high‐resolution Landsat‐8 images. We propose a new non‐saturating loss function for the Hydro‐GAN generator, which maximizes the log of discriminator probabilities to promote stable updates and aid convergence. By focusing on reducing squared differences between real and synthetic images, our approach enhances training stability and overall performance. We specifically focus on mapping water bodies using MODIS and Landsat‐8 imagery due to their relevance in water resource management tasks. Our experimental results demonstrate the effectiveness of Hydro‐GAN in generating high‐resolution water body maps, outperforming traditional methods in terms of boundary accuracy and overall quality.
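To make "maximizes the log of discriminator probabilities" concrete, the sketch below contrasts the standard saturating and non-saturating generator losses from the general GAN literature. It is not the Hydro‐GAN loss itself, whose exact form (including the squared-difference term) is not given in the abstract; the function names are illustrative.

```python
import math


def saturating_generator_loss(d_fake):
    """Minimax form: the generator minimizes mean(log(1 - D(G(z))))."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)


def non_saturating_generator_loss(d_fake):
    """Non-saturating form: minimize -mean(log D(G(z))),
    which is the same as maximizing the log of discriminator probabilities."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


def grad_saturating(p):
    """d/da log(1 - sigmoid(a)) = -p, where p = sigmoid(a) is D's output."""
    return -p


def grad_non_saturating(p):
    """d/da -log(sigmoid(a)) = -(1 - p)."""
    return -(1.0 - p)


# Early in training the discriminator easily rejects fakes, so D(G(z)) is small.
p_early = 0.01
print("saturating gradient:    ", grad_saturating(p_early))
print("non-saturating gradient:", grad_non_saturating(p_early))
```

When the discriminator confidently rejects generated images, the saturating gradient nearly vanishes while the non-saturating one stays close to its maximum magnitude, which is the usual argument for the latter's more stable updates.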
Full Text Available
Improving Solar Energetic Particle Event Prediction through Multivariate Time Series Data Augmentation

https://doi.org/10.3847/1538-4365/ad1de0

Hosseinzadeh, Pouya; Filali_Boubrahimi, Soukaina; Hamdi, Shah_Muhammad (February 2024, The Astrophysical Journal Supplement Series)

Solar energetic particles (SEPs) are associated with extreme solar events that can cause major damage to space- and ground-based life and infrastructure. High-intensity SEP events, particularly ∼100 MeV SEP events, can pose severe health risks for astronauts owing to radiation exposure and affect Earth’s orbiting satellites (e.g., Landsat and the International Space Station). A major challenge in the SEP event prediction task is the lack of adequate SEP data because of the rarity of these events. In this work, we aim to improve the prediction of ∼30, ∼60, and ∼100 MeV SEP events by synthetically increasing the number of SEP samples. We explore the use of univariate and multivariate time series of proton flux data as input to machine-learning-based prediction methods, such as time series forest (TSF). Our study covers solar cycles 22, 23, and 24. Our findings show that using data augmentation methods, such as the synthetic minority oversampling technique, remarkably increases the accuracy and F1-score of the classifiers used in this research, especially for TSF, where the average accuracy increased by 20%, reaching around 90% accuracy in the ∼100 MeV SEP prediction task. We also achieved higher prediction accuracy when using the multivariate time series data of the proton flux. Finally, we build a pipeline framework for our best-performing model, TSF, and provide a comprehensive hierarchical classification of the ∼100, ∼60, and ∼30 MeV and non-SEP prediction scenarios.
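The core step of the synthetic minority oversampling technique (SMOTE) is interpolation between a minority sample and one of its nearest minority neighbours. The sketch below shows that step only, on hypothetical flattened proton-flux windows; the function name, parameters, and data are assumptions, and the paper's actual augmentation pipeline may differ.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a random minority
    sample and one of its k nearest minority neighbours (squared Euclidean)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical SEP-class windows of proton flux (four time steps each).
sep_windows = [
    [1.0, 2.0, 8.0, 5.0],
    [1.5, 2.5, 9.0, 6.0],
    [0.5, 3.0, 7.0, 4.0],
]

new_windows = smote_like(sep_windows, n_new=5)
print(len(new_windows), "synthetic SEP windows generated")
```

Synthetic samples should be generated from the training split only, so that interpolated copies of test events do not inflate the reported scores.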
METFORC: Classification with Meta-Learning and Multimodal Stratified Time Series Forest

https://doi.org/10.1109/ICMLA58977.2023.00188

Hosseinzadeh, Pouya; Bahri, Omar; Li, Peiyu; Boubrahimi, Soukaina Filali; Hamdi, Shah Muhammad (December 2023, 2023 International Conference on Machine Learning and Applications (ICMLA))
Adversarial Attack Driven Data Augmentation for Time Series Classification

https://doi.org/10.1109/ICMLA58977.2023.00096

Li, Peiyu; Hosseinzadeh, Pouya; Bahri, Omar; Boubrahimi, Soukaina Filali; Hamdi, Shah Muhammad (December 2023, 2023 International Conference on Machine Learning and Applications (ICMLA))
Shapelet-Preserving Bootstrapping For Time Series Data Augmentation

https://doi.org/10.1109/ICMLA58977.2023.00069

Bahri, Omar; Li, Peiyu; Hosseinzadeh, Pouya; Boubrahimi, Soukaïna Filali; Hamdi, Shah Muhammad (December 2023, IEEE)
Toward Enhanced Prediction of High‐Impact Solar Energetic Particle Events Using Multimodal Time Series Data Fusion Models

https://doi.org/10.1029/2024SW003982

Hosseinzadeh, Pouya; Filali_Boubrahimi, Soukaina; Hamdi, Shah_Muhammad (June 2024, Space Weather)

Solar energetic particle (SEP) events, originating from solar flares and Coronal Mass Ejections, present significant hazards to space exploration and technology on Earth. Accurate prediction of these high‐energy events is essential for safeguarding astronauts, spacecraft, and electronic systems. In this study, we conduct an in‐depth investigation into the application of multimodal data fusion techniques for the prediction of high‐energy SEP events, particularly ∼100 MeV events. Our research utilizes six machine learning (ML) models, each finely tuned for time series analysis, including Univariate Time Series (UTS), Image‐based model (Image), Univariate Feature Concatenation (UFC), Univariate Deep Concatenation (UDC), Univariate Deep Merge (UDM), and Univariate Score Concatenation (USC). By combining time series proton flux data with solar X‐ray images, we exploit complementary insights into the underlying solar phenomena responsible for SEP events. Rigorous evaluation metrics, including accuracy, F1‐score, and other established measures, are applied, along with K‐fold cross‐validation, to ensure the robustness and generalization of our models. Additionally, we explore the influence of observation window sizes on classification accuracy.
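As a reminder of what K‐fold cross‐validation involves, here is a minimal index-splitting sketch. It is unshuffled and unstratified, and the function name is illustrative; the abstract does not state the value of K or whether stratification was used, and a stratified variant is generally advisable when positive events are rare.

```python
def k_fold_indices(n_samples, k=5):
    """Split sample indices into k contiguous folds; each fold serves once as
    the test set while the remaining folds form the training set."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=3)
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Averaging a metric over the folds gives a less split-dependent estimate than a single train/test partition.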
ML-Based Streamflow Prediction in the Upper Colorado River Basin Using Climate Variables Time Series Data

https://doi.org/10.3390/hydrology10020029

Hosseinzadeh, Pouya; Nassar, Ayman; Boubrahimi, Soukaina Filali; Hamdi, Shah Muhammad (February 2023, Hydrology)

Streamflow prediction plays a vital role in water resources planning in order to understand the dramatic change of climatic and hydrologic variables over different time scales. In this study, we used machine learning (ML)-based prediction models, including Random Forest Regression (RFR), Long Short-Term Memory (LSTM), Seasonal Auto-Regressive Integrated Moving Average (SARIMA), and Facebook Prophet (PROPHET), to predict natural streamflow 24 months ahead at the Lees Ferry site located at the bottom part of the Upper Colorado River Basin (UCRB) of the US. Firstly, we used only historic streamflow data to predict 24 months ahead. Secondly, we considered meteorological components such as temperature and precipitation as additional features. We tested the models on a monthly test dataset spanning 6 years, where 24-month predictions were repeated 50 times to ensure the consistency of the results. Moreover, we performed a sensitivity analysis to identify our best-performing model. Later, we analyzed the effects of considering different span window sizes on the quality of predictions made by our best model. Finally, we applied our best-performing model, RFR, to two more rivers in different states in the UCRB to test the model’s generalizability. We evaluated the performance of the predictive models using multiple evaluation measures. The predictions in multivariate time-series models were found to be more accurate, with RMSE less than 0.84 mm per month, R-squared more than 0.8, and MAPE less than 0.25. Therefore, we conclude that including the temperature and precipitation of the UCRB increases the accuracy of the predictions. Ultimately, we found that multivariate RFR performs the best among the four models and is generalizable to other rivers in the UCRB.
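The three evaluation measures quoted above can be computed as in the following sketch. MAPE is returned as a fraction rather than a percentage, which is an assumption based on the reported value of 0.25; the sample values are made up.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; undefined if any target is 0."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up monthly values for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print("RMSE:", rmse(observed, predicted))
print("R^2: ", r_squared(observed, predicted))
print("MAPE:", mape(observed, predicted))
```

RMSE is scale-dependent while R-squared and MAPE are not, so comparing models across rivers with very different flow volumes is easier with the latter two.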
Full Text Available
