- Award ID(s):
- 2025541
- PAR ID:
- 10356865
- Date Published:
- Journal Name:
- Communications Materials
- Volume:
- 3
- Issue:
- 1
- ISSN:
- 2662-4443
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
This paper describes a generalizable framework for creating context-aware wall-time prediction models for HPC applications. This framework: (a) cost-effectively generates comprehensive application-specific training data, (b) provides an application-independent machine learning pipeline that trains different regression models over the training datasets, and (c) establishes context-aware criteria for model selection. We explain how most of the training data can be generated on commodity or contention-free cyberinfrastructure and how the predictive models can be scaled to the production environment with the help of a limited number of resource-intensive generated runs (we show almost seven-fold cost reductions along with better performance). Our machine learning pipeline performs feature transformation and dimensionality reduction, then reduces sampling bias induced by data imbalance. Our context-aware model selection algorithm chooses the most appropriate regression model for a given target application, one that reduces the number of underpredictions while minimizing overestimation errors. Index Terms: AI4CI, Data Science Workflow, Custom ML Models, HPC, Data Generation, Scheduling, Resource Estimations
-
Solar energetic particle (SEP) events, originating from solar flares and Coronal Mass Ejections, present significant hazards to space exploration and technology on Earth. Accurate prediction of these high‐energy events is essential for safeguarding astronauts, spacecraft, and electronic systems. In this study, we conduct an in‐depth investigation into the application of multimodal data fusion techniques for the prediction of high‐energy SEP events, particularly ∼100 MeV events. Our research utilizes six machine learning (ML) models, each finely tuned for time series analysis, including Univariate Time Series (UTS), Image‐based model (Image), Univariate Feature Concatenation (UFC), Univariate Deep Concatenation (UDC), Univariate Deep Merge (UDM), and Univariate Score Concatenation (USC). By combining time series proton flux data with solar X‐ray images, we exploit complementary insights into the underlying solar phenomena responsible for SEP events. Rigorous evaluation metrics, including accuracy, F1‐score, and other established measures, are applied, along with K‐fold cross‐validation, to ensure the robustness and generalization of our models. Additionally, we explore the influence of observation window sizes on classification accuracy.
-
Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA’s Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun’s magnetic activity. HMI provides continuous full-disk observations of the solar vector magnetic field with high-cadence data that lead to reliable predictive capability; yet solar flare prediction efforts utilizing these data are still limited. In this paper, we present a machine-learning-as-a-service (MLaaS) framework, called DeepSun, for predicting solar flares on the web based on HMI’s data products. Specifically, we construct training data by utilizing the physical parameters provided by the Space-weather HMI Active Region Patch (SHARP) and categorize solar flares into four classes, namely B, C, M and X, according to the X-ray flare catalogs available at the National Centers for Environmental Information (NCEI). Thus, the solar flare prediction problem at hand is essentially a multi-class (i.e., four-class) classification problem. The DeepSun system employs several machine learning algorithms to tackle this multi-class prediction problem and provides an application programming interface (API) for remote programming users. To our knowledge, DeepSun is the first MLaaS tool capable of predicting solar flares through the internet.
-
Non-stoichiometric perovskite oxides have been studied as a new family of redox oxides for solar thermochemical hydrogen (STCH) production owing to their favourable thermodynamic properties. However, conventional perovskite oxides suffer from limited phase stability and kinetic properties, and poor cyclability. Here, we report a strategy of introducing A-site multi-principal-component mixing to develop a high-entropy perovskite oxide, (La1/6Pr1/6Nd1/6Gd1/6Sr1/6Ba1/6)MnO3 (LPNGSB_Mn), which shows desirable thermodynamic and kinetic properties as well as excellent phase stability and cycling durability. LPNGSB_Mn exhibits enhanced hydrogen production (∼77.5 mmol/mol-oxide) compared to (La2/3Sr1/3)MnO3 (∼53.5 mmol/mol-oxide) in a short 1 hour redox duration and high STCH and phase stability for 50 cycles. LPNGSB_Mn possesses a moderate enthalpy of reduction (252.51–296.32 kJ/mol-oxide), a high entropy of reduction (126.95–168.85 J/mol-oxide), and fast surface oxygen exchange kinetics. None of the A-site cations show observable valence changes during the reduction and oxidation processes. This research preliminarily explores the use of one A-site high-entropy perovskite oxide for STCH.
-
Machine learning (ML) accelerates the exploration of material properties and their links to the structure of the underlying molecules. In previous work [Shi et al. ACS Applied Materials & Interfaces 2022, 14, 37161−37169.], ML models were applied to predict the adhesive free energy of polymer–surface interactions with high accuracy from knowledge of the sequence data, demonstrating success in the inverse design of polymer sequences for known surface compositions. While the method was shown to be successful in designing polymers for a known surface, extensive data sets were needed for each specific surface in order to train the surrogate models. Ideally, one should be able to infer information about similar surfaces without having to regenerate a full complement of adhesion data for each new case. In the current work, we demonstrate a transfer learning (TL) technique using a deep neural network to improve the accuracy of ML models trained on small data sets by pretraining on a larger database from a related system and fine-tuning the weights of all layers with a small amount of additional data. The shared knowledge from the pretrained model significantly improves prediction accuracy on small data sets. We also explore the limits of database size on accuracy and the optimal tuning of network architecture and parameters for our learning tasks. While applied to a relatively simple coarse-grained (CG) polymer model, the general lessons of this study apply to detailed modeling studies and the broader problems of inverse materials design.