

Title: Predicting the formation of fractionally doped perovskite oxides by a function-confined machine learning method
Abstract Fractionally doped perovskite oxides (FDPOs) are used in a wide range of applications, including energy conversion, storage, and harvesting, catalysis, sensing, superconductivity, and ferroelectric, piezoelectric, magnetic, and luminescent materials. Hence, an accurate, cost-effective, and easy-to-use methodology for discovering new compositions is much needed. Here, we developed a function-confined machine learning methodology to discover new FDPOs with high prediction accuracy from limited experimental data. By focusing on a specific application, namely solar thermochemical hydrogen production, we collected 632 training data points and defined 21 desirable features. Our gradient boosting classifier model achieved a high prediction accuracy of 95.4% and a high F1 score of 0.921. Furthermore, when verified on 36 additional experimental data points from existing literature, the model showed a prediction accuracy of 94.4%. With the help of this machine learning approach, we identified and synthesized 11 new FDPO compositions, 7 of which are relevant for solar thermochemical hydrogen production. We believe this confined machine learning methodology can be used to discover, from limited data, FDPOs for other specific applications.
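For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how accuracy and F1 score are computed from binary confusion-matrix counts. It is not the authors' code. The function names and the confusion-matrix split are hypothetical and chosen only for illustration. The one grounded figure is that 94.4% accuracy on 36 samples corresponds to 34 correct predictions (34/36 ≈ 0.944).

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical split of a 36-sample validation set (assumed, not from the paper):
# 10 true positives, 1 false positive, 1 false negative, 24 true negatives.
# Only the total (36) and the number of correct predictions (34) follow
# from the reported 94.4% accuracy.
val_accuracy = accuracy(tp=10, fp=1, fn=1, tn=24)
val_f1 = f1_score(tp=10, fp=1, fn=1)

print(f"accuracy = {val_accuracy:.1%}")
print(f"F1 score = {val_f1:.3f}")
```

Accuracy counts both classes equally, while F1 ignores true negatives, which is why the two numbers can differ noticeably on imbalanced data.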
Award ID(s):
2025541
PAR ID:
10356865
Journal Name:
Communications Materials
Volume:
3
Issue:
1
ISSN:
2662-4443
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper describes a generalizable framework for creating context-aware wall-time prediction models for HPC applications. This framework: (a) cost-effectively generates comprehensive application-specific training data, (b) provides an application-independent machine learning pipeline that trains different regression models over the training datasets, and (c) establishes context-aware criteria for model selection. We explain how most of the training data can be generated on commodity or contention-free cyberinfrastructure and how the predictive models can be scaled to the production environment with the help of a limited number of resource-intensive generated runs (we show almost seven-fold cost reductions along with better performance). Our machine learning pipeline performs feature transformation and dimensionality reduction, then reduces the sampling bias induced by data imbalance. Our context-aware model selection algorithm chooses, for a given target application, the regression model that reduces the number of underpredictions while minimizing overestimation errors. Index Terms: AI4CI, Data Science Workflow, Custom ML Models, HPC, Data Generation, Scheduling, Resource Estimations
  2. Abstract

    Solar energetic particle (SEP) events, originating from solar flares and Coronal Mass Ejections, present significant hazards to space exploration and technology on Earth. Accurate prediction of these high‐energy events is essential for safeguarding astronauts, spacecraft, and electronic systems. In this study, we conduct an in‐depth investigation into the application of multimodal data fusion techniques for the prediction of high‐energy SEP events, particularly ∼100 MeV events. Our research utilizes six machine learning (ML) models, each finely tuned for time series analysis, including Univariate Time Series (UTS), Image‐based model (Image), Univariate Feature Concatenation (UFC), Univariate Deep Concatenation (UDC), Univariate Deep Merge (UDM), and Univariate Score Concatenation (USC). By combining time series proton flux data with solar X‐ray images, we exploit complementary insights into the underlying solar phenomena responsible for SEP events. Rigorous evaluation metrics, including accuracy, F1‐score, and other established measures, are applied, along with K‐fold cross‐validation, to ensure the robustness and generalization of our models. Additionally, we explore the influence of observation window sizes on classification accuracy.

     
  3. Abstract Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA’s Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun’s magnetic activity. HMI provides continuous full-disk observations of the solar vector magnetic field with high-cadence data that lead to reliable predictive capability; yet, efforts to use these data for solar flare prediction are still limited. In this paper, we present a machine-learning-as-a-service (MLaaS) framework, called DeepSun, for predicting solar flares on the web based on HMI’s data products. Specifically, we construct training data by utilizing the physical parameters provided by the Space-weather HMI Active Region Patch (SHARP) and categorize solar flares into four classes, namely B, C, M and X, according to the X-ray flare catalogs available at the National Centers for Environmental Information (NCEI). Thus, the solar flare prediction problem at hand is essentially a multi-class (i.e., four-class) classification problem. The DeepSun system employs several machine learning algorithms to tackle this multi-class prediction problem and provides an application programming interface (API) for remote programming users. To our knowledge, DeepSun is the first MLaaS tool capable of predicting solar flares through the internet.
  4. Non-stoichiometric perovskite oxides have been studied as a new family of redox oxides for solar thermochemical hydrogen (STCH) production owing to their favourable thermodynamic properties. However, conventional perovskite oxides suffer from limited phase stability and kinetic properties, and poor cyclability. Here, we report a strategy of introducing A-site multi-principal-component mixing to develop a high-entropy perovskite oxide, (La1/6Pr1/6Nd1/6Gd1/6Sr1/6Ba1/6)MnO3 (LPNGSB_Mn), which shows desirable thermodynamic and kinetic properties as well as excellent phase stability and cycling durability. LPNGSB_Mn exhibits enhanced hydrogen production (∼77.5 mmol/mol-oxide) compared to (La2/3Sr1/3)MnO3 (∼53.5 mmol/mol-oxide) in a short 1 hour redox duration, and maintains high STCH production and phase stability for 50 cycles. LPNGSB_Mn possesses a moderate enthalpy of reduction (252.51–296.32 kJ/mol-oxide), a high entropy of reduction (126.95–168.85 J/mol-oxide), and fast surface oxygen exchange kinetics. None of the A-site cations shows observable valence changes during the reduction and oxidation processes. This research is a preliminary exploration of the use of an A-site high-entropy perovskite oxide for STCH.
  5. Machine learning (ML) accelerates the exploration of material properties and their links to the structure of the underlying molecules. In previous work [Shi et al. ACS Applied Materials & Interfaces 2022, 14, 37161−37169.], ML models were applied to predict the adhesive free energy of polymer–surface interactions with high accuracy from sequence data, demonstrating success in the inverse design of polymer sequences for known surface compositions. While the method was shown to be successful in designing polymers for a known surface, extensive data sets were needed for each specific surface in order to train the surrogate models. Ideally, one should be able to infer information about similar surfaces without having to regenerate a full complement of adhesion data for each new case. In the current work, we demonstrate a transfer learning (TL) technique using a deep neural network to improve the accuracy of ML models trained on small data sets by pretraining on a larger database from a related system and fine-tuning the weights of all layers with a small amount of additional data. The shared knowledge from the pretrained model significantly improves prediction accuracy on small data sets. We also explore the limits of database size on accuracy and the optimal tuning of network architecture and parameters for our learning tasks. While applied to a relatively simple coarse-grained (CG) polymer model, the general lessons of this study apply to detailed modeling studies and the broader problems of inverse materials design.