In real-world materials research, machine learning (ML) models are usually expected to predict and discover novel, exceptional materials that deviate from known ones. An objective evaluation of ML model performance in predicting the properties of out-of-distribution (OOD) materials, which differ from those in the training set, is therefore a pressing need. Traditional evaluation of materials property prediction models through random splitting of the dataset frequently yields artificially high performance estimates because typical materials datasets are inherently redundant. Here we present a comprehensive benchmark study of structure-based graph neural networks (GNNs) for extrapolative OOD materials property prediction. We formulate five categories of OOD ML problems for three benchmark datasets from the MatBench study. Our extensive experiments show that current state-of-the-art GNN algorithms, on average, significantly underperform on the OOD property prediction tasks compared with their baselines in the MatBench study, demonstrating a crucial generalization gap in realistic materials prediction tasks. As a case study on the perovskites dataset, we further examine the latent physical spaces of these GNN models, identify why CGCNN, ALIGNN, and DeeperGATGNN show significantly more robust OOD performance than the current best models in the MatBench study (coGN and coNGN), and provide insights for improving their performance.
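The contrast between random splitting and OOD evaluation can be illustrated with a minimal sketch. It is not the benchmark's actual protocol: the abstract does not specify the five OOD categories, so the group labels, the helper names `random_split` and `leave_one_group_out`, and the toy formulas below are all assumptions made for illustration. The idea shown is simply that an entire group of similar materials is withheld from training, so test samples cannot have near-duplicates in the training set.

```python
import random
from collections import defaultdict


def random_split(items, test_fraction=0.2, seed=0):
    """Conventional random split: test items tend to resemble training items."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def leave_one_group_out(items, group_of):
    """Yield (held_out_group, train, test), holding out one whole group per fold."""
    groups = defaultdict(list)
    for item in items:
        groups[group_of(item)].append(item)
    for held_out in sorted(groups):
        test = groups[held_out]
        train = [x for g, members in groups.items() if g != held_out for x in members]
        yield held_out, train, test


# Toy data: (formula, hypothetical group id). The grouping is illustrative only.
materials = [
    ("SrTiO3", 0), ("BaTiO3", 0), ("CaTiO3", 0),
    ("LaAlO3", 1), ("NdAlO3", 1),
    ("KNbO3", 2),
]

folds = list(leave_one_group_out(materials, group_of=lambda m: m[1]))
for held_out, train, test in folds:
    print(f"held-out group {held_out}: {len(train)} train, {len(test)} test")
```

In practice the group labels might come from clustering structures or compositions, or from binning the target property; which choice is appropriate depends on the kind of extrapolation being tested.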
Thermoelectric materials harvest waste heat and convert it into reusable electricity. Thermoelectrics are also widely used in the inverse way, for example in refrigerators and for cooling electronics. However, most of the popular thermoelectric materials known to date were proposed and found by intuition, mostly through experiments. Unfortunately, synthesizing materials and measuring their thermoelectric properties through trial-and-error experiments is extremely time- and resource-consuming. Here, we develop a convolutional neural network (CNN) classification model that uses the fused orbital field matrix and composition descriptors to screen a large pool of materials and discover new thermoelectric candidates with a power factor higher than 10 μW/(cm K²). The model was trained on our own data, generated by high-throughput density functional theory calculations coupled with the ab initio scattering and transport package to obtain electronic transport properties without assuming a constant electron relaxation time, which yields more reliable electronic transport properties than previous studies. The classification model was also compared with traditional machine learning algorithms such as gradient boosting and random forest. We deployed the classification model on 3465 cubic, dynamically stable structures with non-zero bandgap screened from the Open Quantum Materials Database. We identified many high-performance thermoelectric materials with ZT > 1 or close to 1 across a wide temperature range from 300 to 700 K, for both n- and p-type doping and at different doping concentrations. Moreover, our feature importance and maximal information coefficient analyses reveal two previously unreported material descriptors, namely mean melting temperature and low average deviation of electronegativity, that are strongly correlated with power factor and thus provide a new route for quickly screening potential thermoelectrics with a high success rate.
Our deep CNN model with fused orbital field matrix and composition descriptors is very promising for screening high power factor thermoelectrics from large-scale hypothetical structures.
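The two screening quantities mentioned above, power factor and ZT, follow from the standard definitions PF = S²σ and ZT = S²σT/κ. The sketch below only shows the unit handling behind a threshold such as 10 μW/(cm K²). The function names and the input values are hypothetical and are not data from the study.

```python
def power_factor_uW_per_cm_K2(seebeck_uV_per_K, sigma_S_per_cm):
    """Power factor PF = S^2 * sigma, returned in uW/(cm K^2)."""
    seebeck_V_per_K = seebeck_uV_per_K * 1e-6
    pf_W_per_cm_K2 = seebeck_V_per_K ** 2 * sigma_S_per_cm
    return pf_W_per_cm_K2 * 1e6


def figure_of_merit(pf_uW_per_cm_K2, temperature_K, kappa_W_per_m_K):
    """Dimensionless ZT = PF * T / kappa, with kappa the total thermal conductivity."""
    # Convert uW/(cm K^2) to W/(m K^2): 1e-6 W per uW, 100 cm per m.
    pf_W_per_m_K2 = pf_uW_per_cm_K2 * 1e-6 * 100.0
    return pf_W_per_m_K2 * temperature_K / kappa_W_per_m_K


# Hypothetical inputs chosen for round numbers, not taken from the paper.
pf = power_factor_uW_per_cm_K2(seebeck_uV_per_K=200.0, sigma_S_per_cm=1000.0)
zt = figure_of_merit(pf, temperature_K=300.0, kappa_W_per_m_K=1.2)
is_high_pf = pf > 10.0  # the classification threshold quoted in the abstract

print(f"PF = {pf:.1f} uW/(cm K^2), ZT = {zt:.2f}, high-PF class: {is_high_pf}")
```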
Free, publicly-accessible full text available June 1, 2025 -
The discovery of advanced thermal materials with exceptional phonon properties drives technological advancements, impacting innovations from electronics to superconductors. Understanding the intricate relationship between composition, structure, and phonon thermal transport properties is crucial for speeding up such discovery. Exploring innovative materials involves navigating vast design spaces and considering chemical and structural factors on multiple scales and modalities. Artificial intelligence (AI) is transforming science and engineering and is poised to transform discovery and innovation. This era offers a unique opportunity to establish a new paradigm for the discovery of advanced materials by leveraging databases, simulations, and accumulated knowledge, venturing into experimental frontiers, and incorporating cutting-edge AI technologies. In this perspective, the general approach of density functional theory (DFT) coupled with the phonon Boltzmann transport equation (BTE) for predicting comprehensive phonon properties is reviewed first. Then, as a way to circumvent the extremely computationally demanding DFT + BTE approach, early studies and progress in deploying AI/machine learning (ML) models for phonon thermal transport, in the context of structure–phonon property relationship prediction, are presented along with their limitations. Finally, a summary of current challenges and an outlook on future trends are given. Further development of AI/ML algorithms for phonon thermal transport could range from phonon database construction, to training of universal machine learning potentials, to inverse design of materials with target phonon properties, and to the extension of ML models beyond traditional phonons.
Free, publicly-accessible full text available May 7, 2025 -
Prediction of crystal structures with desirable material properties is a grand challenge in materials research. We deployed a graph-theory-assisted structure searcher combined with universal machine learning potentials to accelerate the process.
Free, publicly-accessible full text available April 2, 2025 -
Crystal structure prediction using neural network potential and age-fitness Pareto genetic algorithm
Crystal structure prediction (CSP) remains a longstanding challenge. We introduce ParetoCSP, a novel CSP algorithm that combines a multi-objective genetic algorithm (GA) with a neural network inter-atomic potential model to find energetically optimal crystal structures for given chemical compositions. We enhance the updated multi-objective GA (NSGA-III) by incorporating the genotypic age as an independent optimization criterion and employ the M3GNet universal inter-atomic potential to guide the GA search. Compared with GN-OA, a state-of-the-art neural potential-based CSP algorithm, ParetoCSP demonstrated significantly better predictive capabilities, outperforming it by a factor of 2.562 across 55 diverse benchmark structures, as evaluated by seven performance metrics. Trajectory analysis of the structures traversed by all algorithms shows that ParetoCSP generated more valid structures than the other algorithms, which helped guide the GA to search more effectively for the optimal structures. Our implementation code is available at https://github.com/sadmanomee/ParetoCSP.
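The idea of treating genotypic age as an independent objective alongside energy can be sketched with a plain Pareto (non-dominated) filter. This is not the ParetoCSP implementation: NSGA-III additionally uses reference-point-based selection, and the real energies come from M3GNet. The `Candidate` type, the helper functions, and the numbers below are hypothetical and serve only to show why a young but not yet optimized structure can survive selection.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    energy: float  # hypothetical predicted energy per atom, lower is better
    age: int       # generations since the lineage entered the population, lower is younger


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse in both objectives and strictly better in one."""
    no_worse = a.energy <= b.energy and a.age <= b.age
    strictly_better = a.energy < b.energy or a.age < b.age
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [c for c in population if not any(dominates(o, c) for o in population)]


population = [
    Candidate("A", -3.20, 12),  # lowest energy, but an old lineage
    Candidate("B", -3.05, 3),
    Candidate("C", -2.90, 5),   # dominated by B (higher energy and older)
    Candidate("D", -2.50, 0),   # newly introduced, kept despite its high energy
    Candidate("E", -3.20, 15),  # dominated by A (same energy, older)
]

front = pareto_front(population)
print([c.name for c in front])  # ['A', 'B', 'D']
```

Keeping candidate D on the front is the point of the age objective: new genetic material is protected from immediate elimination by older, already refined structures.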
Free, publicly-accessible full text available March 2, 2025 -
PbAuGa and CsKNa possess record-low lattice thermal conductivity, comparable even to that of air. The loosely bonded Au and Cs atoms in PbAuGa and CsKNa, respectively, act as intrinsic rattlers and thus induce strong phonon anharmonicity.
Free, publicly-accessible full text available November 16, 2024 -
Using dual machine learning models, we identified 3218 inorganic crystals with ultralow lattice thermal conductivity (LTC), which will be of great interest for technologically important applications such as thermal insulators and thermoelectrics.
Free, publicly-accessible full text available November 14, 2024