Various machine learning models have been used to predict the properties of polycrystalline materials, but none of them directly consider the physical interactions among neighboring grains despite such microscopic interactions critically determining macroscopic material properties. Here, we develop a graph neural network (GNN) model for obtaining an embedding of polycrystalline microstructure which incorporates not only the physical features of individual grains but also their interactions. The embedding is then linked to the target property using a feed-forward neural network. Using the magnetostriction of polycrystalline Tb0.3Dy0.7Fe2alloys as an example, we show that a single GNN model with fixed network architecture and hyperparameters allows for a low prediction error of ~10% over a group of remarkably different microstructures as well as quantifying the importance of each feature in each grain of a microstructure to its magnetostriction. Such a microstructure-graph-based GNN model, therefore, enables an accurate and interpretable prediction of the properties of polycrystalline materials.
- Award ID(s):
- 1905775
- PAR ID:
- 10291048
- Date Published:
- Journal Name:
- Physical Chemistry Chemical Physics
- Volume:
- 22
- Issue:
- 32
- ISSN:
- 1463-9076
- Page Range / eLocation ID:
- 18141 to 18148
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract -
Abstract In real-world materials research, machine learning (ML) models are usually expected to predict and discover novel exceptional materials that deviate from the known materials. It is thus a pressing question to provide an objective evaluation of ML model performances in property prediction of out-of-distribution (OOD) materials that are different from the training set. Traditional performance evaluation of materials property prediction models through the random splitting of the dataset frequently results in artificially high-performance assessments due to the inherent redundancy of typical material datasets. Here we present a comprehensive benchmark study of structure-based graph neural networks (GNNs) for extrapolative OOD materials property prediction. We formulate five different categories of OOD ML problems for three benchmark datasets from the MatBench study. Our extensive experiments show that current state-of-the-art GNN algorithms significantly underperform for the OOD property prediction tasks on average compared to their baselines in the MatBench study, demonstrating a crucial generalization gap in realistic material prediction tasks. We further examine the latent physical spaces of these GNN models and identify the sources of CGCNN, ALIGNN, and DeeperGATGNN’s significantly more robust OOD performance than those of the current best models in the MatBench study (coGN and coNGN) as a case study for the perovskites dataset, and provide insights to improve their performance.
-
null (Ed.)Physical interactions of proteins play key functional roles in many important cellular processes. To understand molecular mechanisms of such functions, it is crucial to determine the structure of protein complexes. To complement experimental approaches, which usually take a considerable amount of time and resources, various computational methods have been developed for predicting the structures of protein complexes. In computational modeling, one of the challenges is to identify near-native structures from a large pool of generated models. Here, we developed a deep learning–based approach named Graph Neural Network–based DOcking decoy eValuation scorE (GNN-DOVE). To evaluate a protein docking model, GNN-DOVE extracts the interface area and represents it as a graph. The chemical properties of atoms and the inter-atom distances are used as features of nodes and edges in the graph, respectively. GNN-DOVE was trained, validated, and tested on docking models in the Dockground database and further tested on a combined dataset of Dockground and ZDOCK benchmark as well as a CAPRI scoring dataset. GNN-DOVE performed better than existing methods, including DOVE, which is our previous development that uses a convolutional neural network on voxelized structure models.more » « less
-
Graph Neural Networks (GNNs) extend the success of neural networks to graph-structured data by accounting for their intrinsic geometry. While extensive research has been done on developing GNN models with superior performance according to a collection of graph representation learning benchmarks, it is currently not well understood what aspects of a given model are probed by them. For example, to what extent do they test the ability of a model to leverage graph structure vs. node features? Here, we develop a principled approach to taxonomize benchmarking datasets according to a \textdollar \textit{sensitivity profile}\textdollar that is based on how much GNN performance changes due to a collection of graph perturbations. Our data-driven analysis provides a deeper understanding of which benchmarking data characteristics are leveraged by GNNs. Consequently, our taxonomy can aid in selection and development of adequate graph benchmarks, and better informed evaluation of future GNN methods. Finally, our approach is designed to be extendable to multiple graph prediction task types and future datasets.more » « less
-
Graph Neural Networks (GNNs) are powerful machine learning prediction models on graph-structured data. However, GNNs lack rigorous uncertainty estimates, limiting their reliable deployment in settings where the cost of errors is significant. We propose conformalized GNN (CF-GNN), extending conformal prediction (CP) to graph-based models for guaranteed uncertainty estimates. Given an entity in the graph, CF-GNN produces a prediction set/interval that provably contains the true label with pre-defined coverage probability (e.g. 90%). We establish a permutation invariance condition that enables the validity of CP on graph data and provide an exact characterization of the test-time coverage. Besides valid coverage, it is crucial to reduce the prediction set size/interval length for practical use. We observe a key connection between non-conformity scores and network structures, which motivates us to develop a topology-aware output correction model that learns to update the prediction and produces more efficient prediction sets/intervals. Extensive experiments show that CF-GNN achieves any pre-defined target marginal coverage while significantly reducing the prediction set/interval size by up to 74% over the baselines. It also empirically achieves satisfactory conditional coverage over various raw and network features.more » « less