The melting temperature is important for materials design because of its relationship with thermal stability, synthesis, and processing conditions. Current empirical and computational melting point estimation techniques are limited in scope, computational feasibility, or interpretability. We report the development of a machine learning methodology for predicting melting temperatures of binary ionic solid materials. We evaluated different machine-learning models trained on a dataset of the melting points of 476 non-metallic crystalline binary compounds using materials embeddings constructed from elemental properties and density-functional theory calculations as model inputs. A direct supervised-learning approach yields a mean absolute error of around 180 K but suffers from low interpretability. We find that the fidelity of predictions can further be improved by introducing an additional unsupervised-learning step that first classifies the materials before the melting-point regression. Not only does this two-step model exhibit improved accuracy, but the approach also provides a level of interpretability with insights into feature importance and different types of melting that depend on the specific atomic bonding inside a material. Motivated by this finding, we used a symbolic learning approach to find interpretable physical models for the melting temperature, which recovered the best-performing features from both prior models and provided additional interpretability.
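The two-step idea in the abstract above (group the materials without labels first, then fit a separate regression per group) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the clustering or regression algorithms, so 1-D k-means and ordinary least squares stand in for them, the names `bond_descriptor`, `feature`, and `t_melt` are hypothetical, and the numbers are synthetic rather than measured melting points.

```python
from statistics import mean

def kmeans_1d(values, k=2, iters=20):
    """Tiny 1-D k-means (k >= 2); centroids start at evenly spaced sorted values."""
    s = sorted(values)
    centroids = [s[i * (len(s) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[assign(v, centroids)].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return centroids

def assign(v, centroids):
    """Index of the nearest centroid."""
    return min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx

def mae(pred, true):
    """Mean absolute error."""
    return mean(abs(p - t) for p, t in zip(pred, true))

# Synthetic toy data (NOT real materials): two bonding "types" with different trends.
bond_descriptor = [0.1, 0.2, 0.3, 2.0, 2.1, 2.2]   # hypothetical clustering input
feature = [1, 2, 3, 1, 2, 3]                       # hypothetical regression input
t_melt = [600, 700, 800, 1800, 1600, 1400]         # synthetic targets, arbitrary units

# Step 1: unsupervised grouping of the samples.
centroids = kmeans_1d(bond_descriptor, k=2)
labels = [assign(d, centroids) for d in bond_descriptor]

# Step 2: one regression per cluster.
models = {}
for c in set(labels):
    xs = [x for x, l in zip(feature, labels) if l == c]
    ys = [y for y, l in zip(t_melt, labels) if l == c]
    models[c] = fit_line(xs, ys)

def predict(d, x):
    """Route a sample to its cluster, then apply that cluster's line."""
    a, b = models[assign(d, centroids)]
    return a * x + b

# Baseline for comparison: a single regression over all samples.
ga, gb = fit_line(feature, t_melt)
mae_global = mae([ga * x + gb for x in feature], t_melt)
mae_two_step = mae([predict(d, x) for d, x in zip(bond_descriptor, feature)], t_melt)
```

On this contrived data the per-cluster fits recover each trend, while the single global line averages the two trends together, which is the qualitative effect the abstract attributes to classifying materials before regression. The per-cluster slopes also show how such a model can expose different behavior for different groups.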
Melting temperature prediction using a graph neural network model: From ancient minerals to new materials
The melting point is a fundamental property that is time-consuming to measure or compute, thus hindering high-throughput analyses of melting relations and phase diagrams over large sets of candidate compounds. To address this, we build a machine learning model, trained on a database of ∼10,000 compounds, that can predict the melting temperature in a fraction of a second. The model, made publicly available online, features graph neural network and residual neural network architectures. We demonstrate the model’s usefulness in diverse applications. For the purpose of materials design and discovery, we show that it can quickly discover novel multicomponent materials with high melting points. These predictions are confirmed by density functional theory calculations and experimentally validated. In an application to planetary science and geology, we employ the model to analyze the melting temperatures of ∼4,800 minerals to uncover correlations relevant to the study of mineral evolution.
- PAR ID: 10356782
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 119
- Issue: 36
- ISSN: 0027-8424
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Ultrahigh temperature ceramics (UHTCs) have melting points above 3000°C and outstanding strength at high temperatures, making them apposite structural materials for high-temperature applications. Diboride, nitride, and carbide compounds, processed via various techniques, have been extensively studied and used in the manufacture of UHTCs. Current analytical models, based on our current but incomplete understanding of the theory, are unable to produce a priori predictions of the mechanical properties of UHTCs from their mixture designs and processing parameters. As a result, researchers have to rely on experiments, which are often costly and time-consuming, to understand composition–structure–performance links in UHTCs. This study employs machine learning (ML) models (i.e., random forest and artificial neural network models) to predict Young's modulus, flexural strength, and fracture toughness of UHTCs in relation to a wide range of mixture designs, processing parameters, and testing conditions. Outcomes demonstrate that adequately trained ML models can yield reliable a priori predictions of these three mechanical properties. The prediction performance for Young's modulus is superior to that for flexural strength and fracture toughness. Next, the ML model with the best prediction performance is used to evaluate and rank the impacts of input variables on Young's modulus. Finally, on the basis of this classification of consequential and inconsequential input variables, this study develops an easy-to-use, closed-form analytical model to predict Young's modulus of UHTCs. Overall, this study highlights the ability of data-driven numerical models to complement, or even replace, time-consuming experiments, thereby accelerating the development of UHTCs.
-
Gas-particle partitioning of secondary organic aerosols (SOA) is impacted by particle phase state and viscosity, which can be inferred from the glass transition temperature (Tg) of the constituent organic compounds. Several parametrizations were developed to predict Tg of organic compounds based on molecular properties and elemental composition, but they are subject to relatively large uncertainties because they do not account for molecular structure and functionality. Here we develop a new Tg prediction method powered by machine learning and “molecular embeddings”, which are unique numerical representations of chemical compounds that retain information on their structure, interatomic connectivity, and functionality. We have trained multiple state-of-the-art machine learning models on databases of experimental Tg of organic compounds and their corresponding molecular embeddings. The best prediction model is the tgBoost model, built with an Extreme Gradient Boosting (XGBoost) regressor trained via a nested cross-validation method, which reproduces the experimental data with a mean absolute error of 18.3 K. It can also quantify the influence of the number and location of functional groups on the Tg of organic molecules, while accounting for atom connectivity and predicting different Tg for compositional isomers. The tgBoost model suggests the following trend for sensitivity of Tg to functional group addition: –COOH (carboxylic acid) > –C(O)OR (ester) ≈ –OH (alcohol) > –C(O)R (ketone) ≈ –COR (ether) ≈ –C(O)H (aldehyde). We also developed a model to predict the melting point (Tm) of organic compounds by training a deep neural network on a large dataset of experimental Tm. The model performs reasonably well against the available dataset, with a mean absolute error of 31.0 K. These new machine-learning-powered models can be applied to field and laboratory measurements as well as atmospheric aerosol models to predict the Tg and Tm of SOA compounds for evaluation of the phase state and viscosity of SOA.
-
The process of developing new compounds and materials is increasingly driven by computational modeling and simulation, which allow us to characterize candidates before pursuing them in the laboratory. One of the non-trivial properties of interest for organic materials is their packing in the bulk, which is highly dependent on their molecular structure. By controlling the latter, we can realize materials with a desired density (as well as other target properties). Molecular dynamics simulations are a popular and reasonably accurate way to compute the bulk density of molecules; however, because these calculations are computationally intensive, they are not a practically viable option for high-throughput screening studies that assess material candidates on a massive scale. In this work, we employ machine learning to develop a data-derived prediction model that is an alternative to physics-based simulations, and we use it for the hyperscreening of 1.5 million small organic molecules as well as to gain insights into the relationship between structural makeup and packing density. We also use this study to analyze the learning curve of the employed neural network approach and gain empirical data on the dependence of model performance on training data size, which will inform future investigations.
-
Different mechanisms are used for the discovery of materials. These include creating a material by a trial-and-error process without knowing its properties. Other methods are based on computational simulations or mathematical and statistical approaches, such as Density Functional Theory (DFT). A well-known strategy combines elements to predict their properties and selects a set of those with the properties of interest. Carrying out exhaustive calculations to predict the properties of the compounds found in this way may incur a high computational cost. Therefore, there is a need for methods that identify materials with a desired set of properties while reducing the search space and, consequently, the computational cost. In this work, we present a genetic algorithm that can find a higher percentage of compounds with specific properties than state-of-the-art methods, such as those based on combinatorial screening. Both methods are compared in the search for ternary compounds in an unconstrained space, using a Deep Neural Network (DNN) to predict properties such as formation enthalpy, band gap, and stability; we focus on formation enthalpy. As a result, we provide a genetic algorithm capable of finding up to 60% more compounds with atypical property values, using DNNs for their prediction.

