

Title: Predicting glass structure by physics-informed machine learning
Abstract

Machine learning (ML) is emerging as a powerful tool to predict the properties of materials, including glasses. Informing ML models with knowledge of how glass composition affects short-range atomic structure has the potential to enhance the ability of composition-property models to extrapolate accurately outside of their training sets. Here, we introduce an approach wherein statistical mechanics informs an ML model that can predict the non-linear composition-structure relations in oxide glasses. This combined model offers improved predictions compared to models relying solely on statistical physics or machine learning individually. Specifically, we show that the combined model accurately interpolates and extrapolates the structure of Na2O–SiO2 glasses. Importantly, the model can extrapolate beyond its training set, as evidenced by its ability to predict the structure of a glass series that was kept fully hidden from the model during training.
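
A minimal sketch of the general pattern behind such a combined model, assuming a physics baseline plus a learned correction: a toy Boltzmann-weighted speciation estimate stands in for the statistical-mechanics prediction, and a small kernel-ridge model is fit to the residuals. The speciation formula, the enthalpy parameter, and the data points below are illustrative assumptions, not the published model or data.

```python
# Sketch: statistical-mechanics baseline for Q3 speciation in Na2O-SiO2 glass,
# corrected by an ML model fit to residuals. All values are placeholders.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def stat_mech_q3_fraction(x_na2o, dH=-30e3, T_f=750.0):
    """Toy Boltzmann-weighted estimate of the Q3 fraction vs. Na2O mole fraction.

    Each Na2O converts bridging to non-bridging oxygens; the weight exp(-dH/RT_f)
    biases how modifiers distribute among Q^n species at the fictive temperature T_f.
    This is a crude illustrative partitioning, not the published enthalpy-entropy model.
    """
    R = 8.314
    w = np.exp(-dH / (R * T_f))
    nbo_per_si = 2.0 * x_na2o / (1.0 - x_na2o)
    return np.clip(nbo_per_si * w / (1.0 + w), 0.0, 1.0)

# hypothetical training data: (mole fraction Na2O, measured Q3 fraction)
x_train = np.array([[0.15], [0.20], [0.25], [0.30], [0.33]])
y_train = np.array([0.28, 0.41, 0.52, 0.60, 0.64])

# physics baseline + ML correction learned on the residuals
baseline = stat_mech_q3_fraction(x_train[:, 0])
correction = KernelRidge(kernel="rbf", alpha=1e-2, gamma=10.0)
correction.fit(x_train, y_train - baseline)

# prediction for a new composition combines both terms
x_new = np.array([[0.28]])
print(stat_mech_q3_fraction(x_new[:, 0]) + correction.predict(x_new))
```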

Award ID(s):
1826420 1944510 1928538
NSF-PAR ID:
10370667
Author(s) / Creator(s):
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
npj Computational Materials
Volume:
8
Issue:
1
ISSN:
2057-3960
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Borates and borosilicates are potential candidates for the design and development of glass formulations with important industrial and technological applications. A major challenge that slows the development of borate/borosilicate-based glasses using predictive modeling is the lack of reliable computational models to predict the structure-property relationships in these glasses over a wide compositional space. A major hindrance in this pursuit has been the complexity of boron-oxygen bonding, which has made it difficult to develop adequate B–O interatomic potentials. In this article, we have evaluated the performance of three B–O interatomic potential models recently developed by Bauchy et al. [J. Non-Cryst. Solids, 2018, 498, 294–304], Du et al. [J. Am. Ceram. Soc., https://doi.org/10.1111/jace.16082], and Edén et al. [Phys. Chem. Chem. Phys., 2018, 20, 8192–8209], aiming to reproduce the short-to-medium range structures of sodium borosilicate glasses in the system 25Na2O–xB2O3–(75 − x)SiO2 (x = 0–75 mol%). To evaluate the different force fields, we have computed, at the density functional theory level, the NMR parameters of 11B, 23Na, and 29Si for the models generated with the three potentials and compared the simulated MAS NMR spectra with their experimental counterparts. It was observed that the rigid ionic models proposed by Bauchy and Du can both reliably reproduce the partitioning between BO3 and BO4 species of the investigated glasses, along with the local environment around sodium in the glass structure. However, they do not accurately reproduce the second coordination sphere of silicon ions or the Si–O–T (T = Si, B) and B–O–T angle distributions in the investigated compositional space, which strongly affect the NMR parameters and final spectral shape. On the other hand, the core-shell parameterization model proposed by Edén underestimates the fraction of BO4 species for the glass with composition 25Na2O·18.4B2O3·56.6SiO2 but can accurately reproduce the shape of the 11B and 29Si MAS NMR spectra of the glasses investigated, owing to the narrower B–O–T and Si–O–T bond angle distributions. Finally, the effect of the number of boron atoms (also distinguishing the BO3 and BO4 units) in the second coordination sphere of the network-former cations on the NMR parameters has been evaluated.
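
    A minimal sketch of one step in such an evaluation, assuming the BO3/BO4 partitioning is extracted from simulated coordinates with a simple distance cutoff. The 1.85 Å B–O cutoff, the toy coordinates, and the function name are illustrative assumptions, not the analysis pipeline used in the paper.

```python
# Sketch: count BO3/BO4 units in a simulated structure via a B-O distance cutoff.
import numpy as np

def n4_fraction(boron_xyz, oxygen_xyz, box, cutoff=1.85):
    """Return N4 = BO4 / (BO3 + BO4) using minimum-image B-O distances (orthorhombic box)."""
    n4 = n3 = 0
    for b in boron_xyz:
        d = oxygen_xyz - b
        d -= box * np.round(d / box)          # minimum-image convention
        n_ox = int(np.sum(np.linalg.norm(d, axis=1) < cutoff))
        if n_ox >= 4:
            n4 += 1
        elif n_ox == 3:
            n3 += 1
    return n4 / max(n4 + n3, 1)

# tiny fabricated example: one 3-coordinated and one 4-coordinated boron
box = np.array([10.0, 10.0, 10.0])
boron = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
oxygen = np.array([[1.4, 0, 0], [0, 1.4, 0], [0, 0, 1.4],
                   [6.4, 5, 5], [5, 6.4, 5], [5, 5, 6.4], [5, 5, 3.6]])
print(n4_fraction(boron, oxygen, box))        # -> 0.5
```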

  2. Abstract

    Machine learning (ML) regression methods are promising tools to develop models predicting the properties of materials by learning from existing databases. However, although ML models are usually good at interpolating data, they often do not offer reliable extrapolations and can violate the laws of physics. Here, to address the limitations of traditional ML, we introduce a “topology-informed ML” paradigm—wherein some features of the network topology (rather than traditional descriptors) are used as the fingerprint for ML models—and apply this method to predict the forward (stage I) dissolution rate of a series of silicate glasses. We demonstrate that relying on a topological description of the atomic network (i) increases the accuracy of the predictions, (ii) enhances the simplicity and interpretability of the predictive models, (iii) reduces the need for large training sets, and (iv) improves the ability of the models to extrapolate predictions far from their training sets. As such, topology-informed ML can overcome the limitations facing traditional ML (e.g., the accuracy vs. simplicity tradeoff) and offers a promising route to predict the properties of materials in a robust fashion.
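
    A minimal sketch of the fingerprinting idea, assuming a single topological descriptor (here, the number of constraints per atom from topological constraint theory) as the feature of a simple regression for the dissolution rate. The descriptor values and rates below are fabricated placeholders, not data from the paper.

```python
# Sketch: a topological descriptor (constraints per atom, n_c) as the sole ML fingerprint.
import numpy as np
from sklearn.linear_model import LinearRegression

# hypothetical training set: (constraints per atom, log10 of forward dissolution rate)
n_c = np.array([[2.4], [2.7], [3.0], [3.3], [3.6]])
log_rate = np.array([-6.1, -6.9, -7.8, -8.6, -9.4])

model = LinearRegression().fit(n_c, log_rate)

# extrapolation to a more rigid (more constrained) network than any training glass
print(model.predict(np.array([[3.9]])))
```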

  3. ABSTRACT

    We develop a machine-learning (ML) algorithm that generates high-resolution thermal Sunyaev–Zeldovich (SZ) maps of novel galaxy clusters given only halo mass and mass accretion rate (MAR). The algorithm uses a conditional variational autoencoder (CVAE) in the form of a convolutional neural network and is trained with SZ maps generated from the IllustrisTNG simulation. Our method can reproduce many of the details of galaxy clusters that analytical models usually lack, such as internal structure and aspherical distribution of gas created by mergers, while achieving the same computational feasibility, allowing us to generate mock SZ maps for over 10^5 clusters in 30 s on a laptop. We show that the model is capable of generating novel clusters (i.e. not found in the training set) and that the model accurately reproduces the effects of mass and MAR on the SZ images, such as scatter, asymmetry, and concentration, in addition to modelling merging sub-clusters. This work demonstrates the viability of ML-based methods for producing the number of realistic, high-resolution maps of galaxy clusters necessary to achieve statistical constraints from future SZ surveys.
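
    A minimal sketch of the conditional-generation step, assuming a convolutional CVAE decoder that turns a latent vector plus the two conditioning scalars (halo mass and MAR) into an SZ map. The layer sizes, 64×64 map resolution, latent dimension, and conditioning values are illustrative choices, not those of the trained model.

```python
# Sketch: CVAE-style decoder conditioned on (halo mass, MAR), PyTorch.
import torch
import torch.nn as nn

class SZDecoder(nn.Module):
    def __init__(self, latent_dim=32, cond_dim=2):
        super().__init__()
        self.fc = nn.Linear(latent_dim + cond_dim, 128 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # 16x16
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # 32x32
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),               # 64x64
        )

    def forward(self, z, cond):
        # cond holds (log halo mass, mass accretion rate) for each cluster
        h = self.fc(torch.cat([z, cond], dim=1)).view(-1, 128, 8, 8)
        return self.deconv(h)

# sampling novel clusters: draw z from the prior and fix the desired mass and MAR
decoder = SZDecoder()
z = torch.randn(4, 32)
cond = torch.tensor([[14.5, 0.8]] * 4)     # hypothetical log10(M/Msun), MAR
maps = decoder(z, cond)                    # shape: (4, 1, 64, 64)
print(maps.shape)
```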

  4. Abstract

    Improving our fundamental understanding of complex heterocatalytic processes increasingly relies on electronic structure simulations and microkinetic models based on calculated energy differences. In particular, calculation of activation barriers, usually achieved through compute-intensive saddle point search routines, remains a serious bottleneck in understanding trends in catalytic activity for highly branched reaction networks. Although the well-known Brønsted–Evans–Polanyi (BEP) scaling – a one-feature linear regression model – has been widely applied in such microkinetic models, it still relies on calculated reaction energies and may not generalize beyond a single facet on a single class of materials, e.g., terrace sites on transition metals. For highly branched and energetically shallow reaction networks, such as electrochemical CO2 reduction or wastewater remediation, calculating even reaction energies on many surfaces can become computationally intractable due to the combinatorial explosion of states that must be considered. Here, we investigate the feasibility of activation barrier prediction without knowledge of the reaction energy using linear and nonlinear machine learning (ML) models trained on a new database of over 500 dehydrogenation activation barriers. We also find that inclusion of the reaction energy significantly improves both classes of ML models, but complex nonlinear models can achieve performance similar to the simplest BEP scaling when predicting activation barriers on new systems. Additionally, inclusion of the reaction energy significantly improves generalizability to new systems beyond the training set. Our results suggest that the reaction energy is a critical feature to consider when building models to predict activation barriers, indicating that efforts to reliably predict reaction energies through, e.g., the Open Catalyst Project and others, will be an important route to effective model development for more complex systems.
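
    A minimal sketch of the two model classes being contrasted, assuming BEP scaling as a one-feature linear fit of the barrier against the reaction energy, and a nonlinear model that also uses an additional descriptor. All numbers below are fabricated placeholders, not entries from the barrier database.

```python
# Sketch: BEP scaling (Ea ~ a*dE + b) vs. a nonlinear ML model with extra features.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 200
dE = rng.uniform(-1.5, 1.5, n)                    # reaction energy (eV), synthetic
d_band = rng.uniform(-4.0, 0.0, n)                # hypothetical extra descriptor (eV)
Ea = 0.7 * dE + 0.9 + 0.1 * d_band**2 + rng.normal(0, 0.05, n)  # toy barriers

# BEP scaling: activation barrier from the reaction energy alone
bep = LinearRegression().fit(dE.reshape(-1, 1), Ea)

# nonlinear ML model using the reaction energy plus an additional feature
X = np.column_stack([dE, d_band])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, Ea)

print(bep.coef_, bep.intercept_)
print(rf.predict([[0.3, -2.0]]))
```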

  5. Abstract

    Machine learning (ML) tools are able to learn relationships between the inputs and outputs of large complex systems directly from data. However, for time-varying systems, the predictive capabilities of ML tools degrade if the systems are no longer accurately represented by the data with which the ML models were trained. For complex systems, re-training is only possible if the changes are slow relative to the rate at which large numbers of new input-output training data can be non-invasively recorded. In this work, we present an approach to deep learning for time-varying systems that does not require re-training, but instead uses adaptive feedback in the architecture of deep convolutional neural networks (CNNs). The feedback is based only on available system output measurements and is applied in the encoded low-dimensional dense layers of the encoder-decoder CNNs. First, we develop an inverse model of a complex accelerator system to map output beam measurements to input beam distributions, while both the accelerator components and the unknown input beam distribution vary rapidly with time. We then demonstrate our method on experimental measurements of the input and output beam distributions of the HiRES ultra-fast electron diffraction (UED) beam line at Lawrence Berkeley National Laboratory, and showcase its ability to automatically track the time-varying photocathode quantum efficiency map. Our method can be successfully used to aid both physics and ML-based surrogate online models to provide non-invasive beam diagnostics.
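
    A minimal sketch of the feedback idea, assuming a frozen decoder and an adaptive update applied only to the low-dimensional encoded vector so that the decoded output tracks new measurements without re-training the network. The tiny decoder, measurement shapes, and learning rate are illustrative assumptions, not the HiRES implementation.

```python
# Sketch: adapt only the latent (encoded) vector of a frozen network via measured outputs.
import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 128))
decoder.requires_grad_(False)              # trained weights stay frozen

z = torch.zeros(1, 8, requires_grad=True)  # low-dimensional encoded state to be adapted
opt = torch.optim.Adam([z], lr=1e-2)

def feedback_step(measured_output):
    """One adaptive-feedback update: nudge the latent vector toward the measurement."""
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(z), measured_output)
    loss.backward()
    opt.step()
    return loss.item()

# as the system drifts, each new (non-invasive) output measurement updates z online
for _ in range(100):
    measurement = torch.randn(1, 128)      # placeholder for a real beam measurement
    feedback_step(measurement)
```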
