
Title: Predicting glass structure by physics-informed machine learning

Machine learning (ML) is emerging as a powerful tool to predict the properties of materials, including glasses. Informing ML models with knowledge of how glass composition affects short-range atomic structure has the potential to enhance the ability of composition-property models to extrapolate accurately outside of their training sets. Here, we introduce an approach wherein statistical mechanics informs an ML model that can predict the non-linear composition-structure relations in oxide glasses. This combined model offers an improved prediction compared to models relying solely on statistical physics or machine learning individually. Specifically, we show that the combined model accurately interpolates and extrapolates the structure of Na2O–SiO2 glasses. Importantly, the model extrapolates outside its training set, as evidenced by its ability to predict the structure of a glass series that was kept fully hidden from the model during its training.

Award ID(s):
1826420 1944510 1928538
Publisher / Repository:
Nature Publishing Group
Journal Name:
npj Computational Materials
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Borates and borosilicates are potential candidates for the design and development of glass formulations with important industrial and technological applications. A major challenge that retards the pace of development of borate/borosilicate based glasses using predictive modeling is the lack of reliable computational models to predict the structure‐property relationships in these glasses over a wide compositional space. A major hindrance in this pursuit has been the complexity of boron‐oxygen bonding, which has made it difficult to develop adequate B–O interatomic potentials. In this article, we have evaluated the performance of three B–O interatomic potential models recently developed by Bauchy et al. [J. Non‐Cryst. Solids, 2018, 498, 294–304], Du et al. [J. Am. Ceram. Soc.] and Edén et al. [Phys. Chem. Chem. Phys., 2018, 20, 8192–8209] aiming to reproduce the short‐to‐medium range structures of sodium borosilicate glasses in the system 25Na2O·xB2O3·(75 − x)SiO2 (x = 0–75 mol%). To evaluate the different force fields, we have computed, at the density functional theory level, the NMR parameters of 11B, 23Na, and 29Si for the models generated with the three potentials and compared the simulated MAS NMR spectra with their experimental counterparts. It was observed that the rigid ionic models proposed by Bauchy and Du can both reliably reproduce the partitioning between BO3 and BO4 species of the investigated glasses, along with the local environment around sodium in the glass structure. However, they do not accurately reproduce the second coordination sphere of silicon ions and the Si–O–T (T = Si, B) and B–O–T bond angle distributions in the investigated compositional space, which strongly affect the NMR parameters and final spectral shape. On the other hand, the core‐shell parameterization model proposed by Edén underestimates the fraction of BO4 species of the glass with composition 25Na2O·18.4B2O3·56.6SiO2 but can accurately reproduce the shape of the 11B and 29Si MAS‐NMR spectra of the investigated glasses due to the narrower B–O–T and Si–O–T bond angle distributions. Finally, the effect of the number of boron atoms (also distinguishing the BO3 and BO4 units) in the second coordination sphere of the network former cations on the NMR parameters has been evaluated.

  2. Abstract

    Machine learning (ML) regression methods are promising tools to develop models predicting the properties of materials by learning from existing databases. However, although ML models are usually good at interpolating data, they often do not offer reliable extrapolations and can violate the laws of physics. Here, to address the limitations of traditional ML, we introduce a “topology-informed ML” paradigm, wherein some features of the network topology (rather than traditional descriptors) are used as a fingerprint for ML models, and apply this method to predict the forward (stage I) dissolution rate of a series of silicate glasses. We demonstrate that relying on a topological description of the atomic network (i) increases the accuracy of the predictions, (ii) enhances the simplicity and interpretability of the predictive models, (iii) reduces the need for large training sets, and (iv) improves the ability of the models to extrapolate predictions far from their training sets. As such, topology-informed ML can overcome the limitations facing traditional ML (e.g., accuracy vs. simplicity tradeoff) and offers a promising route to predict the properties of materials in a robust fashion.

  3. Abstract

    We develop a machine-learning (ML) algorithm that generates high-resolution thermal Sunyaev–Zeldovich (SZ) maps of novel galaxy clusters given only halo mass and mass accretion rate (MAR). The algorithm uses a conditional variational autoencoder (CVAE) in the form of a convolutional neural network and is trained with SZ maps generated from the IllustrisTNG simulation. Our method can reproduce many of the details of galaxy clusters that analytical models usually lack, such as internal structure and aspherical distribution of gas created by mergers, while achieving the same computational feasibility, allowing us to generate mock SZ maps for over 10^5 clusters in 30 s on a laptop. We show that the model is capable of generating novel clusters (i.e. not found in the training set) and that the model accurately reproduces the effects of mass and MAR on the SZ images, such as scatter, asymmetry, and concentration, in addition to modelling merging sub-clusters. This work demonstrates the viability of ML-based methods for producing the number of realistic, high-resolution maps of galaxy clusters necessary to achieve statistical constraints from future SZ surveys.

  4. Abstract

    Machine learning (ML) tools are able to learn relationships between the inputs and outputs of large complex systems directly from data. However, for time-varying systems, the predictive capabilities of ML tools degrade if the systems are no longer accurately represented by the data with which the ML models were trained. For complex systems, re-training is only possible if the changes are slow relative to the rate at which large numbers of new input-output training data can be non-invasively recorded. In this work, we present an approach to deep learning for time-varying systems that does not require re-training, but instead uses adaptive feedback in the architecture of deep convolutional neural networks (CNNs). The feedback is based only on available system output measurements and is applied in the encoded low-dimensional dense layers of the encoder-decoder CNNs. First, we develop an inverse model of a complex accelerator system to map output beam measurements to input beam distributions, while both the accelerator components and the unknown input beam distribution vary rapidly with time. We then demonstrate our method on experimental measurements of the input and output beam distributions of the HiRES ultra-fast electron diffraction (UED) beam line at Lawrence Berkeley National Laboratory, and showcase its ability to automatically track the time-varying photocathode quantum efficiency map. Our method can be successfully used to aid both physics and ML-based surrogate online models to provide non-invasive beam diagnostics.

  5. The water content in the soil regulates exchanges between soil and atmosphere, impacts plant livelihood, and determines the antecedent condition for several natural hazards. Accurate soil moisture estimates are key to applications such as natural hazard prediction, agriculture, and water management. We explore how to best predict soil moisture at a high resolution in the context of a changing climate. Physics-based hydrological models are promising as they provide distributed soil moisture estimates and allow prediction outside the range of prior observations. This is particularly important considering that the climate is changing, and the available historical records are often too short to capture extreme events. Unfortunately, these models are extremely computationally expensive, which makes their use challenging, especially when dealing with strong uncertainties. These characteristics make them complementary to machine learning approaches, which rely on training data quality/quantity but are typically computationally efficient. We first demonstrate the ability of Convolutional Neural Networks (CNNs) to reproduce soil moisture fields simulated by the hydrological model ParFlow-CLM. Then, we show how these two approaches can be successfully combined to predict future droughts not seen in the historical timeseries. We do this by generating additional ParFlow-CLM simulations with altered forcing mimicking future drought scenarios. Comparing the performance of CNN models trained on historical forcing and CNN models trained also on simulations with altered forcing reveals the potential of combining these two approaches. The CNN can not only reproduce the moisture response to a given forcing but also learn and predict the impact of altered forcing. Given the uncertainties in projected climate change, we can create a limited number of representative ParFlow-CLM simulations (ca. 25 min/water year on 9 CPUs for our case study), train our CNNs, and use them to efficiently (seconds/water year on 1 CPU) predict additional water years/scenarios and improve our understanding of future drought potential. This framework allows users to explore scenarios beyond past observation and tailor the training data to their application of interest (e.g., wet conditions for flooding, dry conditions for drought). With the trained ML model they can rely on high resolution soil moisture estimates and explore the impact of uncertainties.
