Title: Generative Deep Neural Networks for Inverse Materials Design Using Backpropagation and Active Learning
Abstract

In recent years, machine learning (ML) techniques have emerged as promising tools for discovering and designing novel materials. However, the lack of robust inverse design approaches that can identify promising candidate materials without exploring the entire design space remains a fundamental bottleneck. A general‐purpose inverse design approach is presented using generative inverse design networks. This ML‐based approach uses backpropagation to calculate the analytical gradients of an objective function with respect to the design variables. Because backpropagation provides gradient information rapidly, millions of optimizations can be run from different initial values, which allows the approach to overcome local minima traps. Furthermore, an active learning strategy is adopted to improve the performance of candidate materials and to reduce the amount of training data needed to do so. Compared to passive learning, the active learning strategy generates better designs and reduces the amount of training data by at least an order of magnitude in the case study on composite materials. The inverse design approach is compared with conventional gradient‐based topology optimization and gradient‐free genetic algorithms, and the pros and cons of each method are discussed when applied to materials discovery and design problems.
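
The core mechanism described in the abstract (backpropagating an objective to the design variables, then repeating the optimization from many initial values) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the tiny frozen surrogate network and its weights `W1`, `B1`, `W2`, `B2`, the two design variables bounded to [0, 1], the step size, and the use of 50 random starts rather than millions.

```python
import math
import random

# Hypothetical frozen surrogate: 2 design variables -> 3 tanh units -> 1 property.
# In practice these weights would come from a trained predictor; here they are made up.
W1 = [[1.5, -2.0], [-1.0, 2.5], [2.0, 1.0]]
B1 = [0.1, -0.3, 0.2]
W2 = [1.0, 0.8, -0.6]
B2 = 0.0

def forward(x):
    """Return the predicted property and the hidden activations for design x."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, B1)]
    y = sum(w * hj for w, hj in zip(W2, h)) + B2
    return y, h

def grad_wrt_design(x):
    """Backpropagate dy/dx through the frozen network using the chain rule."""
    _, h = forward(x)
    # dy/dz_j = W2[j] * (1 - tanh(z_j)^2), then dz_j/dx_i = W1[j][i]
    delta = [w * (1.0 - hj * hj) for w, hj in zip(W2, h)]
    return [sum(d * row[i] for d, row in zip(delta, W1)) for i in range(len(x))]

def clip(v, lo=0.0, hi=1.0):
    """Keep a design variable inside its assumed bounds."""
    return max(lo, min(hi, v))

def optimize(x0, steps=200, lr=0.05):
    """Projected gradient ascent on the design variables; weights stay fixed."""
    x = list(x0)
    for _ in range(steps):
        g = grad_wrt_design(x)
        x = [clip(xi + lr * gi) for xi, gi in zip(x, g)]
    return x, forward(x)[0]

def multi_start(n_starts=50, seed=0):
    """Run the optimization from many random initial designs and keep the best."""
    rng = random.Random(seed)
    results = [optimize([rng.random(), rng.random()]) for _ in range(n_starts)]
    return max(results, key=lambda r: r[1])

best_x, best_y = multi_start()
print("best design:", best_x, "predicted property:", best_y)
```

Each start is independent of the others, so the runs could be batched or parallelized, which is one way cheap gradients make a very large number of restarts practical.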

 
NSF-PAR ID:
10458279
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Advanced Science
Volume:
7
Issue:
5
ISSN:
2198-3844
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Molecules composed of atoms exhibit properties not inherent to their constituent atoms. Similarly, metamolecules consisting of multiple meta‐atoms possess emerging features that the meta‐atoms themselves do not possess. Metasurfaces composed of metamolecules with spatially variant building blocks, such as gradient metasurfaces, are drawing substantial attention due to their unconventional controllability of the amplitude, phase, and frequency of light. However, the intricate mechanisms and the large degrees of freedom of the multielement systems impede an effective strategy for the design and optimization of metamolecules. Here, a hybrid artificial‐intelligence‐based framework consolidating compositional pattern‐producing networks and cooperative coevolution to resolve the inverse design of metamolecules in metasurfaces is proposed. The framework breaks the design of the metamolecules into separate designs of meta‐atoms, and independently solves the smaller design tasks of the meta‐atoms through deep learning and evolutionary algorithms. The proposed framework is leveraged to design metallic metamolecules for arbitrary manipulation of the polarization and wavefront of light. Moreover, the efficacy and reliability of the design strategy are confirmed through experimental validations. This framework reveals a promising candidate approach to expedite the design of large‐scale metasurfaces in a labor‐saving, systematic manner.

     
  2. Machine-learning (ML) approaches have proven to be of great utility in modern materials innovation pipelines. Generally, ML models are trained on predetermined past data and then used to make predictions for new test cases. Active-learning, however, is a paradigm in which ML models can direct the learning process itself through providing dynamic suggestions/queries for the “next-best experiment.” In this work, the authors demonstrate how an active-learning framework can aid in the discovery of polymers possessing high glass transition temperatures (Tg). Starting from an initial small dataset of polymer Tg measurements, the authors use Gaussian process regression in conjunction with an active-learning framework to iteratively add Tg measurements of candidate polymers to the training dataset. The active-learning framework employs one of three decision making strategies (exploitation, exploration, or balanced exploitation/exploration) for selection of the “next-best experiment.” The active-learning workflow terminates once 10 polymers possessing a Tg greater than a certain threshold temperature are selected. The authors statistically benchmark the performance of the aforementioned three strategies (against a random selection approach) with respect to the discovery of high-Tg polymers for this particular demonstrative materials design challenge.
  3. Abstract

    Machine learning provides a promising platform for both forward modeling and the inverse design of photonic structures. Relying on a data-driven approach, machine learning is especially appealing for situations when it is not feasible to derive an analytical solution for a complex problem. There has been a great amount of recent interest in constructing machine learning models suitable for different electromagnetic problems. In this work, we adapt a region-specified design approach for the inverse design of multilayered nanoparticles. Given the high computational cost of dataset generation for electromagnetic problems, we specifically investigate the case of a small training dataset, enhanced via random region specification in an inverse convolutional neural network. The trained model is used to design nanoparticles with high absorption levels and different ratios of absorption over scattering. The central design wavelength is shifted across 350–700 nm without re-training. We discuss the implications of wavelength, particle size, and the training dataset size on the performance of the model. Our approach may find interesting applications in the design of multilayer nanoparticles for biological, chemical, and optical applications as well as the design of low-scattering absorbers and antennas.

     
  4. Abstract

    Objective. Spiking neural networks (SNNs) are powerful tools that are well suited for brain machine interfaces (BMI) due to their similarity to biological neural systems and computational efficiency. They have shown comparable accuracy to state-of-the-art methods, but current training methods require large amounts of memory, and they cannot be trained on a continuous input stream without pausing periodically to perform backpropagation. An ideal BMI should be capable of training continuously without interruption to minimize disruption to the user and adapt to changing neural environments. Approach. We propose a continuous SNN weight update algorithm that can be trained to perform regression learning with no need for storing past spiking events in memory. As a result, the amount of memory needed for training is constant regardless of the input duration. We evaluate the accuracy of the network on recordings of neural data taken from the premotor cortex of a primate performing reaching tasks. Additionally, we evaluate the SNN in a simulated closed loop environment and observe its ability to adapt to sudden changes in the input neural structure. Main results. The continuous learning SNN achieves the same peak correlation (ρ=0.7) as existing SNN training methods when trained offline on real neural data while reducing the total memory usage by 92%. Additionally, it matches state-of-the-art accuracy in a closed loop environment, demonstrates adaptability when subjected to multiple types of neural input disruptions, and is capable of being trained online without any prior offline training. Significance. This work presents a neural decoding algorithm that can be trained rapidly in a closed loop setting. The algorithm increases the speed of acclimating a new user to the system and also can adapt to sudden changes in neural behavior with minimal disruption to the user.

     
  5. SUMMARY

    The spectral element method is currently the method of choice for computing accurate synthetic seismic wavefields in realistic 3-D earth models at the global scale. However, it requires significantly more computational time than approximate methods based on normal modes. Source stacking, whereby multiple earthquake sources are aligned on their origin time and simultaneously triggered, can reduce the computational costs by several orders of magnitude. We present the results of synthetic tests performed on a realistic radially anisotropic 3-D model, slightly modified from model SEMUCB-WM1, with three-component synthetic waveform ‘data’ for a duration of 10 000 s, filtered at periods longer than 60 s, for a set of 273 events and 515 stations. We consider two definitions of the misfit function, one based on the stacked records at individual stations and another based on station-pair cross-correlations of the stacked records. The inverse step is performed using a Gauss–Newton approach where the gradient and Hessian are computed using normal mode perturbation theory. We investigate the retrieval of radially anisotropic long wavelength structure in the upper mantle in the depth range 100–800 km, after fixing the crust and uppermost mantle structure constrained by fundamental mode Love and Rayleigh wave dispersion data. The results show good performance using both definitions of the misfit function, even in the presence of realistic noise, with degraded amplitudes of lateral variations in the anisotropic parameter ξ. Interestingly, we show that we can retrieve the long wavelength structure in the upper mantle when considering any one of three portions of the cross-correlation time series, corresponding to where we expect the energy from surface wave overtones, the fundamental mode, or a mixture of the two to be dominant, respectively.
    We also considered the issue of missing data, by randomly removing a successively larger proportion of the available synthetic data. We replace the missing data by synthetics computed in the current 3-D model using normal mode perturbation theory. The inversion results degrade with the proportion of missing data, especially for ξ, and we find that a data availability of 45 per cent or more leads to acceptable results. We also present a strategy for grouping events and stations to minimize the number of missing data in each group. This leads to an increased number of computations but can be significantly more efficient than conventional single-event-at-a-time inversion. We apply the grouping strategy to a real picking scenario, and show promising resolution capability despite the use of fewer waveforms and uneven ray path distribution. The source stacking approach can be used to rapidly obtain a starting 3-D model for more conventional full-waveform inversion at higher resolution, and to investigate assumptions made in the inversion, such as trade-offs between isotropic, anisotropic or anelastic structure, different model parametrizations or how crustal structure is accounted for.
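
The source stacking idea in the last abstract above rests on the linearity of the forward problem: a single run with all events triggered together yields the same station records as summing separate per-event runs. The sketch below illustrates only that property and the stacked-record misfit. It is a toy under stated assumptions, not the spectral element solver: the `simulate` function simply sums hypothetical impulse responses `greens[e][k]`, and the event, station, and sample counts are arbitrary.

```python
import random

def simulate(event_ids, greens):
    """Toy linear forward run: all listed events fire at a shared origin time.

    greens[e][k] is a hypothetical impulse response from event e to station k.
    Returns one trace per station.
    """
    n_stations = len(greens[0])
    n_samples = len(greens[0][0])
    traces = [[0.0] * n_samples for _ in range(n_stations)]
    for e in event_ids:
        for k in range(n_stations):
            for t in range(n_samples):
                traces[k][t] += greens[e][k][t]
    return traces

def stacked_misfit(data, synth):
    """Least-squares misfit between stacked records, summed over stations."""
    return sum((d - s) ** 2
               for dk, sk in zip(data, synth)
               for d, s in zip(dk, sk))

rng = random.Random(1)
n_events, n_stations, n_samples = 4, 3, 16
true_model = [[[rng.uniform(-1, 1) for _ in range(n_samples)]
               for _ in range(n_stations)] for _ in range(n_events)]

# Conventional route: one forward run per event, then sum the records.
per_event = [simulate([e], true_model) for e in range(n_events)]
summed = [[sum(per_event[e][k][t] for e in range(n_events))
           for t in range(n_samples)] for k in range(n_stations)]

# Source stacking: a single forward run with every event triggered together.
stacked = simulate(range(n_events), true_model)

# A perturbed trial model gives a nonzero misfit against the stacked "data".
trial_model = [[[g + 0.1 for g in trace] for trace in ev] for ev in true_model]
trial_stacked = simulate(range(n_events), trial_model)
print("misfit of trial model:", stacked_misfit(stacked, trial_stacked))
```

In this toy the saving is one forward run instead of `n_events` runs per misfit evaluation. The price, as the abstract notes, is that every station must have data for every stacked event, which is why missing data and event grouping need separate treatment.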

     