Abstract Data-driven design shows the promise of accelerating materials discovery but is challenging due to the prohibitive cost of searching the vast design space of chemistry, structure, and synthesis methods. Bayesian optimization (BO) employs uncertainty-aware machine learning models to select promising designs to evaluate, hence reducing the cost. However, BO with mixed numerical and categorical variables, which is of particular interest in materials design, has not been well studied. In this work, we survey frequentist and Bayesian approaches to uncertainty quantification of machine learning with mixed variables. We then conduct a systematic comparative study of their performances in BO using a popular representative model from each group, the random forest-based Lolo model (frequentist) and the latent variable Gaussian process model (Bayesian). We examine the efficacy of the two models in the optimization of mathematical functions, as well as properties of structural and functional materials, where we observe performance differences as related to problem dimensionality and complexity. By investigating the machine learning models’ predictive and uncertainty estimation capabilities, we provide interpretations of the observed performance differences. Our results provide practical guidance on choosing between frequentist and Bayesian uncertainty-aware machine learning models for mixed-variable BO in materials design.
Benchmarking inverse optimization algorithms for materials design
Machine learning-based inverse materials discovery has attracted enormous attention recently due to its flexibility in dealing with black box models. Yet, many metaheuristic algorithms are not as widely applied to materials discovery applications as machine learning methods. There are ongoing challenges in applying different optimization algorithms to discover materials with single- or multi-elemental compositions and in understanding how these algorithms differ in mining the ideal materials. We comprehensively compare 11 different optimization algorithms for the design of single- and multi-elemental crystals with targeted properties. By maximizing the bulk modulus and minimizing the Fermi energy through perturbing the parameterized elemental composition representations, we estimated the unique counts of elemental compositions, mean density scan of the objectives space, mean objectives, and frequency distributed over the materials’ representations and objectives. We found that nature-inspired algorithms contain more uncertainties in the defined elemental composition design tasks, which correspond to their dependency on multiple hyperparameters. Runge–Kutta optimization (RUN) exhibits higher mean objectives, whereas Bayesian optimization (BO) displays low mean objectives compared with other methods. Combined with materials count and density scan, we propose that BO strives to approximate a more accurate surrogate of the design space by sampling more elemental compositions and hence has lower mean objectives, whereas RUN repeatedly samples the targeted elemental compositions with higher objective values. Our work sheds light on the automated digital design of materials with single- and multi-elemental compositions and is expected to elicit future studies on materials optimization, such as composite and alloy design based on specific desired properties.
- PAR ID: 10594916
- Publisher / Repository: American Institute of Physics
- Date Published:
- Journal Name: APL Materials
- Volume: 12
- Issue: 2
- ISSN: 2166-532X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract Optimizing material compositions often enhances thermoelectric performances. However, the large selection of possible base elements and dopants results in a vast composition design space that is too large to systematically search using solely domain knowledge. To address this challenge, a hybrid data‐driven strategy that integrates Bayesian optimization (BO) and Gaussian process regression (GPR) is proposed to optimize the composition of five elements (Ag, Se, S, Cu, and Te) in AgSe‐based thermoelectric materials. Data is collected from the literature to provide prior knowledge for the initial GPR model, which is updated by actively collected experimental data during the iteration between BO and experiments. Within seven iterations, the optimized AgSe‐based materials prepared using a simple high‐throughput ink mixing and blade coating method deliver a high power factor of 2100 µW m⁻¹ K⁻², which is a 75% improvement from the baseline composite (nominal composition of Ag₂Se₁). The success of this study provides opportunities to generalize the demonstrated active machine learning technique to accelerate the development and optimization of a wide range of material systems with reduced experimental trials.
- Abstract Bayesian optimization (BO) is an indispensable tool to optimize objective functions that either do not have known functional forms or are expensive to evaluate. Currently, optimal experimental design is always conducted within the workflow of BO leading to more efficient exploration of the design space compared to traditional strategies. This can have a significant impact on modern scientific discovery, in particular autonomous materials discovery, which can be viewed as an optimization problem aimed at looking for the maximum (or minimum) point for the desired materials properties. The performance of BO-based experimental design depends not only on the adopted acquisition function but also on the surrogate models that help to approximate underlying objective functions. In this paper, we propose a fully autonomous experimental design framework that uses more adaptive and flexible Bayesian surrogate models in a BO procedure, namely Bayesian multivariate adaptive regression splines and Bayesian additive regression trees. They can overcome the weaknesses of widely used Gaussian process-based methods when faced with relatively high-dimensional design space or non-smooth patterns of objective functions. Both simulation studies and real-world materials science case studies demonstrate their enhanced search efficiency and robustness.
- One-dimensional (1D) van der Waals (vdW) materials display electronic and magnetic transport properties that make them uniquely suited as interconnect materials and for low-dimensional optoelectronic applications. However, there are only around 700 1D vdW structures in general materials databases, making database curation approaches ineffective for 1D discovery. Here, we utilize machine-learning techniques to discover 1D vdW compositions that have not yet been synthesized. Our techniques go beyond discovery efforts involving elemental substitutions and instead start with a composition space of 4741 binary and 392,342 ternary formulas. We predict up to 3000 binary and 10,000 ternary 1D compounds and further classify them by expected magnetic and electronic properties. Our model identifies MoI₃, a material we experimentally confirm to exist with wire-like subcomponents and exotic magnetic properties. More broadly, we find several chalcogen-, halogen-, and pnictogen-containing compounds expected to be synthesizable using chemical vapor deposition and chemical vapor transport.
- The mechanical properties of alloy materials depend strongly on the compositions and concentrations of their chemical elements. Therefore, finding the optimum elemental concentration and composition remains a critical issue in designing high-performance alloy materials. Traditional alloy design via “trial and error” or domain experts’ experience can hardly solve this issue. Here, we propose a “composition-oriented” method combined with machine learning to design Cu–Zn alloys with high strength, high ductility, and a low friction coefficient. Separate training for each attribute label is used to study the effects of elemental concentrations on the mechanical properties of Cu–Zn alloys. Moreover, the elemental concentrations of new Cu–Zn alloys with good mechanical properties are predicted by machine learning. The current results reveal the vital importance of the “composition-oriented” design method via machine learning for the development of high-performance alloys in a broad range of elemental compositions.
