To bolster the accuracy of existing methods for automated phase identification from X-ray diffraction (XRD) patterns, we introduce a machine learning approach that uses a dual representation whereby XRD patterns are augmented with simulated pair distribution functions (PDFs). A convolutional neural network is trained directly on XRD patterns calculated using physics-informed data augmentation, which accounts for experimental artifacts such as lattice strain and crystallographic texture. A second network is trained on PDFs generated
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Abstract via Fourier transform of the augmented XRD patterns. At inference, these networks classify unknown samples by aggregating their predictions in a confidence-weighted sum. We show that such an integrated approach to phase identification provides enhanced accuracy by leveraging the benefits of each model’s input representation. Whereas networks trained on XRD patterns provide a reciprocal space representation and can effectively distinguish large diffraction peaks in multi-phase samples, networks trained on PDFs provide a real space representation and perform better when peaks with low intensity become important. These findings underscore the importance of using diverse input representations for machine learning models in materials science and point to new avenues for automating multi-modal characterization. -
Synthesis prediction is a key accelerator for the rapid design of advanced materials. However, determining synthesis variables such as the choice of precursor materials is challenging for inorganic materials because the sequence of reactions during heating is not well understood. In this work, we use a knowledge base of 29,900 solid-state synthesis recipes, text-mined from the scientific literature, to automatically learn which precursors to recommend for the synthesis of a novel target material. The data-driven approach learns chemical similarity of materials and refers the synthesis of a new target to precedent synthesis procedures of similar materials, mimicking human synthesis design. When proposing five precursor sets for each of 2654 unseen test target materials, the recommendation strategy achieves a success rate of at least 82%. Our approach captures decades of heuristic synthesis data in a mathematical form, making it accessible for use in recommendation engines and autonomous laboratories.more » « less
-
Abstract Phase diagrams offer substantial predictive power for materials synthesis by identifying the stability regions of target phases. However, thermodynamic phase diagrams do not offer explicit information regarding the kinetic competitiveness of undesired by-product phases. Here we propose a quantitative and computable thermodynamic metric to identify synthesis conditions under which the propensity to form kinetically competing by-products is minimized. We hypothesize that thermodynamic competition is minimized when the difference in free energy between a target phase and the minimal energy of all other competing phases is maximized. We validate this hypothesis for aqueous materials synthesis through two empirical approaches: first, by analysing 331 aqueous synthesis recipes text-mined from the literature; and second, by systematic experimental synthesis of LiIn(IO3)4and LiFePO4across a wide range of aqueous electrochemical conditions. Our results show that even for synthesis conditions that are within the stability region of a thermodynamic Pourbaix diagram, phase-pure synthesis occurs only when thermodynamic competition with undesired phases is minimized.
-
Free, publicly-accessible full text available October 25, 2024
-
Abstract Machine learning (ML) has become a valuable tool to assist and improve materials characterization, enabling automated interpretation of experimental results with techniques such as X-ray diffraction (XRD) and electron microscopy. Because ML models are fast once trained, there is a key opportunity to bring interpretation in-line with experiments and make on-the-fly decisions to achieve optimal measurement effectiveness, which creates broad opportunities for rapid learning and information extraction from experiments. Here, we demonstrate such a capability with the development of autonomous and adaptive XRD. By coupling an ML algorithm with a physical diffractometer, this method integrates diffraction and analysis such that early experimental information is leveraged to steer measurements toward features that improve the confidence of a model trained to identify crystalline phases. We validate the effectiveness of an adaptive approach by showing that ML-driven XRD can accurately detect trace amounts of materials in multi-phase mixtures with short measurement times. The improved speed of phase detection also enables in situ identification of short-lived intermediate phases formed during solid-state reactions using a standard in-house diffractometer. Our findings showcase the advantages of in-line ML for materials characterization and point to the possibility of more general approaches for adaptive experimentation.
-
Applying AI power to predict syntheses of novel materials requires high-quality, large-scale datasets. Extraction of synthesis information from scientific publications is still challenging, especially for extracting synthesis actions, because of the lack of a comprehensive labeled dataset using a solid, robust, and well-established ontology for describing synthesis procedures. In this work, we propose the first unified language of synthesis actions (ULSA) for describing inorganic synthesis procedures. We created a dataset of 3040 synthesis procedures annotated by domain experts according to the proposed ULSA scheme. To demonstrate the capabilities of ULSA, we built a neural network-based model to map arbitrary inorganic synthesis paragraphs into ULSA and used it to construct synthesis flowcharts for synthesis procedures. Analysis of the flowcharts showed that (a) ULSA covers essential vocabulary used by researchers when describing synthesis procedures and (b) it can capture important features of synthesis protocols. The present work focuses on the synthesis protocols for solid-state, sol–gel, and solution-based inorganic synthesis, but the language could be extended in the future to include other synthesis methods. This work is an important step towards creating a synthesis ontology and a solid foundation for autonomous robotic synthesis.more » « less
-
Abstract The development of a materials synthesis route is usually based on heuristics and experience. A possible new approach would be to apply data-driven approaches to learn the patterns of synthesis from past experience and use them to predict the syntheses of novel materials. However, this route is impeded by the lack of a large-scale database of synthesis formulations. In this work, we applied advanced machine learning and natural language processing techniques to construct a dataset of 35,675 solution-based synthesis procedures extracted from the scientific literature. Each procedure contains essential synthesis information including the precursors and target materials, their quantities, and the synthesis actions and corresponding attributes. Every procedure is also augmented with the reaction formula. Through this work, we are making freely available the first large dataset of solution-based inorganic materials synthesis procedures.