Title: Analysis of Interpretable Data Representations for 4D-STEM Using Unsupervised Learning
Abstract Understanding the structure of materials is crucial for engineering devices and materials with enhanced performance. Four-dimensional scanning transmission electron microscopy (4D-STEM) can map nanometer-scale local crystallographic structure over micron-scale fields of view. However, 4D-STEM datasets can contain tens of thousands of images from a wide variety of material structures, making it difficult to automate detection and classification of structures. Traditional automated analysis pipelines for 4D-STEM focus on supervised approaches, which require prior knowledge of the material structure and cannot describe anomalous or deviant structures. In this article, a pipeline for engineering 4D-STEM feature representations for unsupervised clustering using non-negative matrix factorization (NMF) is introduced. Each feature is evaluated using NMF, and results are presented for both simulated and experimental data. It is shown that some data representations more reliably identify overlapping grains. Additionally, real space refinement is applied to identify spatially distinct sample regions, allowing for size and shape analysis to be performed. This work lays the foundation for improved analysis, using 4D-STEM, of nanoscale structural features in materials that deviate from the expected crystallographic arrangement.
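To make the clustering idea concrete, here is a minimal sketch of NMF applied to a toy 4D-STEM-like stack. It is not the article's pipeline: the synthetic two-grain scan, the function names `nmf` and `make_toy_scan`, the raw-intensity feature representation, and the use of plain Lee-Seung multiplicative updates are all assumptions made for illustration. Each scan position becomes one row of a non-negative matrix, and the factor weights are reshaped back into real-space maps.

```python
import numpy as np

def nmf(X, k, n_iter=1000, seed=0, eps=1e-9):
    """Factor non-negative X (n, m) as W (n, k) @ H (k, m).

    Uses multiplicative updates for the Frobenius objective
    (an assumed, generic choice, not the article's solver).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    # Move the arbitrary scale into W so component weights are comparable.
    scale = H.sum(axis=1)
    return W * scale, H / scale[:, None]

def make_toy_scan(ny=8, nx=10, det=16, seed=1):
    """Hypothetical scan: grain A on the left, grain B on the right,
    and two middle columns (4 and 5) where both grains overlap."""
    rng = np.random.default_rng(seed)
    pat_a = np.zeros((det, det))
    pat_a[4, 4] = pat_a[11, 11] = 1.0   # diffraction spots of grain A
    pat_b = np.zeros((det, det))
    pat_b[4, 11] = pat_b[11, 4] = 1.0   # diffraction spots of grain B
    data = 0.01 * rng.random((ny, nx, det, det))  # weak background
    data[:, :6] += pat_a
    data[:, 4:] += pat_b
    return data

data = make_toy_scan()
ny, nx, det, _ = data.shape

# One row per scan position, one column per detector pixel.
X = data.reshape(ny * nx, det * det)
W, H = nmf(X, k=2)

weights = W.reshape(ny, nx, 2)                    # real-space component maps
labels = W.argmax(axis=1).reshape(ny, nx)         # hard cluster assignment
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(labels)
print(f"relative reconstruction error: {rel_err:.3f}")
```

In this sketch the hard labels separate the two pure-grain regions, while the soft `weights` maps are what expose the overlap columns, since both components carry substantial weight there.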
Award ID(s): 1848079
PAR ID: 10403369
Journal Name: Microscopy and Microanalysis
Volume: 28
Issue: 6
ISSN: 1431-9276
Page Range / eLocation ID: 1998 to 2008
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract Understanding lattice deformations is crucial in determining the properties of nanomaterials, which can become more prominent in future applications ranging from energy harvesting to electronic devices. However, it remains challenging to reveal unexpected deformations that crucially affect material properties across a large sample area. Here, we demonstrate a rapid and semi-automated unsupervised machine learning approach to uncover lattice deformations in materials. Our method utilizes divisive hierarchical clustering to automatically unveil multi-scale deformations in the entire sample flake from the diffraction data using four-dimensional scanning transmission electron microscopy (4D-STEM). Our approach overcomes the current barriers of large 4D data analysis without a priori knowledge of the sample. Using this purely data-driven analysis, we have uncovered different types of material deformations, such as strain, lattice distortion, and bending contours, which can significantly impact the band structure and subsequent performance of nanomaterials-based devices. We envision that this data-driven procedure will provide insight into materials’ intrinsic structures and accelerate the discovery of materials.
  2. Data-driven approaches to materials exploration and discovery are building momentum due to emerging advances in machine learning. However, parsimonious representations of crystals for navigating the vast materials search space remain limited. To address this limitation, we introduce a materials discovery framework that utilizes natural language embeddings from language models as representations of compositional and structural features. The contextual knowledge encoded in these language representations conveys information about material properties and structures, enabling both similarity analysis to recall relevant candidates based on a query material and multi-task learning to share information across related properties. Applying this framework to thermoelectrics, we demonstrate diversified recommendations of prototype crystal structures and identify under-studied material spaces. Validation through first-principles calculations and experiments confirms the potential of the recommended materials as high-performance thermoelectrics. Language-based frameworks offer versatile and adaptable embedding structures for effective materials exploration and discovery, applicable across diverse material systems. 
  3. ABSTRACT Non‐negative Matrix Factorization (NMF) is an effective algorithm for multivariate data analysis, including applications to feature selection, pattern recognition, and computer vision. Its variant, Semi‐Nonnegative Matrix Factorization (SNF), extends the ability of NMF to render parts‐based data representations to include mixed‐sign data. Graph Regularized SNF builds upon this paradigm by adding a graph regularization term to preserve the local geometrical structure of the data space. Despite their successes, SNF‐related algorithms to date still suffer from instability because the Frobenius norm is sensitive to outliers and noise. In this paper, we present a new SNF algorithm that utilizes a noise‐insensitive norm. We provide monotonic convergence analysis of the proposed SNF algorithm. In addition, we conduct numerical experiments on three benchmark mixed‐sign datasets as well as several randomized mixed‐sign matrices to demonstrate the performance superiority of the proposed SNF algorithm over conventional SNF algorithms under the influence of Gaussian noise at different levels.
  4. Abstract Material properties strongly depend on the nature and concentration of defects. Characterizing these features may require nano- to atomic-scale resolution to establish structure–property relationships. 4D-STEM, a technique where diffraction patterns are acquired at a grid of points on the sample, provides a versatile method for highlighting defects. Computational analysis of the diffraction patterns with virtual detectors produces images that can map material properties. Here, using multislice simulations, we explore different virtual detectors that can be applied to the diffraction patterns that go beyond the binary response functions that are possible using ordinary STEM detectors. Using graphene and lead titanate as model systems, we investigate the application of virtual detectors to study local order and in particular defects. We find that using a small convergence angle with a rotationally varying detector most efficiently highlights defect signals. With experimental graphene data, we demonstrate the effectiveness of these detectors in characterizing atomic features, including vacancies, as suggested in simulations. Phase and amplitude modification of the electron beam provides another process handle to change image contrast in a 4D-STEM experiment. We demonstrate how tailored electron beams can enhance signals from short-range order and how a vortex beam can be used to characterize local symmetry. 
  5. Yousefi, Bardia (Ed.)
    Mass spectrometry imaging (MSI) is a powerful scientific tool for understanding the spatial distribution of biochemical compounds in tissue structures. In this paper, we introduce three novel approaches in MSI data processing to perform the tasks of data augmentation, feature ranking, and image registration. We use these approaches in conjunction with non-negative matrix factorization (NMF) to resolve two of the biggest challenges in MSI data analysis, namely: 1) the large file sizes and associated computational resource requirements and 2) the complexity of interpreting the very high dimensional raw spectral data. There are many dimensionality reduction techniques that address the first challenge but do not necessarily result in readily interpretable features, leaving the second challenge unaddressed. We demonstrate that NMF is an effective dimensionality reduction algorithm that reduces the size of MSI datasets by three orders of magnitude with limited loss of information, yielding spatial and spectral components with meaningful correlation to tissue structure that may be used directly for subsequent data analysis without the need for additional clustering steps. This analysis is demonstrated on an MSI dataset from female Sprague-Dawley rats for an animal model of comorbid visceral pain hypersensitivity (CPH). We find that high-dimensional MSI data (∼ 100,000 ions per pixel) can be reduced to 20 spectral NMF components with < 20% loss in reconstruction accuracy. The resulting spatial NMF components are reproducible and correlate well with H&E-stained tissue images. These components may also be used to generate images with enhanced specificity for different tissue types. Small patches of NMF data (i.e., 20 spatial NMF components over 20 × 20 pixels) provide an accuracy of ∼ 87% in classifying CPH vs naïve control subjects. 
This paper presents the novel data processing methodologies that were used to produce these results, encompassing novel data processing pipelines for data augmentation to support training for classification, ranking of features according to their contribution to classification, and image registration to enhance tissue-specific imaging. 
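The virtual-detector idea described in item 4 can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the detectors used in that work: the helper names (`polar_grid`, `annular_detector`, `rotational_detector`, `virtual_image`), the `cos(order * phi)` weighting, and the synthetic dataset are all hypothetical. A virtual image is the weighted sum of each diffraction pattern against a detector response function, which may be binary (as with ordinary STEM detectors) or rotationally varying.

```python
import numpy as np

def polar_grid(det_shape):
    """Radius and azimuth of every detector pixel about the pattern center."""
    qy, qx = np.indices(det_shape)
    cy, cx = (det_shape[0] - 1) / 2, (det_shape[1] - 1) / 2
    dy, dx = qy - cy, qx - cx
    return np.hypot(dy, dx), np.arctan2(dy, dx)

def annular_detector(det_shape, r_in, r_out):
    """Binary response: 1 inside the annulus r_in <= r < r_out, else 0."""
    r, _ = polar_grid(det_shape)
    return ((r >= r_in) & (r < r_out)).astype(float)

def rotational_detector(det_shape, r_in, r_out, order):
    """Non-binary response: annulus weighted by cos(order * azimuth)."""
    _, phi = polar_grid(det_shape)
    return annular_detector(det_shape, r_in, r_out) * np.cos(order * phi)

def virtual_image(data, detector):
    """Contract the two detector axes of a (ny, nx, qy, qx) dataset."""
    return np.tensordot(data, detector, axes=([2, 3], [0, 1]))

# Toy dataset: a direct beam everywhere, plus scattering in two different
# directions at two scan positions.
det_shape = (17, 17)
data = np.zeros((4, 4) + det_shape)
data[:, :, 8, 8] = 1.0    # direct beam at the pattern center
data[1, 2, 8, 13] = 0.5   # scattering along +qx (azimuth 0)
data[2, 1, 13, 8] = 0.5   # scattering along +qy (azimuth pi/2)

bright = virtual_image(data, annular_detector(det_shape, 0, 2))
dark = virtual_image(data, annular_detector(det_shape, 4, 7))
rot = virtual_image(data, rotational_detector(det_shape, 4, 7, order=2))
```

The binary annulus (`dark`) gives the same signal at both scan positions, whereas the rotationally weighted detector (`rot`) returns signals of opposite sign, so it distinguishes the scattering direction.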