Title: Using a machine learning approach to determine the space group of a structure from the atomic pair distribution function
A method is presented for predicting the space group of a structure given a calculated or measured atomic pair distribution function (PDF) from that structure. The method utilizes machine learning models trained on more than 100 000 PDFs calculated from structures in the 45 most heavily represented space groups. In particular, a convolutional neural network (CNN) model is presented which yields a promising result in that it correctly identifies the space group among the top-6 estimates 91.9% of the time. The CNN model also successfully identifies space groups for 12 out of 15 experimental PDFs. Interesting aspects of the failed estimates are discussed, which indicate that the CNN is failing in ways similar to conventional indexing algorithms applied to conventional powder diffraction data. This preliminary success of the CNN model shows the possibility of model-independent assessment of PDF data on a wide class of materials.
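The top-6 figure quoted in the abstract is a top-k accuracy: a prediction counts as correct when the true class appears among the k highest-scoring classes. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name `top_k_accuracy`, the score matrix, and the labels are all hypothetical and chosen only for illustration.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=6):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Column indices of the k largest scores in each row; order inside the top k does not matter.
    top_k = np.argpartition(scores, -k, axis=1)[:, -k:]
    # A sample is a hit if its true label appears anywhere in its top-k set.
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Hypothetical classifier scores: 4 samples, 5 classes (not data from the paper).
scores = np.array([
    [0.10, 0.50, 0.20, 0.15, 0.05],  # true class 1 is ranked 1st
    [0.30, 0.10, 0.40, 0.15, 0.05],  # true class 0 is ranked 2nd
    [0.05, 0.10, 0.15, 0.30, 0.40],  # true class 2 is ranked 3rd
    [0.60, 0.20, 0.10, 0.06, 0.04],  # true class 4 is ranked 5th
])
labels = np.array([1, 0, 2, 4])

for k in (1, 2, 3, 5):
    print(k, top_k_accuracy(scores, labels, k=k))
```

With 45 candidate space groups, as in the abstract, the same function would be called with a score matrix of 45 columns and `k=6`.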
Award ID(s):
1740833
PAR ID:
10112800
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Acta Crystallographica Section A Foundations and Advances
Volume:
75
Issue:
4
ISSN:
2053-2733
Page Range / eLocation ID:
633 to 643
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Machine learning models based on convolutional neural networks have been used for predicting space groups of crystal structures from their atomic pair distribution function (PDF). However, the PDFs used to train the model are calculated using a fixed set of parameters that reflect specific experimental conditions, and the accuracy of the model when given PDFs generated with different choices of these parameters is unknown. In this work, the top-1 and top-6 accuracies are shown to be robust when the model is applied to PDFs generated with different choices of the experimental parameters r_max, Q_max, Q_damp and atomic displacement parameters.
  2. Many astrophysical analyses depend on estimates of redshifts (a proxy for distance) determined from photometric (i.e., imaging) data alone. Inaccurate estimates of photometric redshift uncertainties can result in large systematic errors. However, probability distribution outputs from many photometric redshift methods do not follow the frequentist definition of a Probability Density Function (PDF) for redshift -- i.e., the fraction of times the true redshift falls between two limits z1 and z2 should be equal to the integral of the PDF between these limits. Previous works have used the global distribution of Probability Integral Transform (PIT) values to re-calibrate PDFs, but offsetting inaccuracies in different regions of feature space can conspire to limit the efficacy of the method. We leverage a recently developed regression technique that characterizes the local PIT distribution at any location in feature space to perform a local re-calibration of photometric redshift PDFs. Though we focus on an example from astrophysics, our method can produce PDFs which are calibrated at all locations in feature space for any use case. 
  3. Many astrophysical analyses depend on estimates of redshifts (a proxy for distance) determined from photometric (i.e., imaging) data alone. Inaccurate estimates of photometric redshift uncertainties can result in large systematic errors. However, probability distribution outputs from many photometric redshift methods do not follow the frequentist definition of a Probability Density Function (PDF) for redshift — i.e., the fraction of times the true redshift falls between two limits z1 and z2 should be equal to the integral of the PDF between these limits. Previous works have used the global distribution of Probability Integral Transform (PIT) values to re-calibrate PDFs, but offsetting inaccuracies in different regions of feature space can conspire to limit the efficacy of the method. We leverage a recently developed regression technique that characterizes the local PIT distribution at any location in feature space to perform a local re-calibration of photometric redshift PDFs resulting in calibrated predictive distributions. Though we focus on an example from astrophysics, our method can produce predictive distributions which are calibrated at all locations in feature space for any use case. 
  4. The development of new nanomaterials for energy technologies is dependent on understanding the intricate relation between material properties and atomic structure. It is, therefore, crucial to be able to routinely characterise the atomic structure in nanomaterials, and a promising method for this task is Pair Distribution Function (PDF) analysis. The PDF can be obtained through Fourier transformation of x-ray total scattering data, and represents a histogram of all interatomic distances in the sample. Going from the distance information in the PDF to a chemical structure is an unassigned distance geometry problem (uDGP), and solving this is often the bottleneck in nanostructure analysis. In this work, we propose to use a Conditional Variational Autoencoder (CVAE) to automatically solve the uDGP to obtain valid chemical structures from PDFs. We use a simple model system of hypothetical mono-metallic nanoparticles containing up to 100 atoms in the face centered cubic (FCC) structure as a proof of concept. The model is trained to predict the assigned distance matrix (aDM) from a simulated PDF of the structure as the conditional input. We introduce a novel representation of structures by projecting them inside a unit sphere and adding additional anchor points or satellites to help in the reconstruction of the chemical structure. The performance of the CVAE model is compared to a Deterministic Autoencoder (DAE) showing that both models are able to solve the uDGP reasonably well. We further show that the CVAE learns a structured and meaningful latent embedding space which can be used to predict new chemical structures.
  5. Recently, there have been rapid developments in lattice-QCD calculations of proton structure, especially in the parton distribution functions (PDFs). We overcame a longstanding obstacle and for the first time in lattice-QCD are able to directly calculate the Bjorken-x dependence of the quark, helicity and transversity distributions. The PDFs are obtained using the large-momentum effective field theory (LaMET) framework where the full Bjorken-x dependence of finite-momentum PDFs, called “quasi-PDFs”, can be calculated on the lattice. The quasi-PDF nucleon matrix elements are renormalized non-perturbatively in the RI/MOM scheme. Following a nonperturbative renormalization of the parton quasi-distribution in a regularization-independent momentum-subtraction scheme, we establish its matching to the $\overline{\mathrm{MS}}$ PDF and calculate the non-singlet matching coefficient at next-to-leading order in perturbation theory. In this proceeding, I will show the progress that has been made in recent years, highlighting the latest state-of-the-art PDF calculations at the physical pion mass. Future impacts on the large-x global PDF fits are also discussed.
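Item 4 above describes the atomic pair distribution function as a histogram of all interatomic distances in the sample. The sketch below illustrates only that raw histogram for a tiny cluster. It is not a full PDF calculation: it omits scattering-factor weighting, normalization, thermal broadening and the finite Q-range effects of a real measurement. The function name `pair_distance_histogram`, the lattice constant, and the bin settings are assumptions made for illustration.

```python
import numpy as np

def pair_distance_histogram(positions, r_max, bin_width):
    """Histogram of all unique interatomic distances up to r_max."""
    positions = np.asarray(positions, dtype=float)
    # Pairwise difference vectors, shape (N, N, 3), then their lengths, shape (N, N).
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Keep each pair once (upper triangle, excluding the zero self-distances).
    i, j = np.triu_indices(len(positions), k=1)
    pair_d = dists[i, j]
    n_bins = int(round(r_max / bin_width))
    counts, edges = np.histogram(pair_d, bins=n_bins, range=(0.0, r_max))
    return counts, edges

# Hypothetical example: the 4 basis atoms of an FCC conventional cell, a = 4.0 (arbitrary units).
a = 4.0
positions = np.array([
    [0.0, 0.0, 0.0],
    [a / 2, a / 2, 0.0],
    [a / 2, 0.0, a / 2],
    [0.0, a / 2, a / 2],
])

counts, edges = pair_distance_histogram(positions, r_max=5.0, bin_width=0.5)
print(counts)
```

All six pairs in this cluster are nearest neighbours at a distance of a/sqrt(2), about 2.83, so every count falls in the single bin covering 2.5 to 3.0.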