Title: Predicting compositional changes of organic–inorganic hybrid materials with Augmented CycleGAN
Despite its simplicity, the composition of a material can be used as input to machine learning models to predict a range of materials properties. However, many property optimization tasks require the generation of novel but realistic materials compositions. In this study, we describe a way to generate compositions of hybrid organic–inorganic crystals by adapting Augmented CycleGAN, a novel generative model that can learn many-to-many relations between two domains. Specifically, we investigate the problem of composition change upon amine swap: for a specific chemical system (set of elements) crystallized with amine A, how would the chemical compositions of the products change if the system were instead crystallized with amine B? By training with limited data from the Cambridge Structural Database, our model can generate realistic chemical compositions for hybrid crystalline materials. The Augmented CycleGAN model can also utilize abundant unpaired data (compositions of different chemical systems), a feature that traditional supervised methods lack. The generated compositions can be used for many tasks, for example, as input to a classifier that predicts structural dimensionality.
Award ID(s):
2018427 1928882
PAR ID:
10358225
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Digital Discovery
Volume:
1
Issue:
3
ISSN:
2635-098X
Page Range / eLocation ID:
255 to 265
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Supervised deep-learning models have enabled super-resolution imaging in several microscopic imaging modalities, increasing the spatial lateral bandwidth of the original input images beyond the diffraction limit. Despite their success, their practical application poses several challenges in terms of the amount and quality of training data, requiring the experimental acquisition of large, paired databases to generate an accurate generalized model whose performance remains invariant to unseen data. Cycle-consistent generative adversarial networks (cycleGANs) are unsupervised models for image-to-image translation tasks that are trained on unpaired datasets. This paper introduces a cycleGAN framework specifically designed to increase the lateral resolution limit in confocal microscopy by training a cycleGAN model using low- and high-resolution unpaired confocal images of human glioblastoma cells. Training and testing performances of the cycleGAN model have been assessed by measuring specific metrics such as background standard deviation, peak-to-noise ratio, and a customized frequency content measure. Our cycleGAN model has been evaluated in terms of image fidelity and resolution improvement using a paired dataset, showing performance superior to that of other reported methods. This work highlights the efficacy and promise of cycleGAN models in tackling super-resolution microscopic imaging without paired training, paving the path for turning home-built low-resolution microscopic systems into low-cost super-resolution instruments by means of unsupervised deep learning.
  2. Accurate theoretical predictions of desired properties of materials play an important role in materials research and development. Machine learning (ML) can accelerate materials design by building a model from input data. For complex datasets, such as those of crystalline compounds, a vital issue is how to construct low-dimensional representations for input crystal structures with chemical insights. In this work, we introduce an algebraic topology-based method, called atom-specific persistent homology (ASPH), as a unique representation of crystal structures. ASPH can capture both pairwise and many-body interactions and reveal the topology-property relationship of a group of atoms at various scales. Combined with composition-based attributes, an ASPH-based ML model provides a highly accurate prediction of the formation energy calculated by density functional theory (DFT). After training with more than 30,000 different structure types and compositions, our model achieves a mean absolute error of 61 meV/atom in cross-validation, which outperforms previous approaches such as Voronoi tessellations and the Coulomb matrix method using the same ML algorithm and datasets. Our results indicate that the proposed topology-based method provides a powerful computational tool for predicting materials properties compared to previous works.
  3. Amphiphilic copolymers (AP) represent a class of novel antibiofouling materials whose chemistry and composition can be tuned to optimize their performance. However, the enormous chemistry‐composition design space associated with AP makes their performance optimization laborious; it is not experimentally feasible to assess and validate all possible AP compositions even with the use of rapid screening methodologies. To address this constraint, a robust model development paradigm is reported, yielding a versatile machine learning approach that accurately predicts biofilm formation by Pseudomonas aeruginosa on a library of AP. The model excels in extracting underlying patterns in a “pooled” dataset from various experimental sources, thereby expanding the design space accessible to the model to a much larger selection of AP chemistries and compositions. The model is used to screen virtual libraries of AP to identify the best‐performing candidates for experimental validation. Initiated chemical vapor deposition is used for the precision synthesis of the model‐selected AP chemistries and compositions for validation at the solid–liquid interface (often used in conventional antifouling studies) as well as the air–liquid–solid triple interface. Despite the vastly different growth conditions, the model successfully identifies the best‐performing AP for biofilm inhibition at the triple interface.
  4. Automatic recognition of unique characteristics of an object can provide a powerful solution to verify its authenticity and safety. It can mitigate the growth of one of the largest underground industries, that of counterfeit goods, flowing through the global supply chain. In this article, we propose the novel concept of material biometrics, in which the intrinsic chemical properties of structural materials are used to generate unique identifiers for authenticating individual products. For this purpose, the objects to be protected are modified via programmable additive manufacturing of built-in chemical “tags” that generate signatures depending on their chemical composition, quantity, and location. We report a material biometrics-enabled manufacturing flow in which plastic objects are protected using spatially distributed tags that are optically invisible and difficult to clone. The resulting multi-bit signatures have high entropy and can be non-invasively detected for product authentication using $^{35}$Cl nuclear quadrupole resonance (NQR) spectroscopy.
  5. There has been significant interest in applying programming-by-example to automate repetitive and tedious tasks. However, due to the incomplete nature of input-output examples, a synthesizer may generate programs that pass the examples but do not match the user intent. In this paper, we propose MARS, a novel synthesis framework that takes as input a multi-layer specification composed of input-output examples, textual description, and partial code snippets that capture the user intent. To accurately capture the user intent from the noisy and ambiguous description, we propose a hybrid model that combines the power of an LSTM-based sequence-to-sequence model with the Apriori algorithm for mining association rules through unsupervised learning. We reduce the problem of solving a multi-layer specification synthesis to a Max-SMT problem, where hard constraints encode well-typed concrete programs and soft constraints encode the user intent learned by the hybrid model. We instantiate our hybrid model to the data wrangling domain and compare its performance against Morpheus, a state-of-the-art synthesizer for data wrangling tasks. Our experiments demonstrate that our approach outperforms Morpheus in terms of running time and solved benchmarks. For challenging benchmarks, our approach can suggest candidates with rankings that are an order of magnitude better than those of Morpheus, which leads to running times that are 15x faster than Morpheus.