Title: Quantifying Pairwise Similarity for Complex Polymers
Defining the similarity between chemical entities is an essential task in polymer informatics, enabling ranking, clustering, and classification. Despite its importance, the pairwise chemical similarity of polymers remains an open problem. Here, a similarity function for polymers with well-defined backbones is designed based on polymers’ stochastic graph representations generated from canonical BigSMILES, a structurally based line notation for describing macromolecules. The stochastic graph representations are separated into three parts: repeat units, end groups, and polymer topology. The earth mover’s distance is utilized to calculate the similarity of the repeat units and end groups, while the graph edit distance is used to calculate the similarity of the topology. These three values can be linearly or nonlinearly combined to yield an overall pairwise chemical similarity score for polymers that is largely consistent with the chemical intuition of expert users and is adjustable based on the relative importance of different chemical features for a given similarity problem. This method gives a reliable solution to quantitatively calculate the pairwise chemical similarity score for polymers and represents a vital step toward building search engines and quantitative design tools for polymer data.
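The final combination step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the weighted arithmetic mean, and the weighted geometric mean are assumptions chosen for illustration. The three component scores (repeat units, end groups, topology) are taken as already computed values in [0, 1].

```python
from math import prod


def _validate(scores, weights):
    # Hypothetical helper: component similarities are assumed to lie in [0, 1].
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("each similarity score must be in [0, 1]")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")


def combine_linear(scores, weights):
    """Linear option (assumed form): weighted arithmetic mean of the components."""
    _validate(scores, weights)
    total = sum(weights)
    return sum(w * s for s, w in zip(scores, weights)) / total


def combine_geometric(scores, weights):
    """Nonlinear option (assumed form): weighted geometric mean.

    A component score of zero with nonzero weight drives the result to zero.
    """
    _validate(scores, weights)
    total = sum(weights)
    return prod(s ** (w / total) for s, w in zip(scores, weights))


# Illustrative values only: (repeat units, end groups, topology).
example_scores = (0.8, 0.5, 1.0)
example_weights = (0.6, 0.1, 0.3)  # emphasizes repeat-unit chemistry

print(combine_linear(example_scores, example_weights))
print(combine_geometric(example_scores, example_weights))
```

Changing the weights is one way to express the "relative importance of different chemical features" mentioned above. For example, raising the topology weight makes a linear versus star architecture difference count for more in the overall score.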
Award ID(s):
2134795
PAR ID:
10479660
Author(s) / Creator(s):
; ; ; ; ; ; ;
Publisher / Repository:
ACS Publications
Date Published:
Journal Name:
Macromolecules
Volume:
56
Issue:
18
ISSN:
0024-9297
Page Range / eLocation ID:
7344 to 7357
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Molecular search is important in chemistry, biology, and informatics for identifying molecular structures within large data sets, improving knowledge discovery and innovation, and making chemical data FAIR (findable, accessible, interoperable, reusable). Search algorithms for polymers are significantly less developed than those for small molecules because polymer search relies on searching by polymer name, which can be challenging because polymer naming is overly broad (e.g., polyethylene), complicated for complex chemical structures, and often does not correspond to official IUPAC conventions. Chemical structure search in polymers is limited to substructures, such as monomers, without awareness of connectivity or topology. This work introduces a novel query language and graph traversal search algorithm for polymers that provides the first search method able to fully capture all of the chemical structures present in polymers. The BigSMARTS query language, an extension of the small-molecule SMARTS language, allows users to write queries that localize monomer and functional group searches to different parts of the polymer, like the middle block of a triblock, the side chain of a graft, and the backbone of a repeat unit. The substructure search algorithm is based on the traversal of graph representations of the generating functions for the stochastic graphs of polymers. Operationally, the algorithm first identifies cycles representing the monomers and then the end groups and finally performs a depth-first search to match entire subgraphs. To validate the algorithm, hundreds of queries were searched against hundreds of target chemistries and topologies from the literature, with approximately 440,000 query–target pairs. This tool provides a detailed algorithm that can be implemented in search engines to provide search results with full matching of the monomer connectivity and polymer topology.
  2. We propose a chemical language processing model to predict polymers’ glass transition temperature (Tg) through a polymer language (SMILES, Simplified Molecular Input Line Entry System) embedding and a recurrent neural network. This model receives only the SMILES strings of a polymer’s repeat units as inputs and treats the SMILES strings as sequential data at the character level. With this method, there is no need to calculate any additional molecular descriptors or fingerprints of polymers, which makes it very computationally efficient. More importantly, it avoids the difficulty of generating molecular descriptors for repeat units containing the polymerization point ‘*’. Results show that the trained model demonstrates reasonable prediction performance on the Tg of unseen polymers. In addition, the model is applied to high-throughput screening of an unlabeled polymer database to identify high-temperature polymers that are desired for applications in extreme environments. Our work demonstrates that the SMILES strings of polymer repeat units can be used as an effective feature representation to develop a chemical language processing model for predictions of polymer Tg. The framework of this model is general and can be used to construct structure–property relationships for other polymer properties.
  3. Carbohydrates are the fundamental building blocks of many natural polymers; their wide bioavailability, high chemical functionality, and stereochemical diversity make them attractive starting materials for the development of new synthetic polymers. In this work, one such carbohydrate, d‐glucopyranoside, was utilized to produce a hydrophobic five‐membered cyclic carbonate monomer to afford sugar‐based amphiphilic copolymers and block copolymers via organocatalyzed ring‐opening polymerizations with 4‐methylbenzyl alcohol and methoxy poly(ethylene glycol) as initiator and macroinitiator, respectively. To modulate the amphiphilicities of these polymers, acidic benzylidene cleavage reactions were performed to deprotect the sugar repeat units and present hydrophilic hydroxyl side chain groups. Assembly of the polymers under aqueous conditions revealed interesting morphological differences based on the polymer molar mass and repeat unit composition. The initial polymers, prior to the removal of the benzylidenes, underwent a morphological change from micelles to vesicles as the sugar block length was increased, causing a decrease in the hydrophilic–hydrophobic ratio. Deprotection of the sugar block increased the hydrophilicity and gave micellar morphologies. This tunable polymeric platform holds promise for the production of advanced materials for implementation in a diverse range of applications. © 2018 Wiley Periodicals, Inc. J. Polym. Sci., Part A: Polym. Chem. 2019, 57, 432–440
  4. Polymers with low ceiling temperatures (Tc) are highly desirable as they can depolymerize under mild conditions, but they typically suffer from demanding synthetic conditions and poor stability. We envision that this challenge can be addressed by developing high-Tc polymers that can be converted into low-Tc polymers on demand. Here, we demonstrate the mechanochemical generation of a low-Tc polymer, poly(2,5-dihydrofuran) (PDHF), from an unsaturated polyether that contains cyclobutane-fused THF in each repeat unit. Upon mechanically induced cycloreversion of cyclobutane, each repeat unit generates three repeat units of PDHF. The resulting PDHF completely depolymerizes into 2,5-dihydrofuran in the presence of a ruthenium catalyst. The mechanochemical generation of the otherwise difficult-to-synthesize PDHF highlights the power of polymer mechanochemistry in accessing elusive structures. The concept of mechanochemically regulating the Tc of polymers can be applied to develop next-generation sustainable plastics.
  5. π-Conjugated polymers are materials of interest for use in organic electronics. Within these polymers, donor–acceptor polymers are favorable for solar cell applications due to improved charge mobility, better absorption in the low energy region of the solar spectrum, and tunable band gaps. One of the barriers to commercializing these donor–acceptor materials is that their synthetic pathways are complex because of the alternating repeat units in the polymer. To address this, the application of cross dehydrogenative coupling (also called oxidative CH/CH cross-coupling) toward the synthesis of donor–acceptor polymers was explored. In this work, the roles of specific reagents in a one-pot gold- and silver-catalyzed cross dehydrogenative coupling and the factors that contribute to selectivity for cross-coupling rather than homo-coupling are analyzed. Based on our results, we postulate that the percentage of alternating repeat units in the final polymer is affected by the increased reactivity of the dimer that forms in the initial stages of the polymerization compared to the monomer, which ultimately may be exploited to control the ratio of electron-rich to electron-poor monomers. 
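Item 2 in the list above treats the SMILES strings of repeat units as character-level sequences. A minimal sketch of that kind of input encoding is given here. It is not the cited model's code: the function names, the padding index 0, and the fixed maximum length are assumptions made for illustration. Note that a strictly character-level scheme splits two-letter atoms such as `Cl` into two tokens.

```python
def build_vocab(smiles_list):
    """Map every character seen in the corpus to an integer index.

    Index 0 is reserved for padding (an assumption of this sketch).
    """
    chars = sorted({ch for smiles in smiles_list for ch in smiles})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles, vocab, max_len):
    """Encode one SMILES string as a fixed-length list of integer indices.

    Strings longer than max_len are truncated; shorter ones are right-padded
    with 0. A character missing from vocab raises KeyError.
    """
    ids = [vocab[ch] for ch in smiles[:max_len]]
    return ids + [0] * (max_len - len(ids))


# Repeat units written with '*' marking the polymerization points.
repeat_units = ["*CC*", "*CC(C)*"]  # polyethylene, polypropylene
vocab = build_vocab(repeat_units)

for smiles in repeat_units:
    print(smiles, encode(smiles, vocab, max_len=8))
```

Integer sequences of this form are the usual input to an embedding layer followed by a recurrent network. Because `*` is encoded like any other character, no descriptor calculation is needed for it.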