Title: Predicting Polymers’ Glass Transition Temperature by a Chemical Language Processing Model
We propose a chemical language processing model to predict polymers’ glass transition temperature (Tg) through a polymer language (SMILES, Simplified Molecular Input Line Entry System) embedding and a recurrent neural network. The model receives only the SMILES strings of a polymer’s repeat units as inputs and treats them as sequential data at the character level. This method requires no additional molecular descriptors or fingerprints of polymers, which makes it very computationally efficient. More importantly, it avoids the difficulty of generating molecular descriptors for repeat units containing the polymerization point ‘*’. Results show that the trained model demonstrates reasonable prediction performance on the Tg of unseen polymers. The model is further applied to high-throughput screening of an unlabeled polymer database to identify high-temperature polymers that are desired for applications in extreme environments. Our work demonstrates that the SMILES strings of polymer repeat units can be used as an effective feature representation to develop a chemical language processing model for predictions of polymer Tg. The framework of this model is general and can be used to construct structure–property relationships for other polymer properties.
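As a rough illustration of what treating SMILES strings as character-level sequential data can look like, the sketch below maps each character of a repeat-unit SMILES string to an integer index and pads the result to a fixed length, the kind of input an embedding layer followed by a recurrent network would typically consume. It is not the authors' implementation: the function names, the padding index, and the two example repeat units are assumptions made for illustration only.

```python
from typing import Dict, List

PAD = 0  # index reserved for padding (an assumption of this sketch)


def build_vocab(smiles_list: List[str]) -> Dict[str, int]:
    """Map every distinct character in the data to an integer index, starting at 1."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Encode one SMILES string as a fixed-length list of character indices.

    Strings longer than max_len are truncated; shorter ones are right-padded.
    A character missing from the vocabulary raises KeyError.
    """
    ids = [vocab[ch] for ch in smiles[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Illustrative repeat units (polyethylene and polystyrene); '*' marks the
# polymerization points and is handled like any other character.
repeat_units = ["*CC*", "*CC(*)c1ccccc1"]
vocab = build_vocab(repeat_units)
max_len = max(len(s) for s in repeat_units)
encoded = [encode(s, vocab, max_len) for s in repeat_units]
```

Because the encoding works character by character, the ‘*’ symbol needs no special handling, which matches the abstract's point that no descriptor generation is required. One consequence of this choice is that two-letter atoms such as Cl or Br are split into two tokens.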
Award ID(s):
1934829
PAR ID:
10311629
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Polymers
Volume:
13
Issue:
11
ISSN:
2073-4360
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper develops a machine learning methodology for the rapid and robust prediction of the glass transition temperature (Tg) for polymers for the targeted application of sustainable high-temperature polymers. The machine learning framework combines multiple techniques to develop a feature set encompassing all relevant aspects of polymer chemistry, to extract and explain correlations between features and Tg, and to develop and apply a high-throughput predictive model. In this work, we identify aspects of the chemistry that most impact Tg, including a parameter related to rotational degrees of freedom and a backbone index based on a steric hindrance parameter. Building on this scientific understanding, models are developed on different types of data to ensure robustness, and experimental validation is obtained through the testing of new polymer chemistry with remarkable Tg. The ability of our model to predict Tg shows that the relevant information is contained within the topological descriptors, while the requirement of non-linear manifold transformation of the data also shows that the relationships are complex and cannot be captured through traditional regression approaches. Building on the scientific understanding obtained from the correlation analyses, coupled with the model performance, it is shown that the rigidity and interaction dynamics of the polymer structure are key to tuning for achieving targeted performance. This work has implications for future rapid optimization of chemistries 
  2. Defining the similarity between chemical entities is an essential task in polymer informatics, enabling ranking, clustering, and classification. Despite its importance, the pairwise chemical similarity of polymers remains an open problem. Here, a similarity function for polymers with well-defined backbones is designed based on polymers’ stochastic graph representations generated from canonical BigSMILES, a structurally based line notation for describing macromolecules. The stochastic graph representations are separated into three parts: repeat units, end groups, and polymer topology. The earth mover’s distance is utilized to calculate the similarity of the repeat units and end groups, while the graph edit distance is used to calculate the similarity of the topology. These three values can be linearly or nonlinearly combined to yield an overall pairwise chemical similarity score for polymers that is largely consistent with the chemical intuition of expert users and is adjustable based on the relative importance of different chemical features for a given similarity problem. This method gives a reliable solution to quantitatively calculate the pairwise chemical similarity score for polymers and represents a vital step toward building search engines and quantitative design tools for polymer data. 
  3. Side chain alkyl groups have become the standard for incorporating solubilizing groups into conjugated polymers. However, the variety of alkyl groups available and their location on the polymer’s backbone can contribute to the packing of the polymer chains in many different ways, resulting in many different morphologies in the polymer that can affect its properties and performance. In this paper, we investigate the effects on the conductivity of nine phenothiazine-containing polyaniline derivatives (P1−P9) with alkyl or aryl side chains on the phenothiazine core while also varying the number of methyl groups on the p-phenylenediamine unit. 1H nuclear magnetic resonance spectroscopy, ultraviolet−visible spectroscopy, differential scanning calorimetry, scanning electron microscopy, atomic force microscopy, and wide-angle X-ray scattering (WAXS) were all used to study the polymers’ structures, physical and thermal properties, and morphologies. The t-butylphenyl substituent on the phenothiazine core seems to provide more rigidity in the polymer’s backbone, resulting in higher Tg for series 3, while series 2, containing the 2-hexyldecyl-substituted polymers, had the lowest Tg, which is attributed to the large volume of the side chain, which limits interchain interactions. Consequently, series 2 had the lowest conductivity. However, the strongest effect on the conductivity was seen from the tetramethyl groups on the PPDA unit, which resulted in the lowest conductivity in each series due to torsional strain (twisting) in the polymer’s backbone. The WAXS data suggest mostly amorphous films; thus, the conductivity in these materials seems to be dominated by a multiscale charge transport phenomenon that occurs in amorphous conjugated materials. Our results will aid in the understanding of side chain engineering of PANI derivatives for their optimum performance. 
  4. Thiol-ene polymers are a promising class of biomaterials with a wide range of potential applications, including organs-on-a-chip, microfluidics, drug delivery, and wound healing. These polymers offer flexibility, softening, and shape memory properties. However, they often lack the inherent stretchability required for wearable or implantable devices. This study investigated the incorporation of di-acrylate chain extenders to improve the stretchability and conformability of these flexible thiol-ene polymers. Thiol-ene/acrylate polymers were synthesized using 1,3,5-triallyl-1,3,5-triazine-2,4,6(1H,3H,5H)-trione (TATATO), trimethylolpropane tris(3-mercaptopropionate) (TMTMP), and polyethylene glycol diacrylate (PEGDA) with different molecular weights (Mn 250 and Mn 575). Fourier transform infrared (FTIR) spectroscopy confirmed the complete reaction among the monomers. Uniaxial tensile testing demonstrated the softening and stretching capability of the polymers. The Young’s modulus dropped from 1.12 GPa to 260 MPa upon adding 5 wt% PEGDA 575, indicating that the polymer softened. The Young’s modulus was further reduced to 15 MPa under physiologic conditions. The fracture strain, a measure of stretchability, increased from 55% to 92% with the addition of 5 wt% PEGDA 575. A thermomechanical analysis further confirmed that PEGDA could be used to tune the polymer’s glass transition temperature (Tg). Moreover, our polymer exhibited shape memory properties. Our results suggested that thiol-ene/acrylate polymers are a promising new class of materials for biomedical applications requiring flexibility, stretchability, and shape memory properties. 
  5. Molecular search is important in chemistry, biology, and informatics for identifying molecular structures within large data sets, improving knowledge discovery and innovation, and making chemical data FAIR (findable, accessible, interoperable, reusable). Search algorithms for polymers are significantly less developed than those for small molecules because polymer search relies on searching by polymer name, which can be challenging because polymer naming is overly broad (e.g., polyethylene), complicated for complex chemical structures, and often does not correspond to official IUPAC conventions. Chemical structure search in polymers is limited to substructures, such as monomers, without awareness of connectivity or topology. This work introduces a novel query language and graph traversal search algorithm for polymers that provides the first search method able to fully capture all of the chemical structures present in polymers. The BigSMARTS query language, an extension of the small-molecule SMARTS language, allows users to write queries that localize monomer and functional group searches to different parts of the polymer, like the middle block of a triblock, the side chain of a graft, and the backbone of a repeat unit. The substructure search algorithm is based on the traversal of graph representations of the generating functions for the stochastic graphs of polymers. Operationally, the algorithm first identifies cycles representing the monomers and then the end groups and finally performs a depth-first search to match entire subgraphs. To validate the algorithm, hundreds of queries were searched against hundreds of target chemistries and topologies from the literature, with approximately 440,000 query–target pairs. This tool provides a detailed algorithm that can be implemented in search engines to provide search results with full matching of the monomer connectivity and polymer topology. 