Title: Predicting glass transition temperature and melting point of organic compounds via machine learning and molecular embeddings
Gas-particle partitioning of secondary organic aerosols (SOA) is impacted by particle phase state and viscosity, which can be inferred from the glass transition temperature (Tg) of the constituent organic compounds. Several parameterizations have been developed to predict Tg of organic compounds from molecular properties and elemental composition, but they are subject to relatively large uncertainties because they do not account for molecular structure and functionality. Here we develop a new Tg prediction method powered by machine learning and “molecular embeddings”, which are unique numerical representations of chemical compounds that retain information on their structure, interatomic connectivity and functionality. We trained multiple state-of-the-art machine learning models on databases of experimental Tg of organic compounds and their corresponding molecular embeddings. The best prediction model is tgBoost, built with an Extreme Gradient Boosting (XGBoost) regressor trained via nested cross-validation; it reproduces experimental data very well, with a mean absolute error of 18.3 K. It can also quantify the influence of the number and location of functional groups on the Tg of organic molecules, while accounting for atom connectivity and predicting different Tg for compositional isomers. The tgBoost model suggests the following trend for sensitivity of Tg to functional group addition: –COOH (carboxylic acid) > –C(O)OR (ester) ≈ –OH (alcohol) > –C(O)R (ketone) ≈ –COR (ether) ≈ –C(O)H (aldehyde). We also developed a model to predict the melting point (Tm) of organic compounds by training a deep neural network on a large dataset of experimental Tm. The model performs reasonably well against the available dataset, with a mean absolute error of 31.0 K. These new machine-learning-powered models can be applied to field and laboratory measurements as well as atmospheric aerosol models to predict the Tg and Tm of SOA compounds for evaluation of the phase state and viscosity of SOA.
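The abstract states that the model was trained with nested cross-validation and scored by mean absolute error. The following is a minimal sketch of that evaluation scheme only, not the authors' code: the k-nearest-neighbour regressor is a stand-in for the XGBoost regressor, the two-component vectors are hypothetical placeholders for molecular embeddings, and every function name and the toy data are assumptions made for illustration.

```python
import random
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def k_folds(indices, k):
    """Split a list of indices into k roughly equal, disjoint folds."""
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k):
    """Stand-in regressor (assumption): mean target of the k nearest points."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return mean(train_y[i] for i in order[:k])


def evaluate(X, y, fit_idx, test_idx, k):
    """Fit on fit_idx, predict test_idx, return the MAE."""
    fit_X = [X[i] for i in fit_idx]
    fit_y = [y[i] for i in fit_idx]
    preds = [knn_predict(fit_X, fit_y, X[i], k) for i in test_idx]
    return mae([y[i] for i in test_idx], preds)


def nested_cv(X, y, param_grid, outer_k=5, inner_k=3, seed=0):
    """Outer loop estimates error; inner loop picks the hyperparameter."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    outer = k_folds(indices, outer_k)
    outer_scores = []
    for o, test_idx in enumerate(outer):
        train_idx = [i for j, fold in enumerate(outer) if j != o for i in fold]
        inner = k_folds(train_idx, inner_k)

        def inner_score(param):
            # Average validation MAE over the inner folds (test fold unseen).
            scores = []
            for v, val_idx in enumerate(inner):
                fit_idx = [i for j, f in enumerate(inner) if j != v for i in f]
                scores.append(evaluate(X, y, fit_idx, val_idx, param))
            return mean(scores)

        best = min(param_grid, key=inner_score)
        # Refit on the full outer-training set with the selected parameter.
        outer_scores.append(evaluate(X, y, train_idx, test_idx, best))
    return mean(outer_scores), outer_scores


def make_toy_data(n=60, seed=1):
    """Hypothetical 2-component 'embeddings' with a noisy linear Tg-like target."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [200 + 100 * a + 50 * b + rng.gauss(0, 5) for a, b in X]
    return X, y


X, y = make_toy_data()
score, per_fold = nested_cv(X, y, param_grid=[1, 3, 5])
print(f"nested-CV MAE: {score:.1f} K over {len(per_fold)} outer folds")
```

The point of the nesting is that the hyperparameter is chosen using only the outer-training data, so the outer-fold error is not biased by the selection step.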
Award ID(s):
1654104
PAR ID:
10328994
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Environmental Science: Atmospheres
Volume:
2
Issue:
3
ISSN:
2634-3606
Page Range / eLocation ID:
362 to 374
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Secondary organic aerosol (SOA) accounts for a large fraction of submicron particles in the atmosphere. SOA can occur in amorphous solid or semi-solid phase states depending on chemical composition, relative humidity (RH), and temperature. The phase transition between amorphous solid and semi-solid states occurs at the glass transition temperature (Tg). We have recently developed a method to estimate Tg of pure compounds containing carbon, hydrogen, and oxygen atoms (CHO compounds) with molar mass less than 450 g mol−1 based on their molar mass and atomic O : C ratio. In this study, we refine and extend this method for CH and CHO compounds with molar mass up to ∼ 1100 g mol−1 using the number of carbon, hydrogen, and oxygen atoms. We predict viscosity from the Tg-scaled Arrhenius plot of fragility (viscosity vs. Tg/T) as a function of the fragility parameter D. We compiled D values of organic compounds from the literature and found that D approaches a lower limit of ∼ 10 (±1.7) as the molar mass increases. We estimated the viscosity of α-pinene and isoprene SOA as a function of RH by accounting for the hygroscopic growth of SOA and applying the Gordon–Taylor mixing rule, reproducing previously published experimental measurements very well. Sensitivity studies were conducted to evaluate impacts of Tg, D, the hygroscopicity parameter (κ), and the Gordon–Taylor constant on viscosity predictions. The viscosity of toluene SOA was predicted using the elemental composition obtained by high-resolution mass spectrometry (HRMS), resulting in a good agreement with the measured viscosity. We also estimated the viscosity of biomass burning particles using the chemical composition measured by HRMS with two different ionization techniques: electrospray ionization (ESI) and atmospheric pressure photoionization (APPI). 
Due to differences in detected organic compounds and signal intensity, predicted viscosities at low RH based on ESI and APPI measurements differ by 2–5 orders of magnitude. Complementary measurements of viscosity and chemical composition are desired to further constrain RH-dependent viscosity in future studies. 
  2. Absolute secondary organic aerosol (SOA) mass loading (CSOA) is a key parameter in determining partitioning of semi- and intermediate volatility compounds to the particle phase. Its impact on the phase state of SOA, however, has remained largely unexplored. In this study, systematic laboratory chamber measurements were performed to elucidate the influence of CSOA, ranging from 0.2 to 160 µg m−3, on the phase state of SOA formed by ozonolysis of various precursors, including α-pinene, limonene, cis-3-hexenyl acetate (CHA) and cis-3-hexen-1-ol (HXL). A previously established method to estimate SOA bounce factor (BF, a surrogate for particle viscosity) was utilized to infer particle viscosity as a function of CSOA. Results show that under nominally identical conditions, the maximum BF decreases by approximately 30% at higher CSOA, suggesting a more liquid phase state. With the exception of HXL-SOA (which acted as the negative control), the phase state for all studied SOA precursors varied as a function of CSOA. Furthermore, the BF was found to be the maximum when SOA particle distributions reached a geometric mean particle diameter of 50–60 nm. Experimental results indicate that CSOA is an important parameter impacting the phase state of SOA, reinforcing recent findings that extrapolation of experiments not conducted at atmospherically relevant SOA levels may not yield results that are relevant to the natural environment. 
  3. Abstract. Secondary organic aerosols (SOA) can exist in liquid, semi-solid or amorphous solid states, which are rarely accounted for in current chemical transport models (CTMs). The lack of information on SOA phase state and viscosity in CTMs impedes accurate representation of SOA formation and evolution, affecting the predictions of aerosol effects on air quality and climate. We have previously developed a method to estimate the glass transition temperature (Tg) of an organic compound based on volatility. In this study, we apply this method to predict the phase state and viscosity of SOA particles over China in the summer of 2018 using the Weather Research and Forecasting model coupled to Chemistry (WRF-Chem). This is the first time that spatial distributions of the SOA phase state over China have been investigated with a regional CTM. Simulations show that Tg values of dry SOA range from ~287 K to 305 K, with higher values in northwestern China, where SOA particles have larger mass fractions of low-volatility compounds. When water uptake by SOA particles is considered, the SOA viscosity also shows a prominent geospatial gradient, with highly viscous or solid SOA particles found mainly in northwestern China. The lowest and highest SOA viscosity values both occur over the Qinghai-Tibet Plateau: the solid phase state is predicted over dry, high-altitude areas, and the liquid phase state is predicted mainly in the south of the plateau, where relative humidity is high during the summer monsoon season. The characteristic mixing timescale of organic molecules in 200 nm SOA particles is calculated based on the simulated particle viscosity and the bulk diffusion coefficient of organic molecules. Calculations show that during the simulated period the mixing timescale is longer than 1 h for > 70 % of the time at the surface and at 500 hPa in most areas of northern China, indicating that kinetic partitioning considering the bulk diffusion in viscous particles may be required for more accurate prediction of SOA mass concentrations and size distributions over these areas. Sensitivity simulations show that when the formation of extremely low-volatility organic compounds is included, the percent of time that an SOA particle is in the liquid phase state decreases by up to 12 % in southeastern China during the simulated period. Assuming that the organic and inorganic compounds are always internally mixed in one phase, we show that the water absorbed by inorganic species can significantly lower the simulated viscosity over southeastern China. This indicates that constraining the uncertainties in simulated SOA volatility distributions and accurately predicting the occurrence of phase separation would improve prediction of viscosity in multicomponent particles in southeastern China. 
  4. Molecular composition, viscosity, and liquid–liquid phase separation (LLPS) were investigated for secondary organic aerosol (SOA) derived from synthetic mixtures of volatile organic compounds (VOCs) representing emission profiles for Scots pine trees under healthy and aphid-herbivory stress conditions. Model “healthy plant SOA” and “stressed plant SOA” were generated in a 5 m³ environmental smog chamber by photooxidation of the mixtures at 50% relative humidity (RH). SOA from photooxidation of α-pinene was also prepared for comparison. Molecular composition was determined with high resolution mass spectrometry, viscosity was determined with the poke-flow technique, and liquid–liquid phase separation was investigated with optical microscopy. The stressed plant SOA had increased abundance of higher molecular weight species, reflecting a greater fraction of sesquiterpenes in the stressed VOC mixture compared to the healthy plant VOC mixture. LLPS occurred in both the healthy and stressed plant SOA; however, stressed plant SOA exhibited phase separation over a broader humidity range than healthy plant SOA, with LLPS persisting down to 23 ± 11% RH. At RH ≤25%, both stressed and healthy plant SOA viscosity exceeded 10⁸ Pa s, a value similar to that of tar pitch. At 40% and 50% RH, stressed plant SOA had the highest viscosity, followed by healthy plant SOA and then α-pinene SOA in descending order. The observed peak abundances in the mass spectra were also used to estimate the SOA viscosity as a function of RH and volatility. The predicted viscosity of the healthy plant SOA was lower than that of the stressed plant SOA driven by both the higher glass transition temperatures and lower hygroscopicity of the organic molecules making up stressed plant SOA. These findings suggest that plant stress influences the physicochemical properties of biogenic SOA. 
Furthermore, a complex mixture of VOCs resulted in a higher SOA viscosity compared to SOA generated from α-pinene alone at ≥25% RH, highlighting the importance of studying properties of SOA generated from more realistic multi-component VOC mixtures. 
  5. Abstract. Secondary organic aerosols (SOA) are major components of atmospheric fine particulate matter, affecting climate and air quality. Mounting evidence exists that SOA can adopt glassy and viscous semisolid states, impacting formation and partitioning of SOA. In this study, we apply the GECKO-A (Generator of Explicit Chemistry and Kinetics of Organics in the Atmosphere) model to conduct explicit chemical modeling of isoprene photooxidation and α-pinene ozonolysis and their subsequent SOA formation. The detailed gas-phase chemical schemes from GECKO-A are implemented into a box model and coupled to our recently developed glass transition temperature parameterizations, allowing us to predict SOA viscosity. The effects of chemical composition, relative humidity, mass loadings and mass accommodation on particle viscosity are investigated in comparison with measurements of SOA viscosity. The simulated viscosity of isoprene SOA agrees well with viscosity measurements as a function of relative humidity, while the model underestimates viscosity of α-pinene SOA by a few orders of magnitude. This difference may be due to missing processes in the model, including autoxidation and particle-phase reactions, leading to the formation of high-molar-mass compounds that would increase particle viscosity. Additional simulations imply that kinetic limitations of bulk diffusion and reduction in mass accommodation coefficient may play a role in enhancing particle viscosity by suppressing condensation of semi-volatile compounds. The developed model is a useful tool for analysis and investigation of the interplay among gas-phase reactions, particle chemical composition and SOA phase state. 
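Several of the abstracts above convert a glass transition temperature into a viscosity and then into a mixing timescale. The sketch below shows one common way such a chain can be written, under assumptions that are not stated in this record: a Vogel–Tammann–Fulcher form with a viscosity of 10^12 Pa s at Tg and a high-temperature limit of 10^-5 Pa s, a Gordon–Taylor mixture with water using an assumed water Tg of 136 K and an assumed Gordon–Taylor constant of 2.5, and a Stokes–Einstein bulk diffusion coefficient with an assumed effective molecular radius of 1e-10 m. All function names and default values are illustrative.

```python
import math

LOG10_ETA_INF = -5.0    # assumed log10 viscosity (Pa s) at infinite temperature
LOG10_ETA_GLASS = 12.0  # assumed log10 viscosity (Pa s) at T = Tg
K_B = 1.380649e-23      # Boltzmann constant, J/K


def gordon_taylor_tg(w_org, tg_org, tg_water=136.0, k_gt=2.5):
    """Tg (K) of an organic-water mixture; w_org is the organic mass fraction.

    tg_water and k_gt are assumed default values, not taken from this record.
    """
    num = (1.0 - w_org) * tg_water + (w_org / k_gt) * tg_org
    den = (1.0 - w_org) + (w_org / k_gt)
    return num / den


def vogel_temperature(tg, fragility_d):
    """Vogel temperature T0 (K) chosen so that viscosity is 10^12 Pa s at Tg."""
    c = (LOG10_ETA_GLASS - LOG10_ETA_INF) * math.log(10)  # about 39.1
    return c * tg / (fragility_d + c)


def log10_viscosity(temp, tg, fragility_d=10.0):
    """log10 of viscosity (Pa s) from the Vogel-Tammann-Fulcher form."""
    t0 = vogel_temperature(tg, fragility_d)
    if temp <= t0:
        raise ValueError("temperature must exceed the Vogel temperature")
    return LOG10_ETA_INF + t0 * fragility_d / ((temp - t0) * math.log(10))


def mixing_time_s(log10_eta, temp, diameter_m=200e-9, radius_m=1e-10):
    """Characteristic bulk mixing time (s) in a particle of given diameter.

    Uses Stokes-Einstein for the bulk diffusion coefficient (an assumption)
    and tau = d^2 / (4 pi^2 D_b).
    """
    eta = 10.0 ** log10_eta
    d_bulk = K_B * temp / (6.0 * math.pi * radius_m * eta)
    return diameter_m ** 2 / (4.0 * math.pi ** 2 * d_bulk)


# Usage: a hypothetical dry organic Tg of 300 K, mixed with 10 % water by mass.
tg_mix = gordon_taylor_tg(w_org=0.9, tg_org=300.0)
log_eta = log10_viscosity(temp=298.0, tg=tg_mix, fragility_d=10.0)
tau = mixing_time_s(log_eta, temp=298.0)
print(f"Tg(mix) = {tg_mix:.1f} K, log10(eta) = {log_eta:.2f}, tau = {tau:.3g} s")
```

Deriving the Vogel temperature from the two viscosity limits, rather than hard-coding a constant, keeps the assumption visible: changing either limit changes T0 consistently.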