
Title: A neural network protocol for electronic excitations of N-methylacetamide
UV absorption is widely used for characterizing protein structures. Mapping UV spectra to the atomic structure of proteins relies on expensive theoretical simulations. Circumventing this heavy computational cost, which arises from repeated quantum-mechanical simulations of the excited-state properties of many fluctuating protein geometries, has been a long-standing challenge. Here we show that a neural network machine-learning technique can predict electronic absorption spectra of N-methylacetamide (NMA), a widely used model system for the peptide bond. Using ground-state geometric parameters and charge information as descriptors, we employed a neural network to predict the transition energies, ground-state dipole moments, and transition dipole moments of many molecular-dynamics conformations at different temperatures, in agreement with time-dependent density-functional theory calculations. The neural network simulations are nearly 3,000× faster than comparable quantum calculations. Machine learning should provide a cost-effective tool for simulating optical properties of proteins.
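To illustrate how predicted transition energies and transition dipole moments for many conformations could be turned into an absorption spectrum, here is a minimal sketch. It is not the authors' protocol: the function name `absorption_spectrum`, the Gaussian line shape, and the default width `sigma_ev` are assumptions made for illustration. The only physics used is the standard dipole oscillator strength in atomic units, f = (2/3) ΔE |μ|².

```python
import numpy as np

HARTREE_EV = 27.211386  # eV per Hartree


def absorption_spectrum(energies_ev, dipoles_au, grid_ev, sigma_ev=0.1):
    """Ensemble-averaged, Gaussian-broadened absorption spectrum (sketch).

    energies_ev : (n_conf,) transition energies in eV, one per conformation
    dipoles_au  : (n_conf, 3) transition dipole vectors in atomic units
    grid_ev     : (n_grid,) photon energies in eV at which to evaluate
    sigma_ev    : Gaussian standard deviation in eV (assumed, not from the source)
    """
    energies_ev = np.asarray(energies_ev, dtype=float)
    dipoles_au = np.asarray(dipoles_au, dtype=float)
    grid_ev = np.asarray(grid_ev, dtype=float)

    # Oscillator strength per conformation: f = (2/3) * E[Hartree] * |mu|^2
    strengths = (2.0 / 3.0) * (energies_ev / HARTREE_EV) * np.sum(dipoles_au**2, axis=1)

    # Unit-area Gaussian centred on each transition energy
    diff = grid_ev[:, None] - energies_ev[None, :]
    lines = np.exp(-0.5 * (diff / sigma_ev) ** 2) / (sigma_ev * np.sqrt(2.0 * np.pi))

    # Weight each line by its oscillator strength and average over conformations
    return (lines * strengths[None, :]).mean(axis=1)


# Usage with made-up values (not data from the paper)
grid = np.linspace(4.0, 8.0, 401)
spectrum = absorption_spectrum([6.0], [[1.0, 0.0, 0.0]], grid)
```

In such a workflow, `energies_ev` and `dipoles_au` would come from the trained network rather than from a quantum-chemical calculation for every conformation, which is where the reported speed-up would apply.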
Journal Name:
Proceedings of the National Academy of Sciences
Sponsoring Org:
National Science Foundation
More Like this
  1. The energy of the lowest-lying triplet state (T1) relative to the ground and first-excited singlet states (S0, S1) plays a critical role in optical multiexcitonic processes of organic chromophores. For triplet–triplet annihilation (TTA) upconversion, the S0 to T1 energy gap, known as the triplet energy, is difficult to measure experimentally for most molecules of interest. Ab initio predictions can provide a useful alternative; however, low-scaling electronic structure methods such as the Kohn–Sham and time-dependent variants of density functional theory (DFT) rely heavily on the fraction of exact exchange chosen for a given functional and tend to be unreliable when strong electronic correlation is present. Here, we use auxiliary-field quantum Monte Carlo (AFQMC), a scalable electronic structure method capable of accurately describing even strongly correlated molecules, to predict the triplet energies for a series of candidate annihilators for TTA upconversion, including 9,10-substituted anthracenes and substituted benzothiadiazole (BTD) and benzoselenodiazole (BSeD) compounds. We compare our results to predictions from a number of commonly used DFT functionals, as well as DLPNO-CCSD(T0), a localized approximation to coupled cluster with singles, doubles, and perturbative triples. Together with S1 estimates from absorption/emission spectra, which are well reproduced by TD-DFT calculations employing the range-corrected hybrid functional CAM-B3LYP, we provide predictions regarding the thermodynamic feasibility of upconversion by requiring that (a) the measured T1 of the sensitizer exceeds the calculated T1 of the candidate annihilator, and (b) twice the T1 of the annihilator exceeds its S1 energy. We demonstrate a successful example of in silico discovery of a novel annihilator, phenyl-substituted BTD, and present experimental validation via low-temperature phosphorescence and the presence of upconverted blue light emission when coupled to a platinum octaethylporphyrin (PtOEP) sensitizer. The BTD framework thus represents a new class of annihilators for TTA upconversion. Its chemical functionalization, guided by the computational tools utilized herein, provides a promising route towards high-energy (violet to near-UV) emission.
  2. Photochemical reactions are widely used by academic and industrial researchers to construct complex molecular architectures via mechanisms that often require harsh reaction conditions. Photodynamics simulations provide time-resolved snapshots of molecular excited-state structures required to understand and predict reactivities and chemoselectivities. Molecular excited states are often nearly degenerate and require computationally intensive multiconfigurational quantum mechanical methods, especially at conical intersections. Non-adiabatic molecular dynamics require thousands of these computations per trajectory, which limits simulations to ∼1 picosecond for most organic photochemical reactions. Westermayr et al. recently introduced a neural-network-based method to accelerate the predictions of electronic properties and pushed the simulation limit to 1 ns for the model system, methylenimmonium cation (CH2NH2+). We have adapted this methodology to develop the Python-based Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics (PyRAI2MD) software for the cis–trans isomerization of trans-hexafluoro-2-butene and the 4π-electrocyclic ring-closing of a norbornyl cyclohexadiene. We performed a 10 ns simulation for trans-hexafluoro-2-butene in just 2 days. The same simulation would take approximately 58 years with traditional multiconfigurational photodynamics simulations. We generated training data by combining Wigner sampling, geometrical interpolations, and short-time quantum chemical trajectories to adaptively sample sparse data regions along reaction coordinates. The final data sets of the cis–trans isomerization and the 4π-electrocyclic ring-closing models have 6207 and 6267 data points, respectively. The training errors in energy using feedforward neural networks achieved chemical accuracy (0.023–0.032 eV). The neural network photodynamics simulations of trans-hexafluoro-2-butene agree with the quantum chemical calculations showing the formation of the cis-product and reactive carbene intermediate. The neural network trajectories of the norbornyl cyclohexadiene corroborate the low-yielding syn-product, which was absent in the quantum chemical trajectories, and reveal subsequent thermal reactions in 1 ns.
  3. A two-step route to strongly absorbing and efficiently orange to deep red fluorescent, doubly B/N-doped, ladder-type pyrrolo[3,2-b]pyrroles has been developed. We synthesize and study a series of derivatives of these four-coordinate boron-containing, nominally quadrupolar materials, which mostly exhibit one-photon absorption in the 500–600 nm range with the peak molar extinction coefficients reaching 150 000, and emission in the 520–670 nm range with the fluorescence quantum yields reaching 0.90. Within the family of these ultrastable dyes even small structural changes lead to significant variations of the photophysical properties, in some cases attributed to reversal of energy ordering of alternate-parity excited electronic states. Effective preservation of ground-state inversion symmetry was evidenced by very weak two-photon absorption (2PA) at excitation wavelengths corresponding to the lowest-energy, strongly one-photon allowed purely electronic transition. π-Expanded derivatives and those possessing electron-donating groups showed the most red-shifted absorption and emission spectra, while displaying remarkably high peak 2PA cross-section (σ2PA) values reaching ∼2400 GM at around 760 nm, corresponding to a two-photon allowed higher-energy excited state. At the same time, derivatives lacking π-expansion were found to have a relatively weak 2PA peak centered at ca. 800–900 nm with the maximum σ2PA ∼50–250 GM. Our findings are augmented by theoretical calculations performed using the TD-DFT method, which reproduce the main experimental trends, including the 2PA, in a nearly quantitative manner. Electrochemical studies revealed that the HOMO of the new dyes is located at ca. −5.35 eV, making them relatively electron rich in spite of the presence of two B−–N+ dative bonds. These dyes undergo a fully reversible first oxidation, located on the diphenylpyrrolo[3,2-b]pyrrole core, directly to the di(radical cation) stage.
  4. Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by both through-bond and through-space interactions with other atoms and functional groups. The in silico prediction of NMR chemical shifts using quantum mechanical (QM) calculations is now commonplace in aiding organic structural assignment, since spectra can be computed for several candidate structures and then compared with experimental values to find the best possible match. However, calculating multiple structural isomers and stereoisomers, each of which may typically exist as an ensemble of rapidly interconverting conformations, is computationally expensive. Additionally, the QM predictions themselves may lack sufficient accuracy to identify a correct structure. In this work, we address both of these shortcomings by developing a rapid machine learning (ML) protocol to predict 1H and 13C chemical shifts through an efficient graph neural network (GNN) using 3D structures as input. Transfer learning with experimental data is used to improve the final prediction accuracy of a model trained using QM calculations. When tested on the CHESHIRE dataset, the proposed model predicts observed 13C chemical shifts with comparable accuracy to the best-performing DFT functionals (1.5 ppm) in around 1/6000 of the CPU time. An automated prediction webserver and graphical interface are accessible online. We further demonstrate the model in three applications: first, we use the model to decide the correct organic structure from candidates through experimental spectra, including complex stereoisomers; second, we automatically detect and revise incorrect chemical shift assignments in a popular NMR database, the NMRShiftDB; and third, we use NMR chemical shifts as descriptors for determination of the sites of electrophilic aromatic substitution.
  5. Cyanobacteriochromes (CBCRs) are promising optogenetic tools for their diverse absorption properties with a single compact cofactor-binding domain. We previously uncovered the ultrafast reversible photoswitching dynamics of a red/green photoreceptor AnPixJg2, which binds phycocyanobilin (PCB) that is unavailable in mammalian cells. Biliverdin (BV) is a mammalian cofactor with a similar structure to PCB but exhibits redder absorption. To improve the AnPixJg2 feasibility in mammalian applications, AnPixJg2_BV4 with only four mutations has been engineered to incorporate BV. Herein, we implemented femtosecond transient absorption (fs-TA) and ground-state femtosecond stimulated Raman spectroscopy (GS-FSRS) to uncover transient electronic dynamics on molecular time scales and key structural motions responsible for the photoconversion of AnPixJg2_BV4 with PCB (Bpcb) and BV (Bbv) cofactors in comparison with the parent AnPixJg2 (Apcb). Bpcb adopts the same photoconversion scheme as Apcb, while BV4 mutations create a less bulky environment around the cofactor D ring that promotes a faster twist. The engineered Bbv employs a reversible clockwise/counterclockwise photoswitching that requires a two-step twist on ~5 and 35 picosecond (ps) time scales. The primary forward Pfr → Po transition displays equal amplitude weights between the two processes before reaching a conical intersection. In contrast, the primary reverse Po → Pfr transition shows a 2:1 weight ratio of the ~35 ps over 5 ps component, implying notable changes to the D-ring-twisting pathway. Moreover, we performed pre-resonance GS-FSRS and quantum calculations to identify the Bbv vibrational marker bands at ~659, 797, and 1225 cm−1. These modes reveal a stronger H-bonding network around the BV cofactor A ring with BV4 mutations, corroborating the D-ring-dominant reversible photoswitching pathway in the excited state. Implementation of BV4 mutations in other PCB-binding GAF domains like AnPixJg4, AM1_1870g3, and NpF2164g5 could promote similar efficient reversible photoswitching for more directional bioimaging and optogenetic applications, and inspire other bioengineering advances.
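The two thermodynamic feasibility conditions stated in item 1 (the sensitizer T1 must exceed the annihilator T1, and twice the annihilator T1 must exceed its S1) can be written as a simple check. This is only a sketch: the function name `tta_upconversion_feasible` is hypothetical, and the energies in the usage lines are illustrative placeholders, not values from that work.

```python
def tta_upconversion_feasible(t1_sensitizer_ev, t1_annihilator_ev, s1_annihilator_ev):
    """Return True if both energetic criteria for TTA upconversion hold.

    All energies are in eV, measured from the ground state S0.
    """
    # (a) Triplet energy transfer: sensitizer T1 lies above annihilator T1
    transfer_ok = t1_sensitizer_ev > t1_annihilator_ev
    # (b) Annihilation: two annihilator triplets carry more energy than one S1
    annihilation_ok = 2.0 * t1_annihilator_ev > s1_annihilator_ev
    return transfer_ok and annihilation_ok


# Illustrative placeholder energies (eV), not data from the cited study
print(tta_upconversion_feasible(1.9, 1.7, 3.1))  # both criteria satisfied
print(tta_upconversion_feasible(1.9, 1.5, 3.1))  # fails (b): 2 * 1.5 < 3.1
```

Both conditions are purely energetic, so passing this check says nothing about rates or yields.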