Ultrafast two-dimensional infrared (2DIR) spectroscopy is a relatively new methodology that is now widely used to study molecular structure and the dynamics of molecular processes occurring in solution. Typically, in 2DIR spectroscopy the dynamics of a system are inferred from the evolution of 2DIR spectral features over waiting times. One of the most important metrics derived from 2DIR spectra is the frequency–frequency correlation function (FFCF), which can be extracted using different methods, including the center line slope and the nodal line slope. However, these methods struggle to correctly describe the dynamics in 2DIR spectra with multiple, overlapping transitions. Here, a new approach utilizing pseudo-Zernike moments is introduced to retrieve the FFCF dynamics of each spectral component from complex 2DIR spectra. The results show that this new method not only produces results equivalent to those of more established methodologies for simple spectra but also successfully extracts the FFCF dynamics of individual components from very congested and unresolved 2DIR spectra. In addition, this new methodology can be used to locate the individual frequency components in those complex spectra. Overall, a new methodology for analyzing 2D spectra is presented here, which allows previously unattainable spectral features to be retrieved from 2DIR spectra.
                            Unraveling dynamic protein structures by two-dimensional infrared spectra with a pretrained machine learning model
                        
                    
    
            Dynamic protein structures are crucial for deciphering their diverse biological functions. Two-dimensional infrared (2DIR) spectroscopy stands as an ideal tool for tracing rapid conformational evolutions in proteins. However, linking spectral characteristics to dynamic structures poses a formidable challenge. Here, we present a pretrained machine learning model based on 2DIR spectra analysis. This model has learned signal features from approximately 204,300 spectra to establish a “spectrum-structure” correlation, thereby tracing the dynamic conformations of proteins. It excels in accurately predicting the dynamic content changes of various secondary structures and demonstrates universal transferability on real folding trajectories spanning timescales from microseconds to milliseconds. Beyond exceptional predictive performance, the model offers attention-based spectral explanations of dynamic conformational changes. Our 2DIR-based pretrained model is anticipated to provide unique insights into the dynamic structural information of proteins in their native environments. 
- Award ID(s): 2246379
- PAR ID: 10614366
- Publisher / Repository: Proceedings of the National Academy of Sciences
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 121
- Issue: 27
- ISSN: 0027-8424
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Proteins perform their biological functions through motion. Although high-throughput prediction of the three-dimensional static structures of proteins has proved feasible using deep-learning-based methods, predicting conformational motions remains a challenge. Purely data-driven machine learning methods have difficulty addressing such motions because available laboratory data on conformational motions are still limited. In this work, we develop a method for generating protein allosteric motions by integrating physical energy landscape information into deep-learning-based methods. We show that local energetic frustration, which quantifies the local features of the energy landscape governing protein allosteric dynamics, can be utilized to empower AlphaFold2 (AF2) to predict protein conformational motions. Starting from ground state static structures, this integrative method generates alternative structures as well as pathways of protein conformational motions, using a progressive enhancement of the energetic frustration features in the input multiple sequence alignment sequences. For a model protein, adenylate kinase, we show that the generated conformational motions are consistent with available experimental and molecular dynamics simulation data. Applying the method to two other proteins, KaiB and ribose-binding protein, which involve large-amplitude conformational changes, also successfully generates the alternative conformations. We also show how to extract overall features of the AF2 energy landscape topography, which has been considered by many to be a black box. Incorporating physical knowledge into deep-learning-based structure prediction algorithms provides a useful strategy to address the challenges of dynamic structure prediction of allosteric proteins.
- 
            Two-dimensional infrared (2DIR) spectroscopy has become an established method for generating vibrational spectra in condensed phase samples composed of mixtures that yield heavily congested infrared and Raman spectra. These condensed phase 2DIR spectrometers can provide very high temporal resolution (<1 ps), but the spectral resolution is generally insufficient for resolving rotational peaks in gas phase spectra. Conventional (1D) rovibrational spectra of gas phase molecules are often plagued by severe spectral congestion, even when the sample is not a mixture. Spectral congestion can obscure the patterns in rovibrational spectra that are needed to assign peaks in the spectra. A method for generating high resolution 2DIR spectra of gas phase molecules has now been developed and tested using methane as the sample. The 2D rovibrational patterns that are recorded resemble an asterisk with a center position that provides the frequencies of both of the two coupled vibrational levels. The ability to generate easily recognizable 2D rovibrational patterns, regardless of temperature, should make the technique useful for a wide range of applications that are otherwise difficult or impossible when using conventional 1D rovibrational spectroscopy.
- 
            Chemically identical chlorophyll (Chl) molecules undergo conformational changes when they are embedded in a protein matrix. The conformational changes will modulate their absorption spectra to meet the need for programmed excitation energy transfer or electron transfer. To interpret spectroscopic data using the knowledge of pigment–protein interactions requires a single pigment embedded in one polypeptide matrix. Unfortunately, most of the known photosynthetic systems contain a set of multiple pigments in each protein subunit. This makes it complicated to interpret spectroscopic data using structural data due to the potential overlapping spectra of two or more pigments. Chl–protein interactions have not been systematically studied to answer three fundamental questions: (i) What are the structural characteristics and commonly shared substructures of different types of Chl molecules (e.g., Chl a, b, c, d, and f)? (ii) How many structural groups can Chl molecules be divided into and how are different structural groups influenced by their surrounding environments? (iii) What are the structural characteristics of pigment surrounding environments? Having no clear answers to the unresolved questions is probably due to a lack of computational methods for quantifying conformational changes in individual Chls and individual surrounding amino acids. The first version of the Triangular Spatial Relationship (TSR)-based method was developed for comparing protein 3D structures. The input data for the TSR-based method are experimentally determined 3D structures from the Protein Data Bank (PDB). In this study, we take advantage of the 3D structures of Chl-binding proteins deposited in the PDB and the TSR-based method to systematically investigate the 3D structures of various types of Chls and their protein environments. 
The key contributions of this study can be summarized as follows: (i) Specific structural characteristics of Chl d and f were identified and are defined using the TSR keys. (ii) Two and three clusters were found for various types of Chls and Chls a, respectively. The signature structures for distinguishing their corresponding two and three clusters were identified. (iii) Histidine residues were used as an example for revealing structural characteristics of Chl-binding sites. This study provides evidence for the three unresolved questions and builds a structural foundation through quantifying Chl conformations as well as structures of their embedded protein environments for future mechanistic understanding of relationships between Chl–protein interactions and their corresponding spectroscopic data.
- 
            Modeling protein–nucleic acid complexes with extremely large conformational changes using Flex‐LZerD
            Proteins and nucleic acids are key components in many processes in living cells, and interactions between proteins and nucleic acids are often crucial pathway components. In many cases, large flexibility of proteins as they interact with nucleic acids is key to their function. To understand the mechanisms of these processes, it is necessary to consider the 3D atomic structures of such protein–nucleic acid complexes. When such structures are not yet experimentally determined, protein docking can be used to computationally generate useful structure models. However, such docking has long had the limitation that the consideration of flexibility is usually limited to small movements or to small structures. We previously developed a method of flexible protein docking that can model ordered proteins undergoing large‐scale conformational changes, and we showed that it is compatible with nucleic acids. Here, we elaborate on the ability of that pipeline, Flex‐LZerD, to model interactions between proteins and nucleic acids specifically, and demonstrate that Flex‐LZerD can model more interactions and types of conformational change than previously shown.