Title: Engineering gain‐of‐function mutants of a WW domain by dynamics and structural analysis
Proteins gain optimal fitness such as foldability and function through evolutionary selection. However, classical studies have found that evolutionarily designed protein sequences alone cannot guarantee foldability, or at least not without considering local contacts associated with the initial folding steps. We previously showed that foldability and function can be restored by removing frustration in the folding energy landscape of a model WW domain protein, CC16, which was designed based on Statistical Coupling Analysis (SCA). Substitutions ensuring the formation of five local contacts identified as “on‐path” were selected using the closest homolog native folded sequence, N21. Surprisingly, the resulting sequence, CC16‐N21, bound to Group I peptides, while N21 did not. Here, we identified single‐point mutations that enable N21 to bind a Group I peptide ligand through structure and dynamic‐based computational design. Comparison of the docked position of the CC16‐N21/ligand complex with the N21 structure showed that residues at positions 9 and 19 are important for peptide binding, whereas the dynamic profiles identified position 10 as allosterically coupled to the binding site and exhibiting different dynamics between N21 and CC16‐N21. We found that swapping these positions in N21 with matched residues from CC16‐N21 recovers nature‐like binding affinity to N21. This study validates the use of dynamic profiles as guiding principles for affecting the binding affinity of small proteins.
Award ID(s):
1901709
PAR ID:
10483563
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
Wiley
Date Published:
Journal Name:
Protein Science
Volume:
32
Issue:
9
ISSN:
0961-8368
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Predicting the structure of ligands bound to proteins is a foundational problem in modern biotechnology and drug discovery, yet little is known about how to combine the predictions of protein‐ligand structure (poses) produced by the latest deep learning methods to identify the best poses and how to accurately estimate the binding affinity between a protein target and a list of ligand candidates. Further, a blind benchmarking and assessment of protein‐ligand structure and binding affinity prediction is necessary to ensure it generalizes well to new settings. Towards this end, we introduce MULTICOM_ligand, a deep learning‐based protein‐ligand structure and binding affinity prediction ensemble featuring structural consensus ranking for unsupervised pose ranking and a new deep generative flow matching model for joint structure and binding affinity prediction. Notably, MULTICOM_ligand ranked among the top‐5 ligand prediction methods in both protein‐ligand structure prediction and binding affinity prediction in the 16th Critical Assessment of Techniques for Structure Prediction (CASP16), demonstrating its efficacy and utility for real‐world drug discovery efforts. The source code for MULTICOM_ligand is freely available on GitHub.
  2. Chaperones are essential to the co-translational folding of most proteins. However, the principles of co-translational chaperone interaction throughout the proteome are poorly understood, as current methods are restricted to few substrates and cannot capture nascent protein folding or chaperone binding sites, precluding a comprehensive understanding of productive and erroneous protein biosynthesis. Here, by integrating genome-wide selective ribosome profiling, single-molecule tools, and computational predictions using AlphaFold, we show that the binding of the main E. coli chaperones involved in co-translational folding, Trigger Factor (TF) and DnaK, correlates with “unsatisfied residues” exposed on nascent partial folds: residues that have begun to form tertiary structure but cannot yet form all native contacts due to ongoing translation. This general principle allows us to predict their co-translational binding across the proteome based on sequence only, which we verify experimentally. The results show that TF and DnaK stably bind partially folded rather than unfolded conformers. They also indicate a synergistic action of TF guiding intra-domain folding and DnaK preventing premature inter-domain contacts, and reveal robustness in the larger chaperone network (TF, DnaK, GroEL). Given the complexity of translation, folding, and chaperone functions, our predictions based on general chaperone binding rules indicate an unexpected underlying simplicity.
  3. Protein–DNA interactions play an important role in various biological processes such as gene expression, replication, and transcription. Understanding the features that dictate the binding affinity of protein–DNA complexes, and predicting those affinities, is important for elucidating their recognition mechanisms. In this work, we have collected the experimental binding free energy (ΔG) for a set of 391 protein–DNA complexes and derived several structure-based features such as interaction energy, contact potentials, volume and surface area of binding site residues, base step parameters of the DNA, and contacts between different types of atoms. Our analysis of the relationship between binding affinity and structural features revealed that the important factors mainly depend on the number of DNA strands as well as the functional and structural classes of the proteins. Specifically, binding site properties such as the number of atom contacts between the DNA and protein and the volume of protein binding sites, and interaction-based features such as interaction energies and contact potentials, are important for understanding the binding affinity. Further, we developed multiple regression equations for predicting the binding affinity of protein–DNA complexes belonging to different structural and functional classes. Our method showed an average correlation and mean absolute error of 0.78 and 0.98 kcal/mol, respectively, between the experimental and predicted binding affinities on a jack-knife test. We have developed a webserver, PDA-PreD (Protein-DNA Binding affinity predictor), for predicting the affinity of protein–DNA complexes; it is freely available at https://web.iitm.ac.in/bioinfo2/pdapred/
  4. Middle East Respiratory Syndrome Coronavirus (MERS-CoV) causes severe pneumonia-like symptoms and still poses a significant threat to global public health. A key component in the virulence of MERS-CoV is the Spike (S) protein, which binds to the host membrane receptor dipeptidyl peptidase 4 (DPP4). The goal of the present investigation is to examine the effects of missense mutations in the MERS-CoV S protein on protein stability and binding affinity with DPP4, to provide insight that is useful in developing vaccines to prevent coronavirus infection. We utilized a saturation mutagenesis approach to simulate all possible mutations in the MERS-CoV full-length S, S Receptor Binding Domain (RBD), and DPP4. We found that mutations in MERS-CoV S protein residues G552, C503, C526, N468, G570, S532, S451, S419, S465, and S435 affect protein stability. We identified key residues in the S protein RBD region, G538, E513, V555, S557, L506, L507, R511, M452, D537, and S454, that are important in the binding of MERS-CoV S protein to the DPP4 receptor. We investigated the effects of MERS-CoV S protein viral mutations on protein stability and binding affinity. In addition, we studied all DPP4 mutations and found that the functional substitution R336T weakens both DPP4 protein stability and S-DPP4 binding affinity. We compared the S protein structures of MERS-CoV, SARS-CoV, and SARS-CoV-2 viruses and found that residues such as C526, C383, and N468, located in equivalent positions in these viruses, affect S protein structure. These findings provide further information on how mutations in coronavirus S proteins affect protein function.
  5. Designing protein-binding proteins is critical for drug discovery. However, artificial-intelligence-based design of such proteins is challenging due to the complexity of protein–ligand interactions, the flexibility of ligand molecules and amino acid side chains, and sequence–structure dependencies. We introduce PocketGen, a deep generative model that produces residue sequence and atomic structure of the protein regions in which ligand interactions occur. PocketGen promotes consistency between protein sequence and structure by using a graph transformer for structural encoding and a sequence refinement module based on a protein language model. The graph transformer captures interactions at multiple scales, including atom, residue and ligand levels. For sequence refinement, PocketGen integrates a structural adapter into the protein language model, ensuring that structure-based predictions align with sequence-based predictions. PocketGen can generate high-fidelity protein pockets with enhanced binding affinity and structural validity. It operates ten times faster than physics-based methods and achieves a 97% success rate, defined as the percentage of generated pockets with higher binding affinity than reference pockets. Additionally, it attains an amino acid recovery rate exceeding 63%.
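
Item 3 above reports a correlation and a mean absolute error between experimental and predicted binding affinities "on a jack-knife test". As a rough illustration of what such an evaluation involves, the sketch below runs a leave-one-out (jack-knife) test in plain Python. It is not the authors' method: it assumes a single hypothetical feature rather than the multiple regression described in the abstract, and the numbers are made-up toy values, not data from the study. All function names are illustrative.

```python
from math import sqrt
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with one feature (an assumption;
    the abstract describes multiple regression over several features)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def jackknife_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a * xs[i] + b)
    return preds


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den


def mean_absolute_error(u, v):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(a - b) for a, b in zip(u, v))


if __name__ == "__main__":
    # Toy values only: a hypothetical structural feature and hypothetical
    # binding free energies (kcal/mol). Not taken from the study.
    feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    delta_g = [-7.9, -8.4, -8.6, -9.5, -9.7, -10.4, -10.6, -11.3]
    predicted = jackknife_predictions(feature, delta_g)
    print("correlation:", round(pearson(delta_g, predicted), 3))
    print("MAE (kcal/mol):", round(mean_absolute_error(delta_g, predicted), 3))
```

Because each prediction comes from a model that never saw the held-out sample, the resulting correlation and error are less optimistic than values computed on the training data itself.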