Title: Protein target highlights in CASP15 : Analysis of models by structure providers
Abstract: We present an in‐depth analysis of selected CASP15 targets, focusing on their biological and functional significance. The authors of the structures identify and discuss key protein features and evaluate how effectively these aspects were captured in the submitted predictions. While the overall ability to predict three‐dimensional protein structures continues to impress, reproducing uncommon features not previously observed in experimental structures is still a challenge. Furthermore, instances with conformational flexibility and large multimeric complexes highlight the need for novel scoring strategies to better emphasize biologically relevant structural regions. Looking ahead, closer integration of computational and experimental techniques will play a key role in determining the next challenges to be unraveled in the field of structural molecular biology.
Award ID(s):
2025426
PAR ID:
10579053
Publisher / Repository:
Wiley
Date Published:
Journal Name:
Proteins: Structure, Function, and Bioinformatics
Volume:
91
Issue:
12
ISSN:
0887-3585
Page Range / eLocation ID:
1571 to 1599
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT Accurate prediction of protein–peptide complex structures plays a critical role in structure‐based drug design, including antibody design. Most peptide‐docking benchmark studies were conducted using crystal structures of protein–peptide complexes; as such, the performance of the current peptide docking tools in the practical setting is unknown. Here, the practical setting implies there are no crystal or other experimental structures for the complex, nor for the receptor and peptide. In this work, we have developed a practical docking protocol that incorporates two well‐known machine learning models, AlphaFold 2 for structural prediction and ANI‐2x for ab initio potential prediction, to achieve a high success rate in modeling protein–peptide complex structures. The docking protocol consists of three major stages. In the first stage, the 3D structure of the receptor is predicted by AlphaFold 2 using the monomer mode, and that of the peptide is predicted by AlphaFold 2 using the multimer mode. We found that it is essential to include the receptor information to generate a high‐quality 3D structure of the peptide. In the second stage, rigid protein–peptide docking is performed using ZDOCK software. In the last stage, the top 10 docking poses are relaxed and refined by ANI‐2x in conjunction with our in‐house geometry optimization algorithm, conjugate gradient with backtracking line search (CG‐BS). CG‐BS was developed by us to perform geometry optimization more efficiently; it takes the potential and force directly from ANI‐2x machine learning models. The docking protocol achieved a very encouraging performance for a set of 62 very challenging protein–peptide systems, with an overall success rate of 34% when only the top 1 docking pose was considered. This success rate increased to 45% when the top 3 docking poses were considered. It is emphasized that this encouraging protein–peptide docking performance was achieved without using any crystal or experimental structures.
  2. ABSTRACT Homology‐based protein domain classification is a powerful tool for gaining biological insights into protein function. This classification process has been significantly enhanced by the availability of experimental structures and high‐accuracy structural models generated by advanced tools such as AlphaFold. Our Evolutionary Classification of protein Domains (ECOD) database provides a continuously updated and refined domain classification system. Isolated (“orphan”) protein domain families, which have a limited distribution in the protein universe, present a unique challenge in this classification process. These families lack clear or identifiable evolutionary relationships with other sequence families. While some isolated domain families may have emerged through de novo evolution, others potentially share common evolutionary origins with existing domain families but represent difficult cases for traditional classification methods. In this study, we conducted a manual analysis of a set of isolated families of small domains in ECOD. By exploring sequence, structural, and functional evidence, we uncovered distant members and likely homologous relationships between different isolated domain families that were previously unrecognized. Our analysis provides valuable insights into the evolution of isolated domain families and has led to improved classification within ECOD. This work enhances our understanding of protein evolution and underscores the importance of continuous refinement in domain classification systems as new data and analytical methods become available. 
  3. Cowen, Lenore (Ed.)
    Motivation: Quality assessment (QA) of predicted protein tertiary structure models plays an important role in ranking and using them. With the recent development of deep learning end-to-end protein structure prediction techniques for generating highly confident tertiary structures for most proteins, it is important to explore corresponding QA strategies to evaluate and select the structural models predicted by them since these models have better quality and different properties than the models predicted by traditional tertiary structure prediction methods. Results: We develop EnQA, a novel graph-based 3D-equivariant neural network method that is equivariant to rotation and translation of 3D objects to estimate the accuracy of protein structural models by leveraging the structural features acquired from the state-of-the-art tertiary structure prediction method, AlphaFold2. We train and test the method on both traditional model datasets (e.g. the datasets of the Critical Assessment of Techniques for Protein Structure Prediction) and a new dataset of high-quality structural models predicted only by AlphaFold2 for the proteins whose experimental structures were released recently. Our approach achieves state-of-the-art performance on protein structural models predicted by both traditional protein structure prediction methods and the latest end-to-end deep learning method, AlphaFold2. It performs even better than the model QA scores provided by AlphaFold2 itself. The results illustrate that the 3D-equivariant graph neural network is a promising approach to the evaluation of protein structural models. Integrating AlphaFold2 features with other complementary sequence and structural features is important for improving protein model QA. Availability and implementation: The source code is available at https://github.com/BioinfoMachineLearning/EnQA. Supplementary information: Supplementary data are available at Bioinformatics online.
  4. Abstract The omicron variant of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2), characterized by 30 mutations in its spike protein, has rapidly spread worldwide since November 2021, significantly exacerbating the ongoing COVID‐19 pandemic. In order to investigate the relationship between these mutations and the variant's high transmissibility, we conducted a systematic analysis of the mutational effect on spike–angiotensin‐converting enzyme‐2 (ACE2) interactions and explored the structural/energy correlation of key mutations, utilizing a reliable coarse‐grained model. Our study extended beyond the receptor‐binding domain (RBD) of spike trimer through comprehensive modeling of the full‐length spike trimer rather than just the RBD. Our free‐energy calculation revealed that the enhanced binding affinity between the spike protein and the ACE2 receptor is correlated with the increased structural stability of the isolated spike protein, thus explaining the omicron variant's heightened transmissibility. The conclusion was supported by our experimental analyses involving the expression and purification of the full‐length spike trimer. Furthermore, the energy decomposition analysis established that electrostatic interactions make major contributions to this effect. We categorized the mutations into four groups and established an analytical framework that can be employed in studying future mutations. Additionally, our calculations rationalized the reduced affinity of the omicron variant towards most available therapeutic neutralizing antibodies, when compared with the wild type. By providing concrete experimental data and offering a solid explanation, this study contributes to a better understanding of the relationship between theories and observations and lays the foundation for future investigations.
  5. Abstract Chaperones are a large family of proteins crucial for maintaining cellular protein homeostasis. One such chaperone is the 70 kDa heat shock protein (Hsp70), which plays a crucial role in protein (re)folding, stability, functionality, and translocation. While the key events in the Hsp70 chaperone cycle are well established, a relatively small number of distinct substrates were repetitively investigated. This is despite Hsp70 engaging with a plethora of cellular proteins of various structural properties and folding pathways. Here we analyzed novel Hsp70 substrates, based on tandem repeats of NanoLuc (Nluc), a small and highly bioluminescent protein with unique structural characteristics. In previous mechanical unfolding and refolding studies, we have identified interesting misfolding propensities of these Nluc‐based tandem repeats. In this study, we further investigate these properties through in vitro bulk experiments. Similar to monomeric Nluc, engineered Nluc dyads and triads proved to be highly bioluminescent. Using the bioluminescence signal as the proxy for their structural integrity, we determined that heat‐denatured Nluc dyads and triads can be efficiently refolded by the E. coli Hsp70 chaperone system, which comprises DnaK, DnaJ, and GrpE. In contrast to previous studies with other substrates, we observed that Nluc repeats can be efficiently refolded by DnaK and DnaJ, even in the absence of GrpE co‐chaperone. Taken together, our study offers a new powerful substrate for chaperone research and raises intriguing questions about the Hsp70 mechanisms, particularly in the context of structurally diverse proteins.