Title: Rosetta custom score functions accurately predict ΔΔG of mutations at protein–protein interfaces using machine learning
Protein–protein interfaces play essential roles in a variety of biological processes and many therapeutic molecules are targeted at these interfaces. However, accurate predictions of the effects of interfacial mutations to identify “hotspots” have remained elusive despite the myriad of modeling and machine learning methods tested. Here, for the first time, we demonstrate that nonlinear reweighting of energy terms from Rosetta, through the use of machine learning, exhibits improved predictability of ΔΔG values associated with interfacial mutations.
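To make the contrast in the abstract concrete, the following is a minimal sketch of the difference between a conventional fixed-weight sum of energy-term differences and a nonlinear reweighting of the same terms. It is not the authors' model: the choice of terms (`fa_atr`, `fa_rep`, `hbond_sc`), all numeric values, the function names, and the one-hidden-layer form are assumptions made for illustration, and in practice the weights would be learned from experimental ΔΔG data rather than set by hand.

```python
import math

# Hypothetical per-term energy differences (mutant minus wild type).
# The selection of terms and the numbers are illustrative assumptions.
delta_terms = {"fa_atr": -0.8, "fa_rep": 1.5, "hbond_sc": 0.4}


def linear_ddg(delta, weights):
    """Fixed-weight score: a weighted sum of energy-term differences."""
    return sum(weights[name] * value for name, value in delta.items())


def nonlinear_ddg(delta, hidden, output):
    """One-hidden-layer sketch of a nonlinear reweighting (assumed form).

    hidden: list of (weights, bias) pairs, one per hidden unit; the weights
            follow the alphabetical order of the term names.
    output: (weights, bias) applied to the tanh activations.
    """
    names = sorted(delta)
    activations = [
        math.tanh(bias + sum(w * delta[n] for w, n in zip(ws, names)))
        for ws, bias in hidden
    ]
    out_weights, out_bias = output
    return out_bias + sum(w * a for w, a in zip(out_weights, activations))


# Hand-set, hypothetical parameters; a real model would fit these to data.
weights = {"fa_atr": 1.0, "fa_rep": 0.5, "hbond_sc": 1.0}
hidden = [([1.0, 0.5, 1.0], 0.0), ([-0.5, 1.0, 0.0], 0.1)]
output = ([2.0, -1.0], 0.0)

print(linear_ddg(delta_terms, weights))
print(nonlinear_ddg(delta_terms, hidden, output))
```

In the linear form each term contributes independently of the others, whereas in the nonlinear form the contribution of one term can depend on the values of the rest, which is the kind of flexibility a machine-learned reweighting adds.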
Award ID(s):
1708759
PAR ID:
10172056
Journal Name:
Chemical Communications
Volume:
56
Issue:
50
ISSN:
1359-7345
Page Range / eLocation ID:
6774 to 6777
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Protein structures at solid/liquid interfaces mediate interfacial protein functions, which are important for many applications. It is difficult to probe interfacial protein structures at buried solid/liquid interfaces in situ at the molecular level. Here, a systematic methodology to determine protein molecular structures (orientation and conformation) at buried solid/liquid interfaces in situ was successfully developed with a combined approach using a nonlinear optical spectroscopic technique – sum frequency generation (SFG) vibrational spectroscopy, isotope labeling, spectra calculation, and computer simulation. With this approach, molecular structures of protein GB1 and its mutant (with two amino acids mutated) were investigated at the polymer/solution interface. Markedly different orientations and similar (but not identical) conformations of the wild-type protein GB1 and its mutant at the interface were detected, due to the varied molecular interfacial interactions. This systematic strategy is general and can be widely used to elucidate protein structures at buried interfaces in situ.
  2. Proteins play a central role in biology from immune recognition to brain activity. While major advances in machine learning have improved our ability to predict protein structure from sequence, determining protein function from its sequence or structure remains a major challenge. Here, we introduce holographic convolutional neural network (H-CNN) for proteins, which is a physically motivated machine learning approach to model amino acid preferences in protein structures. H-CNN reflects physical interactions in a protein structure and recapitulates the functional information stored in evolutionary data. H-CNN accurately predicts the impact of mutations on protein stability and binding of protein complexes. Our interpretable computational model for protein structure–function maps could guide design of novel proteins with desired function. 
  3. A comprehensive understanding of the interfacial behaviors of biomolecules holds great significance in the development of biomaterials and biosensing technologies. In this work, we used discontinuous molecular dynamics (DMD) simulations and graphic contrastive learning analysis to study the adsorption of ubiquitin protein on a graphene surface. Our high-throughput DMD simulations can explore the whole protein adsorption process including the protein structural evolution with sufficient accuracy. Contrastive learning was employed to train a protein contact map feature extractor aiming at generating contact map feature vectors. Subsequently, these features were grouped using the k-means clustering algorithm to identify the protein structural transition stages throughout the adsorption process. The machine learning analysis can illustrate the dynamics of protein structural changes, including the pathway and the rate-limiting step. Our study indicated that the protein–graphene surface hydrophobic interactions and the π–π stacking were crucial to the seven-stage adsorption process. Upon adsorption, the secondary structure and tertiary structure of ubiquitin disintegrated. The unfolding stages obtained by the contrastive learning-based algorithm were not only consistent with the detailed analyses of protein structures but also provided more hidden information about the transition states and pathway of the protein adsorption process and structural dynamics. Our combination of efficient DMD simulations and machine learning analysis could be a valuable approach to studying the interfacial behaviors of biomolecules.
  4. Structures of proteins and protein–protein complexes are determined by the same physical principles and thus share a number of similarities. At the same time, there could be differences because in order to function, proteins interact with other molecules, undergo conformational changes, and so forth, which might impose different restraints on the tertiary versus quaternary structures. This study focuses on structural properties of protein–protein interfaces in comparison with the protein core, based on the wealth of currently available structural data and new structure‐based approaches. The results showed that physicochemical characteristics, such as amino acid composition, residue–residue contact preferences, and hydrophilicity/hydrophobicity distributions, are similar in protein core and protein–protein interfaces. On the other hand, characteristics that reflect the evolutionary pressure, such as structural composition and packing, are largely different. The results provide important insight into fundamental properties of protein structure and function. At the same time, the results contribute to better understanding of the ways to dock proteins. Recent progress in predicting structures of individual proteins follows the advancement of deep learning techniques and new approaches to residue coevolution data. Protein core could potentially provide large amounts of data for applying deep learning to docking. However, our results showed that the core motifs are significantly different from those at protein–protein interfaces, and thus may not be directly useful for docking. At the same time, this difference may help to overcome a major obstacle in application of the coevolutionary data to docking: discrimination of the intramolecular information not directly relevant to docking.
  5. Scaffold proteins play crucial roles in subcellular organization and function. In many organisms, proteins with multiple Tudor domains are required for the assembly of membraneless RNA–protein organelles (germ granules) in germ cells. Tudor domains are protein–protein interaction modules which bind to methylated polypeptides. Drosophila Tudor protein contains 11 Tudor domains, which is the highest number known in a single protein. The role of each of these domains in germ cell formation has not been systematically tested, and it is not clear if some domains are functionally redundant. Using CRISPR methodology, we generated mutations in several uncharacterized Tudor domains and showed that they all caused defects in germ cell formation. Mutations in individual domains affected Tudor protein differently, causing reduction in protein levels and defects in subcellular localization and in the assembly of germ granules. Our data suggest that multiple domains of Tudor protein are all needed for efficient germ cell formation, highlighting the rationale for keeping many Tudor domains in protein scaffolds of biomolecular condensates in Drosophila and other organisms.