Title: Side-chain Packing Using SE(3)-Transformer
Predicting protein side-chain conformations is important for both protein structure prediction and protein design. Side-chain modeling tools such as SCWRL4 have become among the most widely used of their type because their predictions are fast and highly accurate. Motivated by the recent success of AlphaFold2 in CASP14, our group adapted a 3D equivariant neural network architecture to predict protein side-chain conformations, specifically within a protein-protein interface, a problem that has not been fully addressed by AlphaFold2.
Award ID(s):
1759472
NSF-PAR ID:
10379954
Author(s) / Creator(s):
Date Published:
Journal Name:
Pacific symposium on biocomputing
Volume:
27
ISSN:
2335-6928
Page Range / eLocation ID:
46-55
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ResNet and, more recently, AlphaFold2 have demonstrated that deep neural networks can now predict the tertiary structure of a protein from its amino-acid sequence with high accuracy. This seminal development will allow molecular biology researchers to advance various studies linking sequence, structure, and function. Many studies will undoubtedly focus on the impact of sequence mutations on stability, fold, and function. In this paper, we evaluate the ability of AlphaFold2 to predict accurate tertiary structures of wildtype and mutated sequences of protein molecules. We do so on a benchmark dataset used in mutation modeling studies. Our empirical evaluation utilizes global and local structure analyses and yields several interesting observations. It shows, for instance, that AlphaFold2 performs similarly on wildtype and variant sequences. The placement of the main chain of a protein molecule is highly accurate. However, while AlphaFold2 reports similar confidence in its predictions over wildtype and variant sequences, its placement of side chains is less accurate than its main-chain predictions. Overall, the analysis supports the premise that AlphaFold2-predicted structures can be utilized in further downstream tasks, but that further refinement of these structures may be necessary.

     
  2. Abstract

    AlphaFold2 has revolutionized protein structure prediction from amino‐acid sequence. In addition to protein structures, high‐resolution dynamics information about various protein regions is important for understanding protein function. Although AlphaFold2 has neither been designed nor trained to predict protein dynamics, it is shown here how the information returned by AlphaFold2 can be used to predict dynamic protein regions at the individual residue level. The approach, which is termed cdsAF2, uses the 3D protein structure returned by AlphaFold2 to predict backbone NMR N-H S² order parameters using a local contact model that takes into account the contacts made by each peptide plane along the backbone with its environment. By combining, for each residue, AlphaFold2's pLDDT confidence score for the structure prediction accuracy with the S² value predicted by the local contact model, an estimator is obtained that semi‐quantitatively captures many of the dynamics features observed in experimental backbone NMR N-H S² order parameter profiles. The method is demonstrated for a set of nine proteins of different sizes and variable amounts of dynamics and disorder.

     
  3. Abstract

    Short hydrogen bonds (SHBs), whose donor and acceptor heteroatoms lie within 2.7 Å, exhibit prominent quantum mechanical character and are connected to a wide range of essential biomolecular processes. However, exact determination of the geometry and functional roles of SHBs requires a protein structure at atomic resolution. In this work, we analyze 1260 high-resolution peptide and protein structures from the Protein Data Bank and develop a boosting-based machine learning model to predict the formation of SHBs between amino acids. This model, which we name machine learning assisted prediction of short hydrogen bonds (MAPSHB), takes into account 21 structural, chemical and sequence features and their interaction effects and effectively categorizes each hydrogen bond in a protein as a short or normal hydrogen bond. The MAPSHB model reveals that the type of the donor amino acid plays a major role in determining the class of a hydrogen bond and that the side chain Tyr-Asp pair demonstrates a significant probability of forming an SHB. Combining electronic structure calculations and energy decomposition analysis, we elucidate how the interplay of competing intermolecular interactions stabilizes the Tyr-Asp SHBs more than other commonly observed combinations of amino acid side chains. The MAPSHB model, which is freely available on our web server, allows one to accurately and efficiently predict the presence of SHBs given a protein structure with moderate or low resolution and will facilitate the experimental and computational refinement of protein structures.

     
  4. Abstract

    We present the structure of an engineered protein–protein interface between two beta barrel proteins, which is mediated by interactions between threonine (Thr) residues. This Thr zipper structure suggests that the protein interface is stabilized by close‐packing of the Thr residues, with only one intermonomer hydrogen bond (H‐bond) between two of the Thr residues. This Thr‐rich interface provides a unique opportunity to study the behavior of Thr in the context of many other Thr residues. In previous work, we have shown that the side chain (χ1) dihedral angles of interface and core Thr residues can be predicted with high accuracy using a hard sphere plus stereochemical constraint (HS) model. Here, we demonstrate that in the Thr‐rich local environment of the Thr zipper structure, we are able to predict the χ1 dihedral angles of most of the Thr residues. Some, however, are not well predicted by the HS model. We therefore employed explicitly solvated molecular dynamics (MD) simulations to further investigate the side chain conformations of these residues. The MD simulations illustrate the role that transient H‐bonding to water, in combination with steric constraints, plays in determining the behavior of these Thr side chains.

    Broader Audience Statement: Protein–protein interactions are critical to life and the search for ways to disrupt adverse protein–protein interactions involved in disease is an ongoing area of drug discovery. We must better understand protein–protein interfaces, both to be able to disrupt existing ones and to engineer new ones for a variety of biotechnological applications. We have discovered and characterized an artificial Thr‐rich protein–protein interface. This novel interface demonstrates a heretofore unknown property of Thr‐rich surfaces: mediating protein–protein interactions.

     
  5. Abstract

    In the ligand prediction category of CASP15, the challenge was to predict the positions and conformations of small molecules binding to proteins that were provided as amino acid sequences or as models generated by the AlphaFold2 program. For most targets, we used our template‐based ligand docking program ClusPro ligTBM, also implemented as a public server available at https://ligtbm.cluspro.org/. Since many targets had multiple chains and a number of ligands, several templates and some manual interventions were required. In a few cases, no templates were found, and we had to use direct docking using the Glide program. Nevertheless, ligTBM was shown to be a very useful tool, and by any ranking criteria, our group was ranked among the top five best‐performing teams. In fact, all the best groups used template‐based docking methods. Thus, it appears that the AlphaFold2‐generated models, despite the high accuracy of the predicted backbone, have local differences from the x‐ray structure that make the use of direct docking methods more challenging. The results of CASP15 confirm that this limitation can be frequently overcome by homology‐based docking.

     