

Title: ComparePD: Improving protein–DNA complex model comparison with hydrogen bond energy‐based metrics
Abstract

Computational modeling of protein–DNA complex structures has important implications for biomedical applications such as structure‐based, computer‐aided drug design. A key step in developing methods for accurate modeling of protein–DNA complexes is similarity assessment between models and their reference complex structures. Existing methods primarily rely on distance‐based metrics and generally do not consider important functional features of the complexes, such as interface hydrogen bonds that are critical to specific protein–DNA interactions. Here, we present a new scoring function, ComparePD, which takes interface hydrogen bond energy and strength into account in addition to the distance‐based metrics to measure the similarity of protein–DNA complexes accurately. ComparePD was tested on two datasets of computational models of protein–DNA complexes generated using docking (classified as easy, intermediate, and difficult cases) and homology modeling methods. The results were compared with PDDockQ, a modified version of DockQ tailored for protein–DNA complexes, as well as the metrics employed by the community‐wide experiment CAPRI (Critical Assessment of PRedicted Interactions). We demonstrated that ComparePD provides an improved similarity measure over PDDockQ and the CAPRI classification method by considering both conformational similarity and functional importance of the complex interface. In all cases where ComparePD and PDDockQ selected different top models, ComparePD identified the more meaningful model, except for one intermediate docking case.

 
Award ID(s):
2051491
NSF-PAR ID:
10430691
Author(s) / Creator(s):
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Proteins: Structure, Function, and Bioinformatics
Volume:
91
Issue:
8
ISSN:
0887-3585
Page Range / eLocation ID:
p. 1077-1088
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Physical interactions of proteins play key functional roles in many important cellular processes. To understand molecular mechanisms of such functions, it is crucial to determine the structure of protein complexes. To complement experimental approaches, which usually take a considerable amount of time and resources, various computational methods have been developed for predicting the structures of protein complexes. In computational modeling, one of the challenges is to identify near-native structures from a large pool of generated models. Here, we developed a deep learning–based approach named Graph Neural Network–based DOcking decoy eValuation scorE (GNN-DOVE). To evaluate a protein docking model, GNN-DOVE extracts the interface area and represents it as a graph. The chemical properties of atoms and the inter-atom distances are used as features of nodes and edges in the graph, respectively. GNN-DOVE was trained, validated, and tested on docking models in the Dockground database and further tested on a combined dataset of Dockground and ZDOCK benchmark as well as a CAPRI scoring dataset. GNN-DOVE performed better than existing methods, including DOVE, which is our previous development that uses a convolutional neural network on voxelized structure models. 
  2. Abstract

    Critical Assessment of PRediction of Interactions (CAPRI) rounds 37 through 45 introduced larger complexes, new macromolecules, and multistage assemblies. For these rounds, we used and expanded docking methods in Rosetta to model 23 target complexes. We successfully predicted 14 target complexes and recognized and refined near‐native models generated by other groups for two further targets. Notably, for targets T110 and T136, we achieved the closest prediction of any CAPRI participant. We created several innovative approaches during these rounds. Since round 39 (target 122), we have used the new RosettaDock 4.0, which has a revamped coarse‐grained energy function and the ability to perform conformer selection during docking with hundreds of pregenerated protein backbones. Ten of the complexes had some degree of symmetry in their interactions, so we tested Rosetta SymDock, realized its shortcomings, and developed the next‐generation symmetric docking protocol, SymDock2, which includes docking of multiple backbones and induced‐fit refinement. Since the last CAPRI assessment, we also developed methods for modeling and designing carbohydrates in Rosetta, and we used them to successfully model oligosaccharide‐protein complexes in round 41. Although the results were broadly encouraging, they also highlighted the pressing need to invest in (a) flexible docking algorithms with the ability to model loop and linker motions and in (b) new sampling and scoring methods for oligosaccharide‐protein interactions.

     
  3. ABSTRACT

    Predicting protein conformational changes from unbound structures or even homology models to bound structures remains a critical challenge for protein docking. Here we present a study directly addressing the challenge by reducing the dimensionality and narrowing the range of the corresponding conformational space. The study builds on cNMA—our new framework of partner‐ and contact‐specific normal mode analysis that exploits encounter complexes and considers both intrinsic and induced flexibility. First, we established over a CAPRI (Critical Assessment of PRedicted Interactions) target set that the direction of conformational changes from unbound structures and homology models can be reproduced to a great extent by a small set of cNMA modes. In particular, homology‐to‐bound interface root‐mean‐square deviation (iRMSD) can be reduced by 40% on average with the slowest 30 modes. Second, we developed novel and interpretable features from cNMA and used various machine learning approaches to predict the extent of conformational changes. The models learned from a set of unbound‐to‐bound conformational changes could predict the actual extent of iRMSD with errors around 0.6 Å for unbound proteins in a held‐out benchmark subset, around 0.8 Å for unbound proteins in the CAPRI set, and around 1 Å even for homology models in the CAPRI set. Our results offer new insights into the origins of conformational differences between homology models and bound structures and provide new support for the low‐dimensionality of conformational adjustment during protein associations. The results also provide new tools for ensemble generation and conformational sampling in unbound and homology docking. Proteins 2017; 85:544–556. © 2016 Wiley Periodicals, Inc.

     
  4. Abstract

    Targets in the protein docking experiment CAPRI (Critical Assessment of Predicted Interactions) generally present new challenges and contribute to new developments in methodology. In rounds 38 to 45 of CAPRI, most targets could be effectively predicted using template‐based methods. However, the server ClusPro required structures rather than sequences as input, and hence we had to generate and dock homology models. The available templates also provided distance restraints that were directly used as input to the server. We show here that such an approach has some advantages. Free docking with template‐based restraints using ClusPro reproduced some interfaces suggested by weak or ambiguous templates while not reproducing others, resulting in correct server‐predicted models. More recently we developed the fully automated ClusPro TBM server that performs template‐based modeling and thus can use sequences rather than structures of component proteins as input. The performance of the server, freely available for noncommercial use at https://tbm.cluspro.org, is demonstrated by predicting the protein‐protein targets of rounds 38 to 45 of CAPRI.

     
  5. ABSTRACT

    We report the performance of protein–protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community‐wide assessment of state‐of‐the‐art docking methods. Our prediction procedure uses a protein–protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons‐PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native‐likeness of residues' spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge‐based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, that is, whether or not the pool includes near‐native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. Proteins 2017; 85:513–527. © 2016 Wiley Periodicals, Inc.

     