Computational modeling of protein–DNA complex structures has important implications for biomedical applications such as structure‐based, computer‐aided drug design. A key step in developing methods for accurate modeling of protein–DNA complexes is assessing the similarity between models and their reference complex structures. Existing methods rely primarily on distance‐based metrics and generally do not consider important functional features of the complexes, such as the interface hydrogen bonds that are critical to specific protein–DNA interactions. Here, we present a new scoring function, ComparePD, which takes interface hydrogen bond energy and strength into account, in addition to distance‐based metrics, to measure the similarity of protein–DNA complexes more accurately. ComparePD was tested on two datasets of computational models of protein–DNA complexes generated using docking (classified as easy, intermediate, and difficult cases) and homology modeling methods. The results were compared with PDDockQ, a modified version of DockQ tailored for protein–DNA complexes, as well as with the metrics employed by the community‐wide experiment CAPRI (Critical Assessment of PRedicted Interactions). We demonstrated that ComparePD provides an improved similarity measure over PDDockQ and the CAPRI classification method by considering both the conformational similarity and the functional importance of the complex interface. In the cases where ComparePD and PDDockQ selected different top models, ComparePD identified the more meaningful model in all but one intermediate docking case.
An important question is how well the models submitted to CASP retain the properties of target structures. We investigate several properties related to binding. First, we explore the binding of small molecules as probes and count the number of interactions between each residue and such probes, resulting in a binding fingerprint. The similarity between two fingerprints, one for the X‐ray structure and the other for a model, is determined by calculating their correlation coefficient. The fingerprint similarity weakly correlates with global measures of accuracy, and a GDT_TS higher than 80 is a necessary but not sufficient condition for the conservation of surface binding properties. The advantage of this approach is that it can be carried out without information on potential ligands and their binding sites. The latter information was available for a few targets, and we explored whether the CASP14 models can be used to predict binding sites and to dock small ligands. Finally, we tested the ability of models to reproduce protein–protein interactions by docking both the X‐ray structures and the models to their interaction partners in complexes. The analysis showed that in CASP14 the quality of individual domain models is approaching that offered by X‐ray crystallography, and hence such models can be successfully used for the identification of binding and regulatory sites, as well as for assembling obligatory protein–protein complexes. Success of ligand docking, however, often depends on fine details of the binding interface, and thus may require accounting for conformational changes by simulation methods.
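The fingerprint comparison described above can be illustrated with a minimal sketch. It assumes that each fingerprint is a list of per-residue probe-interaction counts over the same residues in the same order, and that "correlation coefficient" means the Pearson coefficient; the abstract does not state which coefficient was used. The function name `fingerprint_similarity` and the toy counts are hypothetical.

```python
from math import sqrt


def fingerprint_similarity(ref_counts, model_counts):
    """Pearson correlation between two per-residue probe-interaction count vectors.

    Assumption: both vectors list the same residues in the same order.
    """
    if len(ref_counts) != len(model_counts):
        raise ValueError("fingerprints must cover the same residues")
    n = len(ref_counts)
    if n < 2:
        raise ValueError("need at least two residues")
    mean_ref = sum(ref_counts) / n
    mean_mod = sum(model_counts) / n
    cov = sum((r - mean_ref) * (m - mean_mod)
              for r, m in zip(ref_counts, model_counts))
    var_ref = sum((r - mean_ref) ** 2 for r in ref_counts)
    var_mod = sum((m - mean_mod) ** 2 for m in model_counts)
    if var_ref == 0 or var_mod == 0:
        # Assumption: a flat fingerprint carries no signal, so report 0.
        return 0.0
    return cov / sqrt(var_ref * var_mod)


# Hypothetical counts for six residues (X-ray structure vs. model).
xray = [0, 3, 12, 7, 0, 1]
model = [0, 2, 10, 9, 1, 0]
print(round(fingerprint_similarity(xray, model), 3))
```

A value near 1 would indicate that the model reproduces the probe-binding pattern of the X‐ray structure; how the probe interactions themselves are generated is outside the scope of this sketch.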
- Award ID(s):
- 1759472
- NSF-PAR ID:
- 10364826
- Publisher / Repository:
- Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name:
- Proteins: Structure, Function, and Bioinformatics
- Volume:
- 89
- Issue:
- 12
- ISSN:
- 0887-3585
- Page Range / eLocation ID:
- p. 1922-1939
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Abstract Water and ligand binding play critical roles in the structure and function of proteins, yet their binding sites and significance are difficult to predict a priori. Multiple solvent crystal structures (MSCS) is a method where several X‐ray crystal structures are solved, each in a unique solvent environment, with organic molecules that serve as probes of the protein surface for sites evolved to bind ligands, while the first hydration shell is essentially maintained. When superimposed, these structures contain a vast amount of information regarding hot spots of protein‐protein or protein‐ligand interactions, as well as conserved water‐binding sites retained with the change in solvent properties. Optimized mining of this information requires reliable structural data and a consistent, objective analysis tool. Detection of related solvent positions (DRoP) was developed to automatically organize and rank the water or small organic molecule binding sites within a given set of structures. It is a flexible tool that can also be used in conserved water analysis given multiple structures of any protein independent of the MSCS method. The DRoP output is an HTML format list of the solvent sites ordered by conservation rank in its population within the set of structures, along with renumbered and recolored PDB files for visualization and facile analysis. Here, we present a previously unpublished set of MSCS structures of bovine pancreatic ribonuclease A (RNase A) and use it together with published structures to illustrate the capabilities of DRoP.
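As a rough illustration of ranking solvent sites by conservation across superimposed structures, the sketch below groups solvent positions that lie within a distance cutoff and orders the resulting sites by how many structures contribute to them. This is not DRoP's algorithm, which the abstract does not describe; the greedy seeding scheme, the 1.5 Å cutoff, the function name `rank_solvent_sites`, and the toy coordinates are all assumptions.

```python
from math import dist


def rank_solvent_sites(structures, cutoff=1.5):
    """Group solvent positions across structures and rank sites by conservation.

    structures: dict mapping a structure name to a list of (x, y, z) solvent
    positions, all assumed to be in one common (superimposed) frame.
    Each position joins the first site whose seed lies within `cutoff`;
    otherwise it seeds a new site. Sites are returned most conserved first.
    """
    sites = []  # each site: {"seed": (x, y, z), "members": set of names}
    for name, positions in structures.items():
        for xyz in positions:
            for site in sites:
                if dist(site["seed"], xyz) <= cutoff:
                    site["members"].add(name)
                    break
            else:
                sites.append({"seed": xyz, "members": {name}})
    return sorted(sites, key=lambda s: len(s["members"]), reverse=True)


# Hypothetical superimposed structures solved in three solvents.
structures = {
    "water": [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    "ethanol": [(0.3, 0.1, 0.0)],
    "hexane": [(0.2, -0.2, 0.1), (9.0, 0.0, 0.0)],
}
for site in rank_solvent_sites(structures):
    print(site["seed"], sorted(site["members"]))
```

Seeding sites in input order keeps the sketch short but makes the grouping order-dependent; a production tool would need a more robust clustering step.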
Abstract Characterization of life processes at the molecular level requires structural details of protein interactions. The number of experimentally determined structures of protein–protein complexes accounts only for a fraction of known protein interactions. This gap in structural description of the interactome has to be bridged by modeling. An essential part of the development of structural modeling/docking techniques for protein interactions is databases of protein–protein complexes. They are necessary for studying protein interfaces, providing a knowledge base for docking algorithms, and developing intermolecular potentials, search procedures, and scoring functions. Development of protein–protein docking techniques requires thorough benchmarking of different parts of the docking protocols on carefully curated sets of protein–protein complexes. We present a comprehensive description of the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions, including previously unpublished unbound docking benchmark set 4, and the X‐ray docking decoy set 2. The resource offers a variety of interconnected datasets of protein–protein complexes and other data for the development and testing of different aspects of protein docking methodologies. Based on protein–protein complexes extracted from the PDB biounit files, Dockground offers sets of X‐ray unbound, simulated unbound, model, and docking decoy structures. All datasets are freely available for download, as a whole or selecting specific structures, through a user‐friendly interface on one integrated website.
Abstract In the ligand prediction category of CASP15, the challenge was to predict the positions and conformations of small molecules binding to proteins that were provided as amino acid sequences or as models generated by the AlphaFold2 program. For most targets, we used our template‐based ligand docking program ClusPro ligTBM, also implemented as a public server available at https://ligtbm.cluspro.org/. Since many targets had multiple chains and a number of ligands, several templates and some manual interventions were required. In a few cases, no templates were found, and we had to use direct docking using the Glide program. Nevertheless, ligTBM was shown to be a very useful tool, and by any ranking criteria, our group was ranked among the top five best‐performing teams. In fact, all the best groups used template‐based docking methods. Thus, it appears that the AlphaFold2‐generated models, despite the high accuracy of the predicted backbone, have local differences from the X‐ray structure that make the use of direct docking methods more challenging. The results of CASP15 confirm that this limitation can be frequently overcome by homology‐based docking.
Abstract The heavily used protein–protein docking server ClusPro performs three computational steps as follows: (1) rigid body docking, (2) RMSD‐based clustering of the 1000 lowest energy structures, and (3) the removal of steric clashes by energy minimization. In response to challenges encountered in recent CAPRI targets, we added three new options to ClusPro. These are (1) accounting for small‐angle X‐ray scattering data in docking; (2) considering pairwise interaction data as restraints; and (3) enabling discrimination between biological and crystallographic dimers. In addition, we have developed an extremely fast docking algorithm based on 5D rotational manifold FFT, and an algorithm for docking flexible peptides that include known sequence motifs. We feel that these developments will further improve the utility of ClusPro. However, CAPRI emphasized several shortcomings of the current server, including the problem of selecting the right energy parameters among the five options provided, and the problem of selecting the best models among the 10 generated for each parameter set. In addition, results convinced us that further development is needed for docking homology models. Finally, we discuss the difficulties we have encountered when attempting to develop a refinement algorithm that would be computationally efficient enough for inclusion in a heavily used server. Proteins 2017; 85:435–444. © 2016 Wiley Periodicals, Inc.
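The clustering step (2) above can be sketched as a generic greedy, neighbor-count clustering over a precomputed pairwise RMSD matrix. The abstract states only that clustering is RMSD‐based, so the greedy scheme, the radius values, the function name `greedy_rmsd_clusters`, and the toy matrix are assumptions made for illustration and should not be read as the server's implementation.

```python
def greedy_rmsd_clusters(rmsd, radius):
    """Greedy clustering of docked poses from a symmetric pairwise RMSD matrix.

    Repeatedly pick the remaining pose with the most remaining neighbors
    within `radius` (itself included), make it a cluster center, and remove
    the whole cluster. Ties go to the lowest pose index.
    Returns a list of (center, members) tuples in the order they were found.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        best_center, best_members = None, []
        for i in sorted(remaining):
            members = [j for j in sorted(remaining) if rmsd[i][j] <= radius]
            if len(members) > len(best_members):
                best_center, best_members = i, members
        clusters.append((best_center, best_members))
        remaining -= set(best_members)
    return clusters


# Hypothetical RMSD matrix (in angstroms) for five low-energy poses.
rmsd = [
    [0, 1, 2, 20, 21],
    [1, 0, 1, 20, 21],
    [2, 1, 0, 20, 21],
    [20, 20, 20, 0, 3],
    [21, 21, 21, 3, 0],
]
for center, members in greedy_rmsd_clusters(rmsd, radius=5.0):
    print(center, members)
```

Because each round removes the most populated neighborhood, clusters come out in non-increasing size, which is convenient when cluster population is used to rank docked models.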