Structural, regulatory and enzymatic proteins interact with DNA to maintain a healthy and functional genome. Yet, our structural understanding of how proteins interact with DNA is limited. We present MELD-DNA, a novel computational approach to predict the structures of protein–DNA complexes. The method combines molecular dynamics simulations with general knowledge or experimental information through Bayesian inference. The physical model is sensitive to sequence-dependent properties and conformational changes required for binding, while information accelerates sampling of bound conformations. MELD-DNA can: (i) sample multiple binding modes; (ii) identify the preferred binding mode from the ensembles; and (iii) provide qualitative binding preferences between DNA sequences. We first assess performance on a dataset of 15 protein–DNA complexes and compare it with state-of-the-art methodologies. Furthermore, for three selected complexes, we show sequence dependence effects of binding in MELD predictions. We expect that the results presented herein, together with the freely available software, will impact structural biology (by complementing DNA structural databases) and molecular recognition (by bringing new insights into aspects governing protein–DNA interactions).
We describe the performance of MELD‐accelerated molecular dynamics (MELDxMD) in determining protein structures in the NMR‐data‐assisted category in CASP13. Seeded from web server predictions, MELDxMD was found best in the NMR category, over 17 targets, outperforming the next‐best groups by a factor of ~4 in
- PAR ID:
- 10459197
- Publisher / Repository:
- Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name:
- Proteins: Structure, Function, and Bioinformatics
- Volume:
- 87
- Issue:
- 12
- ISSN:
- 0887-3585
- Page Range / eLocation ID:
- p. 1333-1340
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract -
Abstract Protein structure determination is a very important topic in structural genomics, which helps people to understand varieties of biological functions such as protein-protein interactions, protein–DNA interactions and so on. Nowadays, nuclear magnetic resonance (NMR) has often been used to determine the three-dimensional structures of protein in vivo. This study aims to automate the peak picking step, the most important and tricky step in NMR structure determination. We propose to model the NMR spectrum by a mixture of bivariate Gaussian densities and use the stochastic approximation Monte Carlo algorithm as the computational tool to solve the problem. Under the Bayesian framework, the peak picking problem is casted as a variable selection problem. The proposed method can automatically distinguish true peaks from false ones without preprocessing the data. To the best of our knowledge, this is the first effort in the literature that tackles the peak picking problem for NMR spectrum data using Bayesian method.
-
Abstract The Protein Data Bank (PDB) is one of two archival resources for experimental data central to biomedical research and education worldwide (the other key Primary Data Archive in biology being the International Nucleotide Sequence Database Collaboration). The PDB currently houses >134,000 atomic level biomolecular structures determined by crystallography, NMR spectroscopy, and 3D electron microscopy. It was established in 1971 as the first open‐access, digital‐data resource in biology, and is managed by the Worldwide Protein Data Bank partnership (wwPDB; wwpdb.org). US PDB operations are conducted by the RCSB Protein Data Bank (RCSB PDB; RCSB.org; Rutgers University and UC San Diego) and funded by NSF, NIH, and DoE. The RCSB PDB serves as the global Archive Keeper for the wwPDB. During calendar 2016, >591 million structure data files were downloaded from the PDB by
Data Consumers working in every sovereign nation recognized by the United Nations. During this same period, the RCSB PDB processed >5300 new atomic level biomolecular structures plus experimental data and metadata coming into the archive fromData Depositors working in the Americas and Oceania. In addition, RCSB PDB served >1 million RCSB.org users worldwide with PDB data integrated with ∼40 external data resources providing rich structural views of fundamental biology, biomedicine, and energy sciences, and >600,000 PDB101.rcsb.org educational website users around the globe. RCSB PDB resources are described in detail together with metrics documenting the impact of access to PDB data on basic and applied research, clinical medicine, education, and the economy. -
Abstract The connectivity, conformation, tautomeric form, and dynamics of a new depsidone (perisalazinic acid) were characterized using one‐bond13C13C NMR scalar couplings (1
J CC) obtained from the INADEQUATE experiment. Characterization of perisalazinic acid using more conventional NMR techniques is problematic due to the extremely limited number of CH protons present. In the present study, 81 candidate structures were considered and a best fit structure was selected by comparing computed1J CCvalues for each candidate to 15 experimental values. Of the six flexible moieties in perisalazinic acid, three are adequately represented by a single orientation stabilized by intramolecular hydrogen bonding. The three remaining groups are present as mixtures of conformers with two sites consisting of a pair of conformations and another disordered over six orientations. This study demonstrates the feasibility of complete three‐dimensional structural characterization of an unknown using only theoretical and experimental1J CCvalues. -
Abstract There have been several studies suggesting that protein structures solved by NMR spectroscopy and X‐ray crystallography show significant differences. To understand the origin of these differences, we assembled a database of high‐quality protein structures solved by both methods. We also find significant differences between NMR and crystal structures—in the root‐mean‐square deviations of the C
α atomic positions, identities of core amino acids, backbone, and side‐chain dihedral angles, and packing fraction of core residues. In contrast to prior studies, we identify the physical basis for these differences by modeling protein cores as jammed packings of amino acid‐shaped particles. We find that we can tune the jammed packing fraction by varying the degree of thermalization used to generate the packings. For an athermal protocol, we find that the average jammed packing fraction is identical to that observed in the cores of protein structures solved by X‐ray crystallography. In contrast, highly thermalized packing‐generation protocols yield jammed packing fractions that are even higher than those observed in NMR structures. These results indicate that thermalized systems can pack more densely than athermal systems, which suggests a physical basis for the structural differences between protein structures solved by NMR and X‐ray crystallography.