Protein–DNA interactions play an important role in various biological processes such as gene expression, replication, and transcription. Understanding the features that dictate the binding affinity of protein-DNA complexes, and predicting those affinities, is important for elucidating their recognition mechanisms. In this work, we collected the experimental binding free energy (ΔG) for a set of 391 protein-DNA complexes and derived several structure-based features, such as interaction energy, contact potentials, volume and surface area of binding site residues, base step parameters of the DNA, and contacts between different types of atoms. Our analysis of the relationship between binding affinity and structural features revealed that the important factors depend mainly on the number of DNA strands as well as the functional and structural classes of the proteins. Specifically, binding site properties, such as the number of atom contacts between the DNA and protein and the volume of protein binding sites, and interaction-based features, such as interaction energies and contact potentials, are important for understanding the binding affinity. Further, we developed multiple regression equations for predicting the binding affinity of protein-DNA complexes belonging to different structural and functional classes. Our method showed an average correlation of 0.78 and a mean absolute error of 0.98 kcal/mol between the experimental and predicted binding affinities in a jack-knife test. We have developed a webserver, PDA-PreD (Protein-DNA Binding affinity predictor), for predicting the affinity of protein-DNA complexes; it is freely available at https://web.iitm.ac.in/bioinfo2/pdapred/
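The evaluation protocol named in the abstract above (multiple regression assessed by a jack-knife test, reported as a correlation and a mean absolute error) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three features and the ΔG values are synthetic, the function names are hypothetical, and ordinary least squares stands in for whatever fitting procedure the authors actually used. It is not the PDA-PreD method.

```python
import random
from statistics import mean

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    a = [[1.0] + list(r) for r in rows]
    p = len(a[0])
    ata = [[sum(r[i] * r[j] for r in a) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * t for r, t in zip(a, y)) for i in range(p)]
    return solve_linear(ata, aty)

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def jackknife_predictions(rows, y):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(y)):
        coef = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, rows[i]))
    return preds

def pearson_r(u, v):
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

def mean_abs_error(u, v):
    return mean(abs(a - b) for a, b in zip(u, v))

# Synthetic stand-in data: 60 "complexes", 3 made-up features, noisy linear ΔG.
random.seed(0)
rows = [[random.gauss(0, 1) for _ in range(3)] for _ in range(60)]
dg = [-10.0 + 1.5 * r[0] - 0.8 * r[1] + 0.4 * r[2] + random.gauss(0, 0.5)
      for r in rows]

preds = jackknife_predictions(rows, dg)
r = pearson_r(dg, preds)
mae = mean_abs_error(dg, preds)
print(f"jack-knife r = {r:.2f}, MAE = {mae:.2f} kcal/mol (synthetic data)")
```

Because each prediction comes from a model that never saw the held-out sample, the resulting correlation and error estimate how the regression would perform on new data rather than how well it fits the data used for training.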
Brewing COFFEE: A Sequence-Specific Coarse-Grained Energy Function for Simulations of DNA−Protein Complexes
DNA−protein interactions are pervasive in a number of biophysical processes ranging from transcription and gene expression to chromosome folding. To describe the structural and dynamic properties underlying these processes accurately, it is important to create transferable computational models. Toward this end, we introduce Coarse-grained Force Field for Energy Estimation, COFFEE, a robust framework for simulating DNA−protein complexes. To brew COFFEE, we integrated the energy function in the self-organized polymer model with side-chains for proteins and the three interaction site model for DNA in a modular fashion, without recalibrating any of the parameters in the original force-fields. A unique feature of COFFEE is that it describes sequence−specific DNA−protein interactions using a statistical potential (SP) derived from a data set of high-resolution crystal structures. The only parameter in COFFEE is the strength (λDNAPRO) of the DNA−protein contact potential. For an optimal choice of λDNAPRO, the crystallographic B-factors for DNA−protein complexes with varying sizes and topologies are quantitatively reproduced. Without any further readjustments to the force-field parameters, COFFEE predicts scattering profiles that are in quantitative agreement with small-angle X-ray scattering experiments, as well as chemical shifts that are consistent with NMR. We also show that COFFEE accurately describes the salt-induced unraveling of nucleosomes. Strikingly, our nucleosome simulations explain the destabilization effect of ARG to LYS mutations, which do not alter the balance of electrostatic interactions but affect chemical interactions in subtle ways. The range of applications attests to the transferability of COFFEE, and we anticipate that it will be a promising framework for simulating DNA−protein complexes at the molecular length-scale.
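The abstract above reports that crystallographic B-factors are reproduced for an optimal λDNAPRO. As a hedged illustration of how simulated fluctuations are commonly compared with B-factors, the sketch below applies the standard isotropic relation B = (8π²/3)⟨Δr²⟩ to a toy trajectory. The function name `b_factors`, the input layout, and the assumption that frames are already aligned to a common reference are all choices made here for illustration; the abstract does not state how the authors computed this quantity.

```python
import math

def b_factors(frames):
    """Per-site B-factors (in Å^2) from a list of trajectory frames.

    Each frame is a list of (x, y, z) positions in Å, one per
    coarse-grained site. Frames are assumed to be pre-aligned,
    so no superposition is performed here.
    """
    n_frames = len(frames)
    n_sites = len(frames[0])
    result = []
    for s in range(n_sites):
        # Time-averaged position of this site.
        mean_pos = [sum(f[s][k] for f in frames) / n_frames for k in range(3)]
        # Mean-square fluctuation about the average position.
        msf = sum(
            sum((f[s][k] - mean_pos[k]) ** 2 for k in range(3))
            for f in frames
        ) / n_frames
        result.append(8.0 * math.pi ** 2 / 3.0 * msf)
    return result

# Toy trajectory: site 0 moves by 2 Å along x between frames, site 1 is fixed.
toy_frames = [
    [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
]
toy_b = b_factors(toy_frames)
print(toy_b)
```

In a parameter scan of the kind the abstract describes, such simulated values would be compared against the experimental B-factors for each candidate strength of the contact potential.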
- Award ID(s): 2320256
- PAR ID: 10549780
- Publisher / Repository: acs.org
- Date Published:
- Journal Name: Journal of Chemical Theory and Computation
- Volume: 20
- Issue: 3
- ISSN: 1549-9618
- Page Range / eLocation ID: 1398 to 1413
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Protein-DNA binding interactions are critical in several biological processes, especially the regulation of gene expression at the level of transcription initiation. An important technique for studying these interactions is the electrophoretic mobility shift assay (EMSA), whereby protein-DNA complexes are resolved on the basis of their mass:charge ratio using native polyacrylamide gel electrophoresis (nPAGE). Here we describe EMSA using PCR-generated, near infrared-fluorescent DNA probes, and IR fluorescence imaging to qualitatively and quantitatively study the interaction of transcriptional regulatory proteins from thermophilic organisms with different DNAs. Direct imaging of IR fluorophore-labeled DNA probes is advantageous because it provides high sensitivity (subnanomolar) without the need for intermediate staining steps or costly and problematic radiolabeled probes, thereby providing a more affordable and sensitive option to image protein-DNA complexes on polyacrylamide gels by techniques such as EMSA.
-
The emerging field of hybrid DNA–protein nanotechnology brings with it the potential for many novel materials which combine the addressability of DNA nanotechnology with the versatility of protein interactions. However, the design and computational study of these hybrid structures is difficult due to the system sizes involved. To aid in the design and in silico analysis process, we introduce here a coarse-grained DNA/RNA–protein model that extends the oxDNA/oxRNA models of DNA/RNA with a coarse-grained model of proteins based on an anisotropic network model representation. Fully equipped with analysis scripts and visualization, our model aims to facilitate hybrid nanomaterial design towards eventual experimental realization, as well as enabling study of biological complexes. We further demonstrate its usage by simulating a DNA–protein nanocage, DNA wrapped around histones, and a nascent RNA in polymerase.
-
Computational modeling of assembly is challenging for many systems, because their timescales can vastly exceed those accessible to simulations. This article describes the multiMSM, which is a general framework that uses Markov state models (MSMs) to enable simulating self-assembly and self-organization of finite-sized structures on timescales that are orders of magnitude longer than those accessible to brute-force dynamics simulations. As with traditional MSM approaches, the method efficiently overcomes free energy barriers and other dynamical bottlenecks. In contrast to previous MSM approaches to simulating assembly, the framework describes simultaneous assembly of many clusters and the consequent depletion of free subunits or other small oligomers. The algorithm accounts for changes in transition rates as concentrations of monomers and intermediates evolve over the course of the reaction. Using two model systems, we show that the multiMSM accurately predicts the concentrations of the full ensemble of intermediates on timescales required to reach equilibrium. Importantly, after constructing a multiMSM for one system concentration, yields at other concentrations can be approximately calculated without any further sampling. This capability allows for orders of magnitude additional speedup. In addition, the method enables highly efficient calculation of quantities such as free energy profiles, nucleation timescales, flux along the ensemble of assembly pathways, and entropy production rates. Identifying contributions of individual transitions to entropy production rates reveals sources of kinetic traps. The method is broadly applicable to systems with equilibrium or nonequilibrium dynamics and is trivially parallelizable and, thus, highly scalable. Published by the American Physical Society, 2024.
-
Nanofluidic structures have over the last two decades emerged as a powerful platform for detailed analysis of DNA on the kilobase pair length scale. When DNA is confined to a nanochannel, the combination of excluded volume and DNA stiffness leads to the DNA being stretched to near its full contour length. Importantly, this stretching takes place at equilibrium, without any chemical modifications to the DNA. As a result, any DNA can be analyzed, such as DNA extracted from cells or circular DNA, and it is straightforward to study reactions on the ends of linear DNA. In this comprehensive review, we first give a thorough description of the current understanding of the polymer physics of DNA and how that leads to stretching in nanochannels. We then describe how the versatility of nanofabrication can be used to design devices specifically tailored for the problem at hand, either by controlling the degree of confinement or enabling facile exchange of reagents to measure DNA–protein reaction kinetics. The remainder of the review focuses on two important applications of confining DNA in nanochannels. The first is optical DNA mapping, which provides the genomic sequence of intact DNA molecules in excess of 100 kilobase pairs in size, with kilobase pair resolution, through labeling strategies that are suitable for fluorescence microscopy. In this section, we highlight solutions to the technical aspects of genomic mapping, including the use of enzyme-based labeling and affinity-based labeling to produce the genomic maps, rather than recent applications in human genetics. The second is DNA–protein interactions, and several recent examples of such studies on DNA compaction, filamentous protein complexes, and reactions with DNA ends are presented. Taken together, these two applications demonstrate the power of DNA confinement and nanofluidics in genomics, molecular biology, and biophysics.

