Many eukaryotic transcription factors (TF) form homodimer or heterodimer complexes to regulate gene expression. Dimerization of BASIC LEUCINE ZIPPER (bZIP) TFs are critical for their functions, but the molecular mechanism underlying the DNA binding and functional specificity of homo-
Cooperative DNA-binding by transcription factor (TF) proteins is critical for eukaryotic gene regulation. In the human genome, many regulatory regions contain TF-binding sites in close proximity to each other, which can facilitate cooperative interactions. However, binding site proximity does not necessarily imply cooperative binding, as TFs can also bind independently to each of their neighboring target sites. Currently, the rules that drive cooperative TF binding are not well understood. In addition, it is oftentimes difficult to infer direct TF–TF cooperativity from existing DNA-binding data. Here, we show that in vitro binding assays using DNA libraries of a few thousand genomic sequences with putative cooperative TF-binding events can be used to develop accurate models of cooperativity and to gain insights into cooperative binding mechanisms. Using factors ETS1 and RUNX1 as our case study, we show that the distance and orientation between ETS1 sites are critical determinants of cooperative ETS1–ETS1 binding, while cooperative ETS1–RUNX1 interactions show more flexibility in distance and orientation and can be accurately predicted based on the affinity and sequence/shape features of the binding sites. The approach described here, combining custom experimental design with machine-learning modeling, can be easily applied to study the cooperative DNA-binding patterns of any TFs.
more » « less- PAR ID:
- 10471179
- Publisher / Repository:
- Oxford University Press
- Date Published:
- Journal Name:
- Nucleic Acids Research
- Volume:
- 51
- Issue:
- 21
- ISSN:
- 0305-1048
- Format(s):
- Medium: X Size: p. 11600-11612
- Size(s):
- p. 11600-11612
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract versus heterodimers remains elusive. To address this gap, we present the double DNA Affinity Purification-sequencing (dDAP-seq) technique that maps heterodimer binding sites on endogenous genomic DNA. Using dDAP-seq we profile twenty pairs of C/S1 bZIP heterodimers and S1 homodimers inArabidopsis and show that heterodimerization significantly expands the DNA binding preferences of these TFs. Analysis of dDAP-seq binding sites reveals the function of bZIP9 in abscisic acid response and the role of bZIP53 heterodimer-specific binding in seed maturation. The C/S1 heterodimers show distinct preferences for the ACGT elements recognized by plant bZIPs and motifs resembling the yeast GCN4cis -elements. This study demonstrates the potential of dDAP-seq in deciphering the DNA binding specificities of interacting TFs that are key for combinatorial gene regulation. -
DNA base damage arises frequently in living cells and needs to be removed by base excision repair (BER) to prevent mutagenesis and genome instability. Both the formation and repair of base damage occur in chromatin and are conceivably affected by DNA-binding proteins such as transcription factors (TFs). However, to what extent TF binding affects base damage distribution and BER in cells is unclear. Here, we used a genome-wide damage mapping method, N -methylpurine-sequencing (NMP-seq), and characterized alkylation damage distribution and BER at TF binding sites in yeast cells treated with the alkylating agent methyl methanesulfonate (MMS). Our data show that alkylation damage formation was mainly suppressed at the binding sites of yeast TFs ARS binding factor 1 (Abf1) and rDNA enhancer binding protein 1 (Reb1), but individual hotspots with elevated damage levels were also found. Additionally, Abf1 and Reb1 binding strongly inhibits BER in vivo and in vitro, causing slow repair both within the core motif and its adjacent DNA. Repair of ultraviolet (UV) damage by nucleotide excision repair (NER) was also inhibited by TF binding. Interestingly, TF binding inhibits a larger DNA region for NER relative to BER. The observed effects are caused by the TF–DNA interaction, because damage formation and BER can be restored by depletion of Abf1 or Reb1 protein from the nucleus. Thus, our data reveal that TF binding significantly modulates alkylation base damage formation and inhibits repair by the BER pathway. The interplay between base damage formation and BER may play an important role in affecting mutation frequency in gene regulatory regions.more » « less
-
Abstract Development of the malaria parasite, Plasmodium falciparum, is regulated by a limited number of sequence-specific transcription factors (TFs). However, the mechanisms by which these TFs recognize genome-wide binding sites is largely unknown. To address TF specificity, we investigated the binding of two TF subsets that either bind CACACA or GTGCAC DNA sequence motifs and further characterized two additional ApiAP2 TFs, PfAP2-G and PfAP2-EXP, which bind unique DNA motifs (GTAC and TGCATGCA). We also interrogated the impact of DNA sequence and chromatin context on P. falciparum TF binding by integrating high-throughput in vitro and in vivo binding assays, DNA shape predictions, epigenetic post-translational modifications, and chromatin accessibility. We found that DNA sequence context minimally impacts binding site selection for paralogous CACACA-binding TFs, while chromatin accessibility, epigenetic patterns, co-factor recruitment, and dimerization correlate with differential binding. In contrast, GTGCAC-binding TFs prefer different DNA sequence context in addition to chromatin dynamics. Finally, we determined that TFs that preferentially bind divergent DNA motifs may bind overlapping genomic regions due to low-affinity binding to other sequence motifs. Our results demonstrate that TF binding site selection relies on a combination of DNA sequence and chromatin features, thereby contributing to the complexity of P. falciparum gene regulatory mechanisms.
-
Abstract Protein binding microarrays provide comprehensive information about the DNA binding specificities of transcription factors (TFs), and can be used to quantitatively predict the effects of DNA sequence variation on TF binding. There has also been substantial progress in dissecting the patterns of mutations, i.e., the mutational signatures, generated by different mutational processes. By combining these two layers of information we can investigate whether certain mutational processes tend to preferentially affect binding of particular classes of TFs. Such preferential alterations of binding might predispose to particular oncogenic pathways. We developed and implemented a method, termed Signature-QBiC, that integrates protein binding microarray data with the signatures of mutational processes, with the aim of predicting which TFs’ binding profiles are preferentially perturbed by particular mutational processes. We used Signature-QBiC to predict the effects of 47 signatures of mutational processes on 582 human TFs. Pathway analysis showed that binding of TFs involved in NOTCH1 signaling is strongly affected by the signatures of several mutational processes, including exposure to ultraviolet radiation. Additionally, toll-like-receptor signaling pathways are also vulnerable to disruption by this exposure. This study provides a novel overview of the effects of mutational processes on TF binding and the potential of these processes to activate oncogenic pathways through mutating TF binding sites.
-
null (Ed.)Characterizing genome-wide binding profiles of transcription factors (TFs) is essential for understanding biological processes. Although techniques have been developed to assess binding profiles within a population of cells, determining them at a single-cell level remains elusive. Here, we report scFAN (single-cell factor analysis network), a deep learning model that predicts genome-wide TF binding profiles in individual cells. scFAN is pretrained on genome-wide bulk assay for transposase-accessible chromatin sequencing (ATAC-seq), DNA sequence, and chromatin immunoprecipitation sequencing (ChIP-seq) data and uses single-cell ATAC-seq to predict TF binding in individual cells. We demonstrate the efficacy of scFAN by both studying sequence motifs enriched within predicted binding peaks and using predicted TFs for discovering cell types. We develop a new metric “TF activity score” to characterize each cell and show that activity scores can reliably capture cell identities. scFAN allows us to discover and study cellular identities and heterogeneity based on chromatin accessibility profiles.more » « less