This content will become publicly available on September 22, 2024

Title: Short tandem repeats bind transcription factors to tune eukaryotic gene expression
Short tandem repeats (STRs) are enriched in eukaryotic cis-regulatory elements and alter gene expression, yet how they regulate transcription remains unknown. We found that STRs modulate transcription factor (TF)–DNA affinities and apparent on-rates by about 70-fold by directly binding TF DNA-binding domains, with energetic impacts exceeding many consensus motif mutations. STRs maximize the number of weakly preferred microstates near target sites, thereby increasing TF density, with impacts well predicted by statistical mechanics. Confirming that STRs also affect TF binding in cells, neural networks trained only on in vivo occupancies predicted effects identical to those observed in vitro. Approximately 90% of TFs preferentially bound STRs that need not resemble known motifs, providing a cis-regulatory mechanism to target TFs to genomic sites.
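The statistical-mechanics idea in the abstract, that many weakly preferred microstates can add up to a larger total binding weight, can be illustrated with a toy calculation. The sketch below is not the authors' model: the mismatch-penalty energy function, the consensus `CACGTG`, the penalty of 1.5 kT per mismatch, and the flank sequences are all assumptions chosen only for illustration.

```python
import math

def microstate_energies(seq, consensus, mismatch_penalty=1.5):
    """Energy (in kT) of every binding register along seq.

    Toy assumption: each mismatch to the consensus costs a fixed penalty.
    """
    k = len(consensus)
    return [
        mismatch_penalty * sum(a != b for a, b in zip(seq[i:i + k], consensus))
        for i in range(len(seq) - k + 1)
    ]

def partition_function(seq, consensus, mismatch_penalty=1.5):
    """Sum of Boltzmann weights over all registers (one microstate each)."""
    return sum(
        math.exp(-e) for e in microstate_energies(seq, consensus, mismatch_penalty)
    )

# Hypothetical sequences: the same core site with two different flanks.
consensus = "CACGTG"
str_flanked = "CA" * 6 + consensus + "CA" * 6   # dinucleotide-repeat flanks
ctrl_flanked = "T" * 12 + consensus + "T" * 12  # homopolymer control flanks

z_str = partition_function(str_flanked, consensus)
z_ctrl = partition_function(ctrl_flanked, consensus)
print(f"Z(STR flank) / Z(control flank) = {z_str / z_ctrl:.3f}")
```

In this toy model the repeat flank contributes many registers that partially match the consensus, so its summed Boltzmann weight exceeds that of the control flank. The size of the effect depends entirely on the assumed penalty and sequences and should not be compared with the measured values.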
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    Adverse environmental conditions reduce crop productivity and often increase the load of unfolded or misfolded proteins in the endoplasmic reticulum (ER). This potentially lethal condition, known as ER stress, is buffered by the unfolded protein response (UPR), a set of signaling pathways designed to either recover ER functionality or ignite programmed cell death. Despite the biological significance of the UPR to the life of the organism, the regulatory transcriptional landscape underpinning ER stress management is largely unmapped, especially in crops. To fill this significant knowledge gap, we performed a large‐scale systems‐level analysis of the protein–DNA interaction (PDI) network in maize (Zea mays). Using 23 promoter fragments of six UPR marker genes in a high‐throughput enhanced yeast one‐hybrid assay, we identified a highly interconnected network of 262 transcription factors (TFs) associated with significant biological traits and 831 PDIs underlying the UPR. We established a temporal hierarchy of TF binding to gene promoters within the same family as well as across different families of TFs. Cistrome analysis revealed the dynamic activities of a variety of cis‐regulatory elements (CREs) in ER stress‐responsive gene promoters. By integrating the cistrome results into a TF network analysis, we mapped a subnetwork of TFs associated with a CRE that may contribute to UPR management. Finally, we validated the role of a predicted network hub gene using the Arabidopsis system. The PDIs, TF networks, and CREs identified in our work are foundational resources for understanding transcription‐regulatory mechanisms in the stress responses and crop improvement.

  2. Abstract

    Cooperative DNA-binding by transcription factor (TF) proteins is critical for eukaryotic gene regulation. In the human genome, many regulatory regions contain TF-binding sites in close proximity to each other, which can facilitate cooperative interactions. However, binding site proximity does not necessarily imply cooperative binding, as TFs can also bind independently to each of their neighboring target sites. Currently, the rules that drive cooperative TF binding are not well understood. In addition, it is oftentimes difficult to infer direct TF–TF cooperativity from existing DNA-binding data. Here, we show that in vitro binding assays using DNA libraries of a few thousand genomic sequences with putative cooperative TF-binding events can be used to develop accurate models of cooperativity and to gain insights into cooperative binding mechanisms. Using factors ETS1 and RUNX1 as our case study, we show that the distance and orientation between ETS1 sites are critical determinants of cooperative ETS1–ETS1 binding, while cooperative ETS1–RUNX1 interactions show more flexibility in distance and orientation and can be accurately predicted based on the affinity and sequence/shape features of the binding sites. The approach described here, combining custom experimental design with machine-learning modeling, can be easily applied to study the cooperative DNA-binding patterns of any TFs.

  3. Abstract

    AUXIN RESPONSE FACTORS (ARFs) are plant-specific transcription factors (TFs) that couple perception of the hormone auxin to gene expression programs essential to all land plants. As with many large TF families, a key question is whether individual members determine developmental specificity by binding distinct target genes. We use DAP-seq to generate genome-wide in vitro TF:DNA interaction maps for fourteen maize ARFs from the evolutionarily conserved A and B clades. Comparative analysis reveals a high degree of binding site overlap for ARFs of the same clade, but largely distinct clade A and B binding. Many sites are, however, co-occupied by ARFs from both clades, suggesting transcriptional coordination for many genes. Among these, we investigate known QTLs and use machine learning to predict the impact of cis-regulatory variation. Overall, large-scale comparative analysis of ARF binding suggests that auxin response specificity may be determined by factors other than individual ARF binding site selection.

  4. Abstract

    We employed several algorithms with high efficacy to analyze the public transcriptomic data, aiming to identify key transcription factors (TFs) that regulate regeneration in Arabidopsis thaliana. Initially, we utilized CollaborativeNet, also known as TF-Cluster, to construct a collaborative network of all TFs, which was subsequently decomposed into many subnetworks using the Triple-Link and Compound Spring Embedder (CoSE) algorithms. Functional analysis of these subnetworks led to the identification of nine subnetworks closely associated with regeneration. We further applied principal component analysis and gene ontology (GO) enrichment analysis to reduce the subnetworks from nine to three, namely subnetworks 1, 12, and 17. Searching for TF-binding sites in the promoters of the co-expressed and co-regulated genes (CCGs) of all TFs in these three subnetworks, together with Triple-Gene Mutual Interaction analysis of the TFs in these three subnetworks with the CCGs involved in regeneration, enabled us to rank the TFs in each subnetwork. Finally, six potential candidate TFs from subnetwork 1 (WOX9A, LEC2, PGA37, WIP5, PEI1, and AIL1) were identified, and their roles in somatic embryogenesis (GO:0010262) and regeneration (GO:0031099) were discussed, as were the TFs in subnetworks 12 and 17 associated with regeneration. The TFs identified were also assessed using the CIS-BP database and Expression Atlas. Our analyses suggest some novel TFs that may have regulatory roles in regeneration and embryogenesis and provide valuable data and insights into the regulatory mechanisms related to regeneration. The tools and the procedures used here are instrumental for analyzing high-throughput transcriptomic data and advancing our understanding of the regulation of various biological processes of interest.

  5. Abstract

    Many eukaryotic transcription factors (TFs) form homodimer or heterodimer complexes to regulate gene expression. Dimerization of BASIC LEUCINE ZIPPER (bZIP) TFs is critical for their functions, but the molecular mechanism underlying the DNA binding and functional specificity of homo- versus heterodimers remains elusive. To address this gap, we present the double DNA Affinity Purification-sequencing (dDAP-seq) technique that maps heterodimer binding sites on endogenous genomic DNA. Using dDAP-seq we profile twenty pairs of C/S1 bZIP heterodimers and S1 homodimers in Arabidopsis and show that heterodimerization significantly expands the DNA binding preferences of these TFs. Analysis of dDAP-seq binding sites reveals the function of bZIP9 in abscisic acid response and the role of bZIP53 heterodimer-specific binding in seed maturation. The C/S1 heterodimers show distinct preferences for the ACGT elements recognized by plant bZIPs and motifs resembling the yeast GCN4 cis-elements. This study demonstrates the potential of dDAP-seq in deciphering the DNA binding specificities of interacting TFs that are key for combinatorial gene regulation.
