Title: Transcription factor binding specificities of the oomycete Phytophthora infestans reflect conserved and divergent evolutionary patterns and predict function
Abstract
Background: Identifying the DNA-binding specificities of transcription factors (TFs) is central to understanding gene networks that regulate growth and development. Such knowledge is lacking in oomycetes, a microbial eukaryotic lineage within the stramenopile group. Oomycetes include many important plant and animal pathogens such as the potato and tomato blight agent Phytophthora infestans, which is a tractable model for studying life-stage differentiation within the group.
Results: Mining of the P. infestans genome identified 197 genes encoding proteins belonging to 22 TF families. Their chromosomal distribution was consistent with family expansions through unequal crossing-over, which were likely ancient since each family had similar sizes in most oomycetes. Most TFs exhibited dynamic changes in RNA levels through the P. infestans life cycle. The DNA-binding preferences of 123 proteins were assayed using protein-binding oligonucleotide microarrays, which succeeded with 73 proteins from 14 families. Binding sites predicted for representatives of the families were validated by electrophoretic mobility shift or chromatin immunoprecipitation assays. Consistent with the substantial evolutionary distance of oomycetes from traditional model organisms, only a subset of the DNA-binding preferences resembled those of human or plant orthologs. Phylogenetic analyses of the TF families within P. infestans often discriminated clades with canonical and novel DNA targets. Paralogs with similar binding preferences frequently had distinct patterns of expression suggestive of functional divergence. TFs were predicted to either drive life stage-specific expression or serve as general activators based on the representation of their binding sites within total or developmentally regulated promoters. This projection was confirmed for one TF using synthetic and mutated promoters fused to reporter genes in vivo.
Conclusions: We established a large dataset of binding specificities for P. infestans TFs, representing the first in the stramenopile group. This resource provides a basis for understanding transcriptional regulation by linking TFs with their targets, which should help delineate the molecular components of processes such as sporulation and host infection. Our work also yielded insight into TF evolution during the eukaryotic radiation, revealing both functional conservation and diversification across kingdoms.
Award ID(s):
2143897
PAR ID:
10581101
Publisher / Repository:
BioMed Central
Journal Name:
BMC Genomics
Volume:
25
Issue:
1
ISSN:
1471-2164
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract DNA–transcription factor (TF) interactions are essential for gene regulation. Fully characterizing TF recognition specificities and identifying their genomic binding targets are important to understand TF function and regulatory networks. Recently, the high-throughput sequencing technology HT-SELEX (high-throughput systematic evolution of ligands by exponential enrichment) has been used to measure hundreds of TFs, providing massive datasets that comprise TF binding preferences. However, comprehensive computational models are still needed that fully extract and characterize critical TF binding preferences and distinguish genome-wide binding targets. In this study, we developed a global pairwise model called DCA-Scapes trained with experimental HT-SELEX data. Our approach uncovered high-resolution TF recognition specificity landscapes, enabled the prediction of in vivo binding sequences, and was validated with ChIP-seq (ChIP sequencing) data. In addition, the DCA-Scapes model was utilized to refine the locations of binding regions and accurately identify the binding sites within the ChIP-seq enriched peaks. Moreover, we extended our model to cover the entire human genome, uncovering potential TF target sites that exhibit tissue-specific TF recognition across various cellular environments.
  2. Abstract Many eukaryotic transcription factors (TFs) form homodimer or heterodimer complexes to regulate gene expression. Dimerization of BASIC LEUCINE ZIPPER (bZIP) TFs is critical for their functions, but the molecular mechanism underlying the DNA binding and functional specificity of homo- versus heterodimers remains elusive. To address this gap, we present the double DNA Affinity Purification-sequencing (dDAP-seq) technique that maps heterodimer binding sites on endogenous genomic DNA. Using dDAP-seq, we profile twenty pairs of C/S1 bZIP heterodimers and S1 homodimers in Arabidopsis and show that heterodimerization significantly expands the DNA binding preferences of these TFs. Analysis of dDAP-seq binding sites reveals the function of bZIP9 in abscisic acid response and the role of bZIP53 heterodimer-specific binding in seed maturation. The C/S1 heterodimers show distinct preferences for the ACGT elements recognized by plant bZIPs and motifs resembling the yeast GCN4 cis-elements. This study demonstrates the potential of dDAP-seq in deciphering the DNA binding specificities of interacting TFs that are key for combinatorial gene regulation.
  3. Komeili, Arash (Ed.)
    ABSTRACT Histone proteins are found across diverse lineages of Archaea, many of which package DNA and form chromatin. However, previous research has led to the hypothesis that the histone-like proteins of high-salt-adapted archaea, or halophiles, function differently. The sole histone protein encoded by the model halophilic species Halobacterium salinarum, HpyA, is nonessential and expressed at levels too low to enable genome-wide DNA packaging. Instead, HpyA mediates the transcriptional response to salt stress. Here we compare the features of genome-wide binding of HpyA to those of HstA, the sole histone of another model halophile, Haloferax volcanii. hstA, like hpyA, is a nonessential gene. To better understand HpyA and HstA functions, protein-DNA binding data (chromatin immunoprecipitation sequencing [ChIP-seq]) of these halophilic histones are compared to publicly available ChIP-seq data from DNA binding proteins across all domains of life, including transcription factors (TFs), nucleoid-associated proteins (NAPs), and histones. These analyses demonstrate that HpyA and HstA bind the genome infrequently in discrete regions, which is similar to TFs but unlike NAPs, which bind a much larger genomic fraction. However, unlike TFs that typically bind in intergenic regions, HpyA and HstA binding sites are located in both coding and intergenic regions. The genome-wide dinucleotide periodicity known to facilitate histone binding was undetectable in the genomes of both species. Instead, TF-like and histone-like binding sequence preferences were detected for HstA and HpyA, respectively. Taken together, these data suggest that halophilic archaeal histones are unlikely to facilitate genome-wide chromatin formation and that their function defies categorization as a TF, NAP, or histone.
    IMPORTANCE Most cells in eukaryotic species, from yeast to humans, possess histone proteins that pack and unpack DNA in response to environmental cues. These essential proteins regulate genes necessary for important cellular processes, including development and stress protection. Although the histone fold domain originated in the domain of life Archaea, the function of archaeal histone-like proteins is not well understood relative to those of eukaryotes. We recently discovered that, unlike histones of eukaryotes, histones in hypersaline-adapted archaeal species do not package DNA and can act as transcription factors (TFs) to regulate stress response gene expression. However, the function of histones across species of hypersaline-adapted archaea still remains unclear. Here, we compare hypersaline histone function to a variety of DNA binding proteins across the tree of life, revealing histone-like behavior in some respects and specific transcriptional regulatory function in others.
  4. Abstract Transcription factors (TFs) are proteins that bind DNA in a sequence-specific manner to regulate gene transcription. Despite their unique intrinsic sequence preferences, in vivo genomic occupancy profiles of TFs differ across cellular contexts. Hence, deciphering the sequence determinants of TF binding, both intrinsic and context-specific, is essential to understand gene regulation and the impact of regulatory, non-coding genetic variation. Biophysical models trained on in vitro TF binding assays can estimate intrinsic affinity landscapes and predict occupancy based on TF concentration and affinity. However, these models cannot adequately explain context-specific, in vivo binding profiles. Conversely, deep learning models, trained on in vivo TF binding assays, effectively predict and explain genomic occupancy profiles as a function of complex regulatory sequence syntax, albeit without a clear biophysical interpretation. To reconcile these complementary models of in vitro and in vivo TF binding, we developed Affinity Distillation (AD), a method that extracts thermodynamic affinities de novo from deep learning models of TF chromatin immunoprecipitation (ChIP) experiments by marginalizing away the influence of genomic sequence context. Applied to neural networks modeling diverse classes of yeast and mammalian TFs, AD predicts energetic impacts of sequence variation within and surrounding motifs on TF binding as measured by diverse in vitro assays with superior dynamic range and accuracy compared to motif-based methods. Furthermore, AD can accurately discern affinities of TF paralogs. Our results highlight thermodynamic affinity as a key determinant of in vivo binding, suggest that deep learning models of in vivo binding implicitly learn high-resolution affinity landscapes, and show that these affinities can be successfully distilled using AD. This new biophysical interpretation of deep learning models enables high-throughput in silico experiments to explore the influence of sequence context and variation on both intrinsic affinity and in vivo occupancy.
  5. Abstract Transcription factors are defined by their DNA-binding domains (DBDs). The binding affinities and specificities of a transcription factor to its DNA binding sites can be used by an organism to fine-tune gene regulation and so are targets for evolution. Here we investigate the evolution of GATA-type transcription factors (GATA factors) in the Caenorhabditis genus. Based upon comparisons of their DBDs, these proteins form 13 distinct groups. This protein family experienced a burst of gene duplication in several of these groups along two short branches in the species tree, giving rise to subclades with very distinct complements of GATA factors. By comparing extant gene structures, DBD sequences, genome locations, and selection pressures, we reconstructed how these duplications occurred. Although the paralogs have diverged in various ways, the literature shows that at least eight of the DBD groups bind to similar G-A-T-A DNA sequences. Thus, despite gene duplications and divergence among DBD sequences, most Caenorhabditis GATA factors appear to have maintained similar binding preferences, which could create the opportunity for developmental system drift. We hypothesize that this limited divergence in binding specificities contributes to the apparent disconnect between the extensive genomic evolution that has occurred in this genus and the absence of significant anatomical changes.