Title: Rapid Cis–Trans Coevolution Driven by a Novel Gene Retroposed from a Eukaryotic Conserved CCR4–NOT Component in Drosophila
Young, or newly evolved, genes arise ubiquitously across the tree of life, and they can rapidly acquire novel functions that influence a diverse array of biological processes. Previous work identified a young regulatory duplicate gene in Drosophila, Zeus, that unexpectedly diverged rapidly from its parent, Caf40, an extremely conserved component in the CCR4–NOT machinery in post-transcriptional and post-translational regulation of eukaryotic cells, and took on roles in the male reproductive system. This neofunctionalization was accompanied by differential binding of the Zeus protein to loci throughout the Drosophila melanogaster genome. However, the way in which new DNA-binding proteins acquire and coevolve with their targets in the genome is not understood. Here, by comparing Zeus ChIP-Seq data from D. melanogaster and D. simulans to the ancestral Caf40 binding events from D. yakuba, a species that diverged before the duplication event, we found a dynamic pattern in which Zeus binding rapidly coevolved with a previously unknown DNA motif, which we term Caf40 and Zeus-Associated Motif (CAZAM), under the influence of positive selection. Interestingly, while both copies of Zeus acquired targets at male-biased and testis-specific genes, D. melanogaster and D. simulans proteins have specialized binding on different chromosomes, a pattern echoed in the evolution of the associated motif. Using CRISPR-Cas9-mediated gene knockout of Zeus and RNA-Seq, we found that Zeus regulated the expression of 661 differentially expressed genes (DEGs). Our results suggest that the evolution of young regulatory genes can be coupled to substantial rewiring of the transcriptional networks into which they integrate, even over short evolutionary timescales. Our results thus uncover dynamic genome-wide evolutionary processes associated with new genes.
Award ID(s):
2020667
NSF-PAR ID:
10344449
Journal Name:
Genes
Volume:
13
Issue:
1
ISSN:
2073-4425
Page Range / eLocation ID:
57
Sponsoring Org:
National Science Foundation
More Like this
  1. A major goal in evolutionary biology is to understand how natural variation is maintained in sexually selected and sexually dimorphic traits. Hypotheses to explain genetic variation in sexually selected traits include context-dependent fitness effects, epistatic interactions, and pleiotropic constraints. The house fly, Musca domestica, is a promising system to investigate how these factors affect polymorphism in sexually selected traits. Two common Y chromosomes (YM and IIIM) segregate as stable polymorphisms in natural house fly populations, appear to be locally adapted to different thermal habitats, and differentially affect male mating success. Here, we perform a meta-analysis of RNA-seq data which identifies genes encoding odorant binding proteins (in the Obp56h family) as differentially expressed between the heads of males carrying YM and IIIM. Differential expression of Obp56h has been associated with variation in male mating behavior in Drosophila melanogaster. We find differences in male mating behavior between house flies carrying the Y chromosomes that are consistent with the relationship between male mating behavior and expression of Obp56h in D. melanogaster. We also find that male mating behaviors in house fly are affected by temperature, and the same temperature differentials further affect the expression of Obp56h genes. However, we show that temperature-dependent effects cannot explain the maintenance of genetic variation for male mating behavior in house fly. Using a network analysis and allele-specific expression measurements, we find evidence that the house fly IIIM chromosome is a trans regulator of Obp56h gene expression. Moreover, we find that Obp56h disproportionately affects the expression of genes on the D. melanogaster chromosome that is homologous to the house fly IIIM chromosome. This provides evidence for a conserved trans regulatory loop involving Obp56h expression that affects male mating behavior in flies. The complex regulatory architecture controlling Obp56h expression suggests that variation in male mating behavior could be maintained by epistasis or pleiotropic constraints.
  2. Polen, Tino (Ed.)
    ABSTRACT Regulation of gene expression is a vital component of cellular biology. Transcription factor proteins often bind regulatory DNA sequences upstream of transcription start sites to facilitate the activation or repression of RNA polymerase. Research laboratories have devoted many projects to understanding the transcription regulatory networks for transcription factors, as these regulated genes provide critical insight into the biology of the host organism. Various in vivo and in vitro assays have been developed to elucidate transcription regulatory networks. Several assays, including SELEX-seq and ChIP-seq, capture DNA-bound transcription factors to determine the preferred DNA-binding sequences, which can then be mapped to the host organism’s genome to identify candidate regulatory genes. In this protocol, we describe an alternative in vitro, iterative selection approach to ascertaining DNA-binding sequences of a transcription factor of interest using restriction endonuclease, protection, selection, and amplification (REPSA). Contrary to traditional antibody-based capture methods, REPSA selects for transcription factor-bound DNA sequences by challenging binding reactions with a type IIS restriction endonuclease. Cleavage-resistant DNA species are amplified by PCR and then used as inputs for the next round of REPSA. This process is repeated until a protected DNA species is observed by gel electrophoresis, which is an indication of a successful REPSA experiment. Subsequent high-throughput sequencing of REPSA-selected DNAs accompanied by motif discovery and scanning analyses can be used for determining transcription factor consensus binding sequences and potential regulated genes, providing critical first steps in determining organisms’ transcription regulatory networks.
    IMPORTANCE Transcription regulatory proteins are an essential class of proteins that help maintain cellular homeostasis by adapting the transcriptome based on environmental cues. Dysregulation of transcription factors can lead to diseases such as cancer, and many eukaryotic and prokaryotic transcription factors have become enticing therapeutic targets. Additionally, in many understudied organisms, the transcription regulatory networks for uncharacterized transcription factors remain unknown. As such, the need for experimental techniques to establish transcription regulatory networks is paramount. Here, we describe a step-by-step protocol for REPSA, an inexpensive, iterative selection technique to identify transcription factor-binding sequences without the need for antibody-based capture methods.
  3. Many organisms enter a dormant state in their life cycle to deal with predictable changes in environments over the course of a year. The timing of dormancy is therefore a key seasonal adaptation, and it evolves rapidly with changing environments. We tested the hypothesis that differences in the timing of seasonal activity are driven by differences in the rate of development during diapause in Rhagoletis pomonella, a fly specialized to feed on fruits of seasonally limited host plants. Transcriptomes from the central nervous system across a time series during diapause show consistent and progressive changes in transcripts participating in diverse developmental processes, despite a lack of gross morphological change. Moreover, population genomic analyses suggested that many genes of small effect enriched in developmental functional categories underlie variation in dormancy timing and overlap with gene sets associated with development rate in Drosophila melanogaster. Our transcriptional data also suggested that a recent evolutionary shift from a seasonally late to a seasonally early host plant drove more rapid development during diapause in the early fly population. Moreover, genetic variants that diverged during the evolutionary shift were also enriched in putative cis regulatory regions of genes differentially expressed during diapause development. Overall, our data suggest polygenic variation in the rate of developmental progression during diapause contributes to the evolution of seasonality in R. pomonella. We further discuss patterns that suggest hourglass-like developmental divergence early and late in diapause development and an important role for hub genes in the evolution of transcriptional divergence.
  4. Komeili, Arash (Ed.)
    ABSTRACT Histone proteins are found across diverse lineages of Archaea, many of which package DNA and form chromatin. However, previous research has led to the hypothesis that the histone-like proteins of high-salt-adapted archaea, or halophiles, function differently. The sole histone protein encoded by the model halophilic species Halobacterium salinarum, HpyA, is nonessential and expressed at levels too low to enable genome-wide DNA packaging. Instead, HpyA mediates the transcriptional response to salt stress. Here we compare the features of genome-wide binding of HpyA to those of HstA, the sole histone of another model halophile, Haloferax volcanii. hstA, like hpyA, is a nonessential gene. To better understand HpyA and HstA functions, protein-DNA binding data (chromatin immunoprecipitation sequencing [ChIP-seq]) of these halophilic histones are compared to publicly available ChIP-seq data from DNA binding proteins across all domains of life, including transcription factors (TFs), nucleoid-associated proteins (NAPs), and histones. These analyses demonstrate that HpyA and HstA bind the genome infrequently in discrete regions, which is similar to TFs but unlike NAPs, which bind a much larger genomic fraction. However, unlike TFs that typically bind in intergenic regions, HpyA and HstA binding sites are located in both coding and intergenic regions. The genome-wide dinucleotide periodicity known to facilitate histone binding was undetectable in the genomes of both species. Instead, TF-like and histone-like binding sequence preferences were detected for HstA and HpyA, respectively. Taken together, these data suggest that halophilic archaeal histones are unlikely to facilitate genome-wide chromatin formation and that their function defies categorization as a TF, NAP, or histone.
    IMPORTANCE Most cells in eukaryotic species, from yeast to humans, possess histone proteins that pack and unpack DNA in response to environmental cues. These essential proteins regulate genes necessary for important cellular processes, including development and stress protection. Although the histone fold domain originated in the domain of life Archaea, the function of archaeal histone-like proteins is not well understood relative to those of eukaryotes. We recently discovered that, unlike histones of eukaryotes, histones in hypersaline-adapted archaeal species do not package DNA and can act as transcription factors (TFs) to regulate stress response gene expression. However, the function of histones across species of hypersaline-adapted archaea still remains unclear. Here, we compare hypersaline histone function to a variety of DNA binding proteins across the tree of life, revealing histone-like behavior in some respects and specific transcriptional regulatory function in others.
  5. The rapid evolution of repetitive DNA sequences, including satellite DNA, tandem duplications, and transposable elements, underlies phenotypic evolution and contributes to hybrid incompatibilities between species. However, repetitive genomic regions are fragmented and misassembled in most contemporary genome assemblies. We generated highly contiguous de novo reference genomes for the Drosophila simulans species complex (D. simulans, D. mauritiana, and D. sechellia), which speciated ∼250,000 yr ago. Our assemblies are comparable in contiguity and accuracy to the current D. melanogaster genome, allowing us to directly compare repetitive sequences between these four species. We find that at least 15% of the D. simulans complex species genomes fail to align uniquely to D. melanogaster owing to structural divergence, twice the number of single-nucleotide substitutions. We also find rapid turnover of satellite DNA and extensive structural divergence in heterochromatic regions, whereas the euchromatic gene content is mostly conserved. Despite the overall preservation of gene synteny, euchromatin in each species has been shaped by clade- and species-specific inversions, transposable elements, expansions and contractions of satellite and tRNA tandem arrays, and gene duplications. We also find rapid divergence among Y-linked genes, including copy number variation and recent gene duplications from autosomes. Our assemblies provide a valuable resource for studying genome evolution and its consequences for phenotypic evolution in these genetic model species.