This content will become publicly available on December 20, 2025

Title: The three-dimensional genome drives the evolution of asymmetric gene duplicates via enhancer capture-divergence
Previous evolutionary models of duplicate gene evolution have overlooked the pivotal role of genome architecture. Here, we show that proximity-based regulatory recruitment by distally duplicated genes is an efficient mechanism for modulating tissue-specific production of preexisting proteins. By leveraging genomic asymmetries, we performed a coexpression analysis on Drosophila melanogaster tissue data to show the generality of enhancer capture-divergence (ECD) as a significant evolutionary driver of asymmetric, distally duplicated genes. We use the recently evolved gene HP6/Umbrea as an example of the ECD process. By assaying genome-wide chromosomal conformations in multiple Drosophila species, we show that HP6/Umbrea was inserted near a preexisting, long-distance three-dimensional genomic interaction. We then use this data to identify a newly found enhancer (FLEE1), buried within the coding region of the highly conserved, essential gene MFS18, that likely neofunctionalized HP6/Umbrea. Last, we demonstrate ancestral transcriptional coregulation of HP6/Umbrea’s future insertion site, illustrating how enhancer capture provides a highly evolvable, one-step solution to Ohno’s dilemma.
Award ID(s):
2410289
PAR ID:
10580276
Author(s) / Creator(s):
Publisher / Repository:
American Association for the Advancement of Science
Date Published:
Journal Name:
Science Advances
Volume:
10
Issue:
51
ISSN:
2375-2548
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Saitou, Naruya (Ed.)
    Abstract Enhancers are often studied as noncoding regulatory elements that modulate the precise spatiotemporal expression of genes in a highly tissue-specific manner. This paradigm has been challenged by recent evidence of individual enhancers acting in multiple tissues or developmental contexts. However, the frequency of these enhancers with high degrees of “pleiotropy” out of all putative enhancers is not well understood. Consequently, it is unclear how the variation of enhancer pleiotropy corresponds to the variation in expression breadth of target genes. Here, we use multi-tissue chromatin maps from diverse human tissues to investigate the enhancer–gene interaction architecture while accounting for 1) the distribution of enhancer pleiotropy, 2) the variations of regulatory links from enhancers to target genes, and 3) the expression breadth of target genes. We show that most enhancers are tissue-specific and that highly pleiotropic enhancers account for <1% of all putative regulatory sequences in the human genome. Notably, several genomic features are indicative of increasing enhancer pleiotropy, including longer sequence length, greater number of links to genes, increasing abundance and diversity of encoded transcription factor motifs, and stronger evolutionary conservation. Intriguingly, the number of enhancers per gene remains remarkably consistent for all genes (∼14). However, enhancer pleiotropy does not directly translate to the expression breadth of target genes. We further present a series of Gaussian Mixture Models to represent this organization architecture. Consequently, we demonstrate that a modest trend of more pleiotropic enhancers targeting more broadly expressed genes can generate the observed diversity of expression breadths in the human genome.
  2. Transcriptional divergence of duplicated genes after whole genome duplication (WGD) has been described in many plant lineages and is often associated with subgenome dominance, a genome-wide mechanism. However, it is unknown what underlies the transcriptional divergence of duplicated genes in polyploid species that lack subgenome dominance. Soybean is a paleotetraploid with a WGD that occurred 5 to 13 Mya. Approximately 50% of the duplicated genes retained from this WGD exhibit transcriptional divergence. We developed accessible chromatin region (ACR) datasets from leaf, flower, and seed tissues using MNase-hypersensitivity sequencing. We validated enhancer function of several ACRs associated with known genes using CRISPR/Cas9-mediated genome editing. The ACR datasets were used to examine and correlate the transcriptional patterns of 17,111 pairs of duplicated genes in different tissues. We demonstrate that ACR dynamics are correlated with divergence of both expression level and tissue specificity of individual gene pairs. Gain or loss of flanking ACRs and mutation of cis-regulatory elements (CREs) within the ACRs can change the balance of the expression level and/or tissue specificity of the duplicated genes. Analysis of DNA sequences associated with ACRs revealed that the extensive sequence rearrangement after the WGD reshaped the CRE landscape, which appears to play a key role in the transcriptional divergence of duplicated genes in soybean. This may represent a general mechanism for transcriptional divergence of duplicated genes in polyploids that lack subgenome dominance.
  3. Abstract Gene duplication is a fundamental part of evolutionary innovation. While single-gene duplications frequently exhibit asymmetric evolutionary rates between paralogs, the extent to which this applies to multi-gene duplications remains unclear. In this study, we investigate the role of genetic context in shaping evolutionary divergence within multi-gene duplications, leveraging microsynteny to differentiate source and target copies. Using a dataset of 193 mammalian genome assemblies and a bird outgroup, we systematically analyze patterns of sequence divergence between duplicated genes and reference orthologs. We find that target copies, those relocated to new genomic environments, exhibit elevated evolutionary rates compared to source copies in the ancestral location. This asymmetry is influenced by the distance between copies and the size of the target copy. We also demonstrate that the polarization of rate asymmetry in paralogs, the “choice” of the slowly evolving copy, is biased towards collective, block-wise polarization in multi-gene duplications. Our findings highlight the importance of genetic context in modulating post-duplication divergence, where differences in cis-regulatory elements and co-expressed gene clusters between source and target copies may be responsible. This study presents a large-scale test of asymmetric evolution in multi-gene duplications, offering new insight into how genome architecture shapes functional diversification of paralogs.
    Significance statement: After a gene is duplicated, reduced selective constraints can lead the two copies to rapidly diverge, with one copy often evolving faster and occasionally gaining a new function. We quantify the influence of genetic context in choosing which copy of a duplicated gene has an elevated substitution rate. In a representative dataset of 193 mammalian genomes, we found strong evidence that gene copies pasted into new genomic locations tend to evolve faster than the corresponding copies in ancestral locations, suggesting an important role for the regulatory environment. The asymmetry in evolutionary rates of duplicated genes persists even for very large multigenic duplications, up to the scale of megabases, indicating that regulatory interactions frequently reach farther than previously thought.
  4. Yeh, Shu-Dan (Ed.)
    Abstract A thorough understanding of adaptation and speciation requires model organisms with both a history of ecological and phenotypic study as well as a complete set of genomic resources. In particular, high-quality genome assemblies of ecological model organisms are needed to assess the evolution of genome structure and its role in adaptation and speciation. Here, we generate new genomes of cactophilic Drosophila, a crucial model clade for understanding speciation and ecological adaptation in xeric environments. We generated chromosome-level genome assemblies and complete annotations for seven populations across Drosophila mojavensis, Drosophila arizonae, and Drosophila navojoa. We use these data first to establish the most robust phylogeny for this clade to date, and to assess patterns of molecular evolution across the phylogeny, showing concordance with a priori hypotheses regarding adaptive genes in this system. We then show that structural evolution occurs at constant rate across the phylogeny, varies by chromosome, and is correlated with molecular evolution. These results advance the understanding of the D. mojavensis clade by demonstrating core evolutionary genetic patterns and integrating those patterns to generate new gene-level hypotheses regarding adaptation. Our data are presented in a new public database (cactusflybase.arizona.edu), providing one of the most in-depth resources for the analysis of inter- and intraspecific evolutionary genomic data. Furthermore, we anticipate that the patterns of structural evolution identified here will serve as a baseline for future comparative studies to identify the factors that influence the evolution of genome structure across taxa. 
  5. Abstract The assembly of genomes from pooled samples of genetically heterogenous samples of conspecifics remains challenging. In this study, we show that high‐quality genome assemblies can be produced from samples of multiple wild‐caught individuals. We sequenced DNA extracted from a pooled sample of conspecific herbivorous insects (Hemiptera: Miridae: Tupiocoris notatus) acquired from a greenhouse infestation in Tucson, Arizona (in the range of 30–100 individuals; 0.5 mL tissue by volume) using PacBio highly accurate long reads (HiFi). The initial assembly contained multiple haplotigs (>85% BUSCOs duplicated), but duplicate contigs could be easily purged to reveal a highly complete assembly (95.6% BUSCO, 4.4% duplicated) that is highly contiguous by short‐read assembly standards (N50 = 675 kb; Largest contig = 4.3 Mb). We then used our assembly as the basis for a genome‐guided differential expression study of host plant‐specific transcriptional responses. We found thousands of genes (N = 4982) to be differentially expressed between our new data from individuals feeding on Datura wrightii (Solanaceae) and existing RNA‐seq data from Nicotiana attenuata (Solanaceae)‐fed individuals. We identified many of these genes as previously documented detoxification genes such as glutathione‐S‐transferases, cytochrome P450s, and UDP‐glucosyltransferases. Together our results show that long‐read sequencing of pooled samples can provide a cost‐effective genome assembly option for small insects and can provide insights into the genetic mechanisms underlying interactions between plants and herbivorous pests.