

Title: The Evolution of Widespread Recombination Suppression on the Dwarf Hamster (Phodopus) X Chromosome
Abstract The X chromosome of therian mammals shows strong conservation among distantly related species, limiting insights into the distinct selective processes that have shaped sex chromosome evolution. We constructed a chromosome-scale de novo genome assembly for the Siberian dwarf hamster (Phodopus sungorus), a species reported to show extensive recombination suppression across an entire arm of the X chromosome. Combining a physical genome assembly based on shotgun and long-range proximity ligation sequencing with a dense genetic map, we detected widespread suppression of female recombination across ∼65% of the Phodopus X chromosome. This region of suppressed recombination likely corresponds to the Xp arm, which has previously been shown to be highly heterochromatic. Using additional sequencing data from two closely related species (P. campbelli and P. roborovskii), we show that recombination suppression on Xp appears to be independent of major structural rearrangements. The suppressed Xp arm was enriched for several transposable element families and de-enriched for genes primarily expressed in placenta, but otherwise showed similar gene densities, expression patterns, and rates of molecular evolution when compared to the recombinant Xq arm. Phodopus Xp gene content and order were also broadly conserved relative to the more distantly related rat X chromosome. These data suggest that widespread suppression of recombination has likely evolved through the transient induction of facultative heterochromatin on the Phodopus Xp arm without major changes in chromosome structure or genetic content. Thus, substantial changes in the recombination landscape have so far had relatively subtle influences on patterns of X-linked molecular evolution in these species.
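The abstract describes comparing a physical assembly with a genetic map to find regions of suppressed recombination. As a rough illustration of that idea only (not the authors' pipeline), the sketch below computes the local recombination rate in cM/Mb between adjacent markers and the fraction of the covered physical span that falls below a cutoff. The function names, the toy marker positions, and the 0.05 cM/Mb threshold are all assumptions made for this example.

```python
def interval_rates(markers):
    """Return (start_mb, end_mb, cM_per_Mb) for each pair of adjacent markers.

    `markers` is a list of (physical_position_mb, genetic_position_cm)
    tuples sorted by physical position. Hypothetical input format.
    """
    rates = []
    for (p0, g0), (p1, g1) in zip(markers, markers[1:]):
        span = p1 - p0
        if span <= 0:
            # Skip markers that share a physical position.
            continue
        rates.append((p0, p1, (g1 - g0) / span))
    return rates


def suppressed_fraction(markers, threshold=0.05):
    """Fraction of the covered physical span with rate below `threshold` cM/Mb.

    The threshold is an arbitrary illustrative cutoff, not a value
    taken from the study.
    """
    rates = interval_rates(markers)
    covered = sum(end - start for start, end, _ in rates)
    low = sum(end - start for start, end, rate in rates if rate < threshold)
    return low / covered if covered else 0.0


# Toy data (invented): almost no genetic distance accumulates over the
# first 60 Mb, then recombination resumes.
toy_markers = [(0, 0.0), (20, 0.0), (40, 0.1), (60, 0.2), (80, 12.0), (100, 25.0)]

if __name__ == "__main__":
    for start, end, rate in interval_rates(toy_markers):
        print(f"{start}-{end} Mb: {rate:.3f} cM/Mb")
    print("suppressed fraction:", suppressed_fraction(toy_markers))
```

A real analysis would work from a sex-specific genetic map and would need to handle marker order errors and map inflation, which this sketch ignores.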
Award ID(s):
1754096
NSF-PAR ID:
10443713
Author(s) / Creator(s):
; ; ; ; ; ; ;
Editor(s):
Mank, Judith
Date Published:
Journal Name:
Genome Biology and Evolution
Volume:
14
Issue:
6
ISSN:
1759-6653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. INTRODUCTION Transposable elements (TEs), repeat expansions, and repeat-mediated structural rearrangements play key roles in chromosome structure and species evolution, contribute to human genetic variation, and substantially influence human health through copy number variants, structural variants, insertions, deletions, and alterations to gene transcription and splicing. Despite their formative role in genome stability, repetitive regions have been relegated to gaps and collapsed regions in human genome reference GRCh38 owing to the technological limitations during its development. The lack of linear sequence in these regions, particularly in centromeres, resulted in the inability to fully explore the repeat content of the human genome in the context of both local and regional chromosomal environments. RATIONALE Long-read sequencing supported the complete, telomere-to-telomere (T2T) assembly of the pseudo-haploid human cell line CHM13. This resource affords a genome-scale assessment of all human repetitive sequences, including TEs and previously unknown repeats and satellites, both within and outside of gaps and collapsed regions. Additionally, a complete genome enables the opportunity to explore the epigenetic and transcriptional profiles of these elements that are fundamental to our understanding of chromosome structure, function, and evolution. Comparative analyses reveal modes of repeat divergence, evolution, and expansion or contraction with locus-level resolution. RESULTS We implemented a comprehensive repeat annotation workflow using previously known human repeats and de novo repeat modeling followed by manual curation, including assessing overlaps with gene annotations, segmental duplications, tandem repeats, and annotated repeats. Using this method, we developed an updated catalog of human repetitive sequences and refined previous repeat annotations. 
We discovered 43 previously unknown repeats and repeat variants and characterized 19 complex, composite repetitive structures, which often carry genes, across T2T-CHM13. Using precision nuclear run-on sequencing (PRO-seq) and CpG methylated sites generated from Oxford Nanopore Technologies long-read sequencing data, we assessed RNA polymerase engagement across retroelements genome-wide, revealing correlations between nascent transcription, sequence divergence, CpG density, and methylation. These analyses were extended to evaluate RNA polymerase occupancy for all repeats, including high-density satellite repeats that reside in previously inaccessible centromeric regions of all human chromosomes. Moreover, using both mapping-dependent and mapping-independent approaches across early developmental stages and a complete cell cycle time series, we found that engaged RNA polymerase across satellites is low; in contrast, TE transcription is abundant and serves as a boundary for changes in CpG methylation and centromere substructure. Together, these data reveal the dynamic relationship between transcriptionally active retroelement subclasses and DNA methylation, as well as potential mechanisms for the derivation and evolution of new repeat families and composite elements. Focusing on the emerging T2T-level assembly of the HG002 X chromosome, we reveal that a high level of repeat variation likely exists across the human population, including composite element copy numbers that affect gene copy number. Additionally, we highlight the impact of repeats on the structural diversity of the genome, revealing repeat expansions with extreme copy number differences between humans and primates while also providing high-confidence annotations of retroelement transduction events. 
CONCLUSION The comprehensive repeat annotations and updated repeat models described herein serve as a resource for expanding the compendium of human genome sequences and reveal the impact of specific repeats on the human genome. In developing this resource, we provide a methodological framework for assessing repeat variation within and between human genomes. The exhaustive assessment of the transcriptional landscape of repeats, at both the genome scale and locally, such as within centromeres, sets the stage for functional studies to disentangle the role transcription plays in the mechanisms essential for genome stability and chromosome segregation. Finally, our work demonstrates the need to increase efforts toward achieving T2T-level assemblies for nonhuman primates and other species to fully understand the complexity and impact of repeat-derived genomic innovations that define primate lineages, including humans. Telomere-to-telomere assembly of CHM13 supports repeat annotations and discoveries. The human reference T2T-CHM13 filled gaps and corrected collapsed regions (triangles) in GRCh38. Combining long read–based methylation calls, PRO-seq, and multilevel computational methods, we provide a compendium of human repeats, define retroelement expression and methylation profiles, and delineate locus-specific sites of nascent transcription genome-wide, including previously inaccessible centromeres. SINE, short interspersed element; SVA, SINE–variable number tandem repeat–Alu; LINE, long interspersed element; LTR, long terminal repeat; TSS, transcription start site; pA, xxxxxxxxxxxxxxxx.
  2. Abstract Background

    The Aldabra giant tortoise (Aldabrachelys gigantea) is one of only two giant tortoise species left in the world. The species is endemic to Aldabra Atoll in Seychelles and is listed as Vulnerable on the International Union for Conservation of Nature Red List (v2.3) due to its limited distribution and threats posed by climate change. Genomic resources for A. gigantea are lacking, hampering conservation efforts for both wild and ex situ populations. A high-quality genome would also open avenues to investigate the genetic basis of the species’ exceptionally long life span.

    Findings

    We produced the first chromosome-level de novo genome assembly of A. gigantea using PacBio High-Fidelity sequencing and high-throughput chromosome conformation capture. We produced a 2.37-Gbp assembly with a scaffold N50 of 148.6 Mbp and a resolution into 26 chromosomes. RNA sequencing–assisted gene model prediction identified 23,953 protein-coding genes and 1.1 Gbp of repetitive sequences. Synteny analyses among turtle genomes revealed high levels of chromosomal collinearity even among distantly related taxa. To assess the utility of the high-quality assembly for species conservation, we performed a low-coverage resequencing of 30 individuals from wild populations and two zoo individuals. Our genome-wide population structure analyses detected genetic population structure in the wild and identified the most likely origin of the zoo-housed individuals. We further identified putatively deleterious mutations to be monitored.

    Conclusions

    We establish a high-quality chromosome-level reference genome for A. gigantea and one of the most complete turtle genomes available. We show that low-coverage whole-genome resequencing, for which alignment to the reference genome is a necessity, is a powerful tool to assess the population structure of the wild population and reveal the geographic origins of ex situ individuals relevant for genetic diversity management and rewilding efforts.
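    The Findings above report a scaffold N50 of 148.6 Mbp. For readers unfamiliar with the statistic, the sketch below shows how an N50 is commonly computed from a list of scaffold lengths: sort the lengths in descending order and return the length at which the running total first reaches half of the assembly. The function name and the toy lengths are assumptions for illustration and are not taken from the assembly described here.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    The N50 is the length L such that scaffolds of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy scaffold lengths (invented, arbitrary units).
toy_lengths = [50, 40, 30, 20, 10]

if __name__ == "__main__":
    print("N50:", n50(toy_lengths))
```

    Because N50 depends only on the length distribution, it says nothing about assembly correctness, which is why the abstract also reports chromosome count and gene content.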

     
  3. Sex chromosome dosage compensation is a model to understand the coordinated evolution of transcription; however, the advanced age of the sex chromosomes in model systems makes it difficult to study how the complex regulatory mechanisms underlying chromosome-wide dosage compensation can evolve. The sex chromosomes of Poecilia picta have undergone recent and rapid divergence, resulting in widespread gene loss on the male Y, coupled with complete X Chromosome dosage compensation, the first case reported in a fish. The recent de novo origin of dosage compensation presents a unique opportunity to understand the genetic and evolutionary basis of coordinated chromosomal gene regulation. By combining a new chromosome-level assembly of P. picta with whole-genome bisulfite sequencing and RNA-seq data, we determine that the YY1 transcription factor (YY1) DNA binding motif is associated with male-specific hypomethylated regions on the X, but not the autosomes. These YY1 motifs are the result of a recent and rapid repetitive element expansion on the P. picta X Chromosome, which is absent in closely related species that lack dosage compensation. Taken together, our results present compelling support that a disruptive wave of repetitive element insertions carrying YY1 motifs resulted in the remodeling of the X Chromosome epigenomic landscape and the rapid de novo origin of a dosage compensation system.

     
  4. Genomic structural variants (SVs) can play important roles in adaptation and speciation. Yet the overall fitness effects of SVs are poorly understood, partly because accurate population-level identification of SVs requires multiple high-quality genome assemblies. Here, we use 31 chromosome-scale, haplotype-resolved genome assemblies of Theobroma cacao (an outcrossing, long-lived tree species that is the source of chocolate) to investigate the fitness consequences of SVs in natural populations. Among the 31 accessions, we find over 160,000 SVs, which together cover eight times more of the genome than single-nucleotide polymorphisms and short indels (125 versus 15 Mb). Our results indicate that a vast majority of these SVs are deleterious: they segregate at low frequencies and are depleted from functional regions of the genome. We show that SVs influence gene expression, which likely impairs gene function and contributes to the detrimental effects of SVs. We also provide empirical support for a theoretical prediction that SVs, particularly inversions, increase genetic load through the accumulation of deleterious nucleotide variants as a result of suppressed recombination. Despite the overall detrimental effects, we identify individual SVs bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations. Genes involved in pathogen resistance are strongly enriched among these candidates, highlighting the contribution of SVs to this important local adaptation trait. Beyond revealing empirical evidence for the evolutionary importance of SVs, these 31 de novo assemblies provide a valuable resource for genetic and breeding studies in T. cacao.

     
  5. Abstract

    Selfish genetic elements that manipulate gametogenesis to achieve a transmission advantage are known as meiotic drivers. Sex‐ratio X chromosomes (SR) are meiotic drivers that prevent the maturation of Y‐bearing sperm in male carriers to result in the production of mainly female progeny. The spread of an SR chromosome can affect host genetic diversity and genome evolution, and can even cause host extinction if it reaches sufficiently high prevalence. Meiotic drivers have evolved independently many times, though only in a few cases is the underlying genetic mechanism known. In this study we use a combination of transcriptomics and population genetics to identify widespread expression differences between the standard (ST) and sex‐ratio (SR) X chromosomes of the fly Drosophila neotestacea. We found the X chromosome is enriched for differentially expressed transcripts and that many of these X‐linked differentially expressed transcripts had elevated Ka/Ks values between ST and SR, indicative of potential functional differences. We identified a set of candidate transcripts, including a testis‐specific, X‐linked duplicate of the nuclear transport gene importin‐α2 that is overexpressed in SR. We find suggestions of positive selection in the lineage leading to the duplicate and that its molecular evolutionary patterns are consistent with relaxed purifying selection in ST. As these patterns are consistent with involvement in the mechanism of drive in this species, this duplicate is a strong candidate worthy of further functional investigation. Nuclear transport may be a common target for genetic conflict, as the mechanism of the autosomal Segregation Distorter drive system in D. melanogaster involves the same pathway.

     