

Title: Koala methylomes reveal divergent and conserved DNA methylation signatures of X chromosome regulation
X chromosome inactivation (XCI) mediated by differential DNA methylation between sexes is an iconic example of epigenetic regulation. Although XCI is shared between eutherians and marsupials, the role of DNA methylation in marsupial XCI remains contested. Here, we examine genome-wide signatures of DNA methylation across five tissues from a male and female koala (Phascolarctos cinereus), and present the first whole-genome, multi-tissue marsupial ‘methylome atlas’. Using these novel data, we elucidate divergent versus common features of representative marsupial and eutherian DNA methylation. First, tissue-specific differential DNA methylation in koalas primarily occurs in gene bodies. Second, females show significant global reduction (hypomethylation) of X chromosome DNA methylation compared to males. We show that this pattern is also observed in eutherians. Third, on average, promoter DNA methylation shows little difference between male and female koala X chromosomes, a pattern distinct from that of eutherians. Fourth, the sex-specific DNA methylation landscape upstream of Rsx, the primary lncRNA associated with marsupial XCI, is consistent with the epigenetic regulation of female-specific (and presumably inactive X chromosome-specific) expression. Finally, we use the prominent female X chromosome hypomethylation to classify 98 previously unplaced scaffolds as X-linked, contributing an additional 14.6 Mb (21.5%) to genomic data annotated as the koala X chromosome. Our work demonstrates evolutionarily divergent pathways leading to functionally conserved patterns of XCI in two deep branches of mammals.
Award ID(s):
1818288
NSF-PAR ID:
10278192
Journal Name:
Proceedings of the Royal Society B: Biological Sciences
Volume:
288
Issue:
1945
ISSN:
0962-8452
Page Range / eLocation ID:
20202244
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. INTRODUCTION To faithfully distribute genetic material to daughter cells during cell division, spindle fibers must couple to DNA by means of a structure called the kinetochore, which assembles at each chromosome’s centromere. Human centromeres are located within large arrays of tandemly repeated DNA sequences known as alpha satellite (αSat), which often span millions of base pairs on each chromosome. Arrays of αSat are frequently surrounded by other types of tandem satellite repeats, which have poorly understood functions, along with nonrepetitive sequences, including transcribed genes. Previous genome sequencing efforts have been unable to generate complete assemblies of satellite-rich regions because of their scale and repetitive nature, limiting the ability to study their organization, variation, and function. RATIONALE Pericentromeric and centromeric (peri/centromeric) satellite DNA sequences have remained almost entirely missing from the assembled human reference genome for the past 20 years. Using a complete, telomere-to-telomere (T2T) assembly of a human genome, we developed and deployed tailored computational approaches to reveal the organization and evolutionary patterns of these satellite arrays at both large and small length scales. We also performed experiments to map precisely which αSat repeats interact with kinetochore proteins. Last, we compared peri/centromeric regions among multiple individuals to understand how these sequences vary across diverse genetic backgrounds. RESULTS Satellite repeats constitute 6.2% of the T2T-CHM13 genome assembly, with αSat representing the single largest component (2.8% of the genome). 
By studying the sequence relationships of αSat repeats in detail across each centromere, we found genome-wide evidence that human centromeres evolve through “layered expansions.” Specifically, distinct repetitive variants arise within each centromeric region and expand through mechanisms that resemble successive tandem duplications, whereas older flanking sequences shrink and diverge over time. We also revealed that the most recently expanded repeats within each αSat array are more likely to interact with the inner kinetochore protein Centromere Protein A (CENP-A), which coincides with regions of reduced CpG methylation. This suggests a strong relationship between local satellite repeat expansion, kinetochore positioning, and DNA hypomethylation. Furthermore, we uncovered large and unexpected structural rearrangements that affect multiple satellite repeat types, including active centromeric αSat arrays. Last, by comparing sequence information from nearly 1600 individuals’ X chromosomes, we observed that individuals with recent African ancestry possess the greatest genetic diversity in the region surrounding the centromere, which sometimes contains a predominantly African αSat sequence variant. CONCLUSION The genetic and epigenetic properties of centromeres are closely interwoven through evolution. These findings raise important questions about the specific molecular mechanisms responsible for the relationship between inner kinetochore proteins, DNA hypomethylation, and layered αSat expansions. Even more questions remain about the function and evolution of non-αSat repeats. To begin answering these questions, we have produced a comprehensive encyclopedia of peri/centromeric sequences in a human genome, and we demonstrated how these regions can be studied with modern genomic tools. Our work also illuminates the rich genetic variation hidden within these formerly missing regions of the genome, which may contribute to health and disease. 
This unexplored variation underlines the need for more T2T human genome assemblies from genetically diverse individuals. Gapless assemblies illuminate centromere evolution. (Top) The organization of peri/centromeric satellite repeats. (Bottom left) A schematic portraying (i) evidence for centromere evolution through layered expansions and (ii) the localization of inner-kinetochore proteins in the youngest, most recently expanded repeats, which coincide with a region of DNA hypomethylation. (Bottom right) An illustration of the global distribution of chrX centromere haplotypes, showing increased diversity in populations with recent African ancestry.
  2. Abstract DNA methylation is a critical regulatory mechanism implicated in development, learning, memory, and disease in the human brain. Here we have elucidated DNA methylation changes during recent human brain evolution. We demonstrate dynamic evolutionary trajectories of DNA methylation in a cell-type- and cytosine-context-specific manner. Specifically, DNA methylation in non-CG context, namely CH methylation, has increased (hypermethylation) in neuronal gene bodies during human brain evolution, contributing to human-specific down-regulation of genes and co-expression modules. The effects of CH hypermethylation are particularly pronounced in early development and neuronal subtypes. In contrast, DNA methylation in CG context shows pronounced reduction (hypomethylation) in human brains, notably in cis-regulatory regions, leading to upregulation of downstream genes. We show that the majority of differential CG methylation between neurons and oligodendrocytes originated before the divergence of hominoids and catarrhine monkeys, and harbors a strong signal for genetic risk for schizophrenia. Remarkably, a substantial portion of differential CG methylation between neurons and oligodendrocytes emerged in the human lineage since the divergence from the chimpanzee lineage and carries significant genetic risk for schizophrenia. Therefore, recent epigenetic evolution of human cortex has shaped the cellular regulatory landscape and contributed to the increased vulnerability to neuropsychiatric diseases.
  3. Abstract

    Much of our knowledge on regulatory impacts of DNA methylation has come from laboratory‐bred model organisms, which may not exhibit the full extent of variation found in wild populations. Here, we investigated naturally‐occurring variation in DNA methylation in a wild avian species, the white‐throated sparrow (Zonotrichia albicollis). This species offers exceptional opportunities for studying the link between genetic differentiation and phenotypic traits because of a nonrecombining chromosome pair linked to both plumage and behavioural phenotypes. Using novel single‐nucleotide resolution methylation maps and gene expression data, we show that DNA methylation and the expression of DNA methyltransferases are significantly higher in adults than in nestlings. Genes for which DNA methylation varied between nestlings and adults were implicated in development and cell differentiation and were located throughout the genome. In contrast, differential methylation between plumage morphs was concentrated in the nonrecombining chromosome pair. Interestingly, a large number of CpGs on the nonrecombining chromosome, localized to transposable elements, have undergone dramatic loss of DNA methylation since the split of the ZAL2 and ZAL2m chromosomes. Changes in methylation predicted changes in gene expression for both chromosomes. In summary, we demonstrate changes in genome‐wide DNA methylation that are associated with development and with specific functional categories of genes in white‐throated sparrows. Moreover, we observe substantial DNA methylation reprogramming associated with the suppression of recombination, with implications for genome integrity and gene expression divergence. These results offer an unprecedented view of ongoing epigenetic reprogramming in a wild population.

     
  4. INTRODUCTION Transposable elements (TEs), repeat expansions, and repeat-mediated structural rearrangements play key roles in chromosome structure and species evolution, contribute to human genetic variation, and substantially influence human health through copy number variants, structural variants, insertions, deletions, and alterations to gene transcription and splicing. Despite their formative role in genome stability, repetitive regions have been relegated to gaps and collapsed regions in human genome reference GRCh38 owing to the technological limitations during its development. The lack of linear sequence in these regions, particularly in centromeres, resulted in the inability to fully explore the repeat content of the human genome in the context of both local and regional chromosomal environments. RATIONALE Long-read sequencing supported the complete, telomere-to-telomere (T2T) assembly of the pseudo-haploid human cell line CHM13. This resource affords a genome-scale assessment of all human repetitive sequences, including TEs and previously unknown repeats and satellites, both within and outside of gaps and collapsed regions. Additionally, a complete genome enables the opportunity to explore the epigenetic and transcriptional profiles of these elements that are fundamental to our understanding of chromosome structure, function, and evolution. Comparative analyses reveal modes of repeat divergence, evolution, and expansion or contraction with locus-level resolution. RESULTS We implemented a comprehensive repeat annotation workflow using previously known human repeats and de novo repeat modeling followed by manual curation, including assessing overlaps with gene annotations, segmental duplications, tandem repeats, and annotated repeats. Using this method, we developed an updated catalog of human repetitive sequences and refined previous repeat annotations. 
We discovered 43 previously unknown repeats and repeat variants and characterized 19 complex, composite repetitive structures, which often carry genes, across T2T-CHM13. Using precision nuclear run-on sequencing (PRO-seq) and CpG methylated sites generated from Oxford Nanopore Technologies long-read sequencing data, we assessed RNA polymerase engagement across retroelements genome-wide, revealing correlations between nascent transcription, sequence divergence, CpG density, and methylation. These analyses were extended to evaluate RNA polymerase occupancy for all repeats, including high-density satellite repeats that reside in previously inaccessible centromeric regions of all human chromosomes. Moreover, using both mapping-dependent and mapping-independent approaches across early developmental stages and a complete cell cycle time series, we found that engaged RNA polymerase across satellites is low; in contrast, TE transcription is abundant and serves as a boundary for changes in CpG methylation and centromere substructure. Together, these data reveal the dynamic relationship between transcriptionally active retroelement subclasses and DNA methylation, as well as potential mechanisms for the derivation and evolution of new repeat families and composite elements. Focusing on the emerging T2T-level assembly of the HG002 X chromosome, we reveal that a high level of repeat variation likely exists across the human population, including composite element copy numbers that affect gene copy number. Additionally, we highlight the impact of repeats on the structural diversity of the genome, revealing repeat expansions with extreme copy number differences between humans and primates while also providing high-confidence annotations of retroelement transduction events. 
CONCLUSION The comprehensive repeat annotations and updated repeat models described herein serve as a resource for expanding the compendium of human genome sequences and reveal the impact of specific repeats on the human genome. In developing this resource, we provide a methodological framework for assessing repeat variation within and between human genomes. The exhaustive assessment of the transcriptional landscape of repeats, at both the genome scale and locally, such as within centromeres, sets the stage for functional studies to disentangle the role transcription plays in the mechanisms essential for genome stability and chromosome segregation. Finally, our work demonstrates the need to increase efforts toward achieving T2T-level assemblies for nonhuman primates and other species to fully understand the complexity and impact of repeat-derived genomic innovations that define primate lineages, including humans. Telomere-to-telomere assembly of CHM13 supports repeat annotations and discoveries. The human reference T2T-CHM13 filled gaps and corrected collapsed regions (triangles) in GRCh38. Combining long read–based methylation calls, PRO-seq, and multilevel computational methods, we provide a compendium of human repeats, define retroelement expression and methylation profiles, and delineate locus-specific sites of nascent transcription genome-wide, including previously inaccessible centromeres. SINE, short interspersed element; SVA, SINE–variable number tandem repeat–Alu; LINE, long interspersed element; LTR, long terminal repeat; TSS, transcription start site; pA, xxxxxxxxxxxxxxxx.
  5. Abstract Background

    Environmental fluctuation during embryonic and fetal development can permanently alter an organism’s morphology, physiology, and behaviour. This phenomenon, known as developmental plasticity, is particularly relevant to reptiles that develop in subterranean nests with variable oxygen tensions. Previous work has shown hypoxia permanently alters the cardiovascular system of snapping turtles and may improve cardiac anoxia tolerance later in life. The mechanisms driving this process are unknown but may involve epigenetic regulation of gene expression via DNA methylation. To test this hypothesis, we assessed in situ cardiac performance during 2 h of acute anoxia in juvenile turtles previously exposed to normoxia (21% oxygen) or hypoxia (10% oxygen) during embryogenesis. Next, we analysed DNA methylation and gene expression patterns in turtles from the same cohorts using whole genome bisulfite sequencing, which represents the first high-resolution investigation of DNA methylation patterns in any reptilian species.

    Results

    Genome-wide correlations between CpG and CpG island methylation and gene expression patterns in the snapping turtle were consistent with patterns observed in mammals. As hypothesized, developmental hypoxia increased juvenile turtle cardiac anoxia tolerance and programmed DNA methylation and gene expression patterns. Programmed differences in expression of genes such as SCN5A may account for differences in heart rate, while genes such as TNNT2 and TPM3 may underlie differences in calcium sensitivity and contractility of cardiomyocytes and cardiac inotropy. Finally, we identified putative transcription factor-binding sites in promoters and in differentially methylated CpG islands that suggest a model linking programming of DNA methylation during embryogenesis to differential gene expression and cardiovascular physiology later in life. Binding sites for hypoxia inducible factors (HIF1A, ARNT, and EPAS1) and key transcription factors activated by MAPK and BMP signaling (RREB1 and SMAD4) are implicated.

    Conclusions

    Our data strongly suggest that DNA methylation plays a conserved role in the regulation of gene expression in reptiles. We also show that embryonic hypoxia programs DNA methylation and gene expression patterns and that these changes are associated with enhanced cardiac anoxia tolerance later in life. Programming of cardiac anoxia tolerance has major ecological implications for snapping turtles, because these animals regularly exploit anoxic environments throughout their lifespan.

     