Title: High levels of intra-strain structural variation in Drosophila simulans X pericentric heterochromatin
Abstract Large genome structural variations can impact genome regulation and integrity. Repeat-rich regions like pericentric heterochromatin are vulnerable to structural rearrangements, although we know little about how often these rearrangements occur over evolutionary time. Repetitive genome regions are particularly difficult to study with genomic approaches, as they are missing from most genome assemblies. However, cytogenetic approaches offer a direct way to detect large rearrangements involving pericentric heterochromatin. Here, we use a cytogenetic approach to reveal large structural rearrangements associated with the X pericentromeric region of Drosophila simulans. These rearrangements involve large blocks of satellite DNA (the 500-bp and Rsp-like satellites), which colocalize in the X pericentromeric heterochromatin. We find that this region is polymorphic not only among different strains, but also between isolates of the same strain from different labs, and even within individual isolates. On the one hand, our observations raise questions regarding the potential impact of such variation at the phenotypic level and our ability to control for such genetic variability. On the other hand, this highlights the very rapid turnover of the pericentric heterochromatin, most likely associated with genomic instability of the X pericentromere. It represents a unique opportunity to study the dynamics of pericentric heterochromatin and the evolution of associated satellites on a very short time scale, and to better understand how structural variation arises.
Award ID(s):
1844693
PAR ID:
10476787
Author(s) / Creator(s):
;
Editor(s):
Bateman, J
Publisher / Repository:
Oxford
Date Published:
Journal Name:
GENETICS
ISSN:
1943-2631
Subject(s) / Keyword(s):
structural variation pericentromeric heterochromatin satellite repeats Drosophila
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Pericentromeric heterochromatin in Drosophila generally consists of repetitive DNA, forming the environment associated with gene silencing. Despite the expanding knowledge of the impact of transposable elements (TEs) on the host genome, little is known about the evolution of pericentromeric heterochromatin, its structural composition, and age. During the evolution of the Drosophilidae, hundreds of genes have become embedded within pericentromeric regions yet retained activity. We investigated a pericentromeric heterochromatin fragment found in D. virilis and related species, describing the evolution of genes in this region and the age of TE invasion. Regardless of the heterochromatic environment, the amino acid composition of the genes is under purifying selection. However, the selective pressure affects parts of genes to varying degrees, resulting in expansion of gene introns due to TE invasion. According to the divergence of TEs, the pericentromeric heterochromatin of the species of the virilis group began to form more than 20 million years ago by invasions of retroelements, miniature inverted repeat transposable elements (MITEs), and Helitrons. Importantly, invasions into the heterochromatin continue to occur by TEs that fall under the scope of piRNA silencing. Thus, the pericentromeric heterochromatin, in spite of its ability to induce silencing, has the means to be dynamic, incorporating regions of active transcription.
  2. Summary Eukaryotic genomes harbor many forms of variation, including nucleotide diversity and structural polymorphisms, which experience natural selection and contribute to genome evolution and biodiversity. However, harnessing this variation for agriculture hinges on our ability to detect, quantify, catalog, and utilize genetic diversity. Here, we explore seven complete genomes of the emerging biofuel crop pennycress (Thlaspi arvense) drawn from across the species' current genetic diversity to catalog variation in genome structure and content. Across this new pangenome resource, we find contrasting evolutionary modes in different genomic regions. Gene-poor, repeat-rich pericentromeric regions experience frequent rearrangements, including repeated centromere repositioning. In contrast, conserved gene-dense chromosome arms maintain large-scale synteny across accessions, even in fast-evolving immune genes where microsynteny breaks down across species but the macrosynteny of gene cluster positioning is maintained. Our findings highlight that multiple elements of the genome experience dynamic evolution that conserves functional content on the chromosome scale but allows rearrangement and presence-absence variation on a local scale. This diversity is invisible to classical reference-based approaches and highlights the strength and utility of pangenomic resources. These results provide a valuable case study of rapid genomic structural evolution within a species and powerful resources for crop development in an emerging biofuel crop.
  3. Pericentromeric heterochromatin is mostly composed of repetitive DNA sequences prone to aberrant recombination. Cells have developed highly specialized mechanisms to enable ‘safe’ homologous recombination (HR) repair while preventing aberrant recombination in this domain. Understanding heterochromatin repair responses is essential to understanding the critical mechanisms responsible for genome integrity and tumor suppression. Here, we review the tools, approaches, and methods currently available to investigate double-strand break (DSB) repair in pericentromeric regions, and also suggest how technologies recently developed for euchromatin repair studies can be adapted to characterize responses in heterochromatin. With this ever-growing toolkit, we are witnessing exciting progress in our understanding of how the ‘dark matter’ of the genome is repaired, greatly improving our understanding of genome stability mechanisms. 
  4. Engel, P (Ed.)
    Abstract Lactobacillaceae are an important family of lactic acid bacteria that play key roles in the gut microbiome of many animal species. In the honey bee (Apis mellifera) gut microbiome, many species of Lactobacillaceae are found, and there is functionally important strain-level variation in the bacteria. In this study, we completed whole-genome sequencing of 3 unique Lactobacillaceae isolates collected from hives in Virginia, USA. Using 107 genomes of known bee-associated Lactobacillaceae and Limosilactobacillus reuteri as an outgroup, the phylogenetics of the 3 isolates was assessed, and these isolates were identified as novel strains of Apilactobacillus kunkeei, Lactobacillus kullabergensis, and Bombilactobacillus mellis. Genome rearrangements, conserved orthologous genes (COG) categories and potential prophage regions were identified across the 3 novel strains. The new A. kunkeei strain was enriched in genes related to replication, recombination and repair, the L. kullabergensis strain was enriched for carbohydrate transport, and the B. mellis strain was enriched in transcription or transcriptional regulation and in some genes with unknown functions. Prophage regions were identified in the A. kunkeei and L. kullabergensis isolates. These new bee-associated strains add to our growing knowledge of the honey bee gut microbiome, and to Lactobacillaceae genomics more broadly. 
  5. INTRODUCTION To faithfully distribute genetic material to daughter cells during cell division, spindle fibers must couple to DNA by means of a structure called the kinetochore, which assembles at each chromosome’s centromere. Human centromeres are located within large arrays of tandemly repeated DNA sequences known as alpha satellite (αSat), which often span millions of base pairs on each chromosome. Arrays of αSat are frequently surrounded by other types of tandem satellite repeats, which have poorly understood functions, along with nonrepetitive sequences, including transcribed genes. Previous genome sequencing efforts have been unable to generate complete assemblies of satellite-rich regions because of their scale and repetitive nature, limiting the ability to study their organization, variation, and function.
    RATIONALE Pericentromeric and centromeric (peri/centromeric) satellite DNA sequences have remained almost entirely missing from the assembled human reference genome for the past 20 years. Using a complete, telomere-to-telomere (T2T) assembly of a human genome, we developed and deployed tailored computational approaches to reveal the organization and evolutionary patterns of these satellite arrays at both large and small length scales. We also performed experiments to map precisely which αSat repeats interact with kinetochore proteins. Last, we compared peri/centromeric regions among multiple individuals to understand how these sequences vary across diverse genetic backgrounds.
    RESULTS Satellite repeats constitute 6.2% of the T2T-CHM13 genome assembly, with αSat representing the single largest component (2.8% of the genome). By studying the sequence relationships of αSat repeats in detail across each centromere, we found genome-wide evidence that human centromeres evolve through “layered expansions.” Specifically, distinct repetitive variants arise within each centromeric region and expand through mechanisms that resemble successive tandem duplications, whereas older flanking sequences shrink and diverge over time. We also revealed that the most recently expanded repeats within each αSat array are more likely to interact with the inner kinetochore protein Centromere Protein A (CENP-A), which coincides with regions of reduced CpG methylation. This suggests a strong relationship between local satellite repeat expansion, kinetochore positioning, and DNA hypomethylation. Furthermore, we uncovered large and unexpected structural rearrangements that affect multiple satellite repeat types, including active centromeric αSat arrays. Last, by comparing sequence information from nearly 1600 individuals’ X chromosomes, we observed that individuals with recent African ancestry possess the greatest genetic diversity in the region surrounding the centromere, which sometimes contains a predominantly African αSat sequence variant.
    CONCLUSION The genetic and epigenetic properties of centromeres are closely interwoven through evolution. These findings raise important questions about the specific molecular mechanisms responsible for the relationship between inner kinetochore proteins, DNA hypomethylation, and layered αSat expansions. Even more questions remain about the function and evolution of non-αSat repeats. To begin answering these questions, we have produced a comprehensive encyclopedia of peri/centromeric sequences in a human genome, and we demonstrated how these regions can be studied with modern genomic tools. Our work also illuminates the rich genetic variation hidden within these formerly missing regions of the genome, which may contribute to health and disease. This unexplored variation underlines the need for more T2T human genome assemblies from genetically diverse individuals.
    Figure: Gapless assemblies illuminate centromere evolution. (Top) The organization of peri/centromeric satellite repeats. (Bottom left) A schematic portraying (i) evidence for centromere evolution through layered expansions and (ii) the localization of inner-kinetochore proteins in the youngest, most recently expanded repeats, which coincide with a region of DNA hypomethylation. (Bottom right) An illustration of the global distribution of chrX centromere haplotypes, showing increased diversity in populations with recent African ancestry.