Title: Controlling for Variable Transposition Rate with an Age-Adjusted Site Frequency Spectrum
Abstract: Recognition of the important role of transposable elements (TEs) in eukaryotic genomes quickly led to a burgeoning literature modeling and estimating the effects of selection on TEs. Much of the empirical work on selection has focused on analyzing the site frequency spectrum (SFS) of TEs. But TE evolution differs from standard models in a number of ways that can impact the power and interpretation of the SFS. For example, rather than mutating under a clock-like model, transposition often occurs in bursts which can inflate particular frequency categories compared with expectations under a standard neutral model. If a TE burst has been recent, the excess of low-frequency polymorphisms can mimic the effect of purifying selection. Here, we investigate how transposition bursts affect the frequency distribution of TEs and the correlation between age and allele frequency. Using information on the TE age distribution, we propose an age-adjusted SFS to compare TEs and neutral polymorphisms to more effectively evaluate whether TEs are under selective constraints. We show that our approach can minimize instances of false inference of selective constraint, remains robust to simple demographic changes, and allows for a correct identification of even weak selection affecting TEs which experienced a transposition burst. The results presented here will help researchers working on TEs to more reliably identify the effects of selection on TEs without having to rely on the assumption of a constant transposition rate.
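To make the terms in the abstract concrete, the following is a minimal Python sketch of an unfolded SFS and of one simple way to condition it on age. It is an illustration only and not the authors' method: the abstract does not say how the age adjustment is computed, so the binning approach, the function names (`site_frequency_spectrum`, `age_stratified_sfs`), and the toy data are all assumptions.

```python
from collections import Counter


def site_frequency_spectrum(counts, n):
    """Unfolded SFS for a sample of n chromosomes.

    counts: number of sampled chromosomes carrying each insertion/derived allele.
    Returns a list where entry i-1 is the number of segregating sites
    found in exactly i chromosomes (1 <= i <= n-1); fixed or absent
    sites are ignored.
    """
    tally = Counter(c for c in counts if 0 < c < n)
    return [tally.get(i, 0) for i in range(1, n)]


def age_stratified_sfs(sites, n, bin_edges):
    """Hypothetical age conditioning: one SFS per age bin.

    sites: iterable of (count, age) pairs.
    bin_edges: ascending ages; bin k covers [bin_edges[k], bin_edges[k+1]).
    Sites whose age falls outside all bins are dropped.
    """
    n_bins = len(bin_edges) - 1
    binned = {k: [] for k in range(n_bins)}
    for count, age in sites:
        for k in range(n_bins):
            if bin_edges[k] <= age < bin_edges[k + 1]:
                binned[k].append(count)
                break
    return {k: site_frequency_spectrum(v, n) for k, v in binned.items()}


# Toy data (made up): (copies in a sample of 4 chromosomes, estimated age).
te_sites = [(1, 0.1), (1, 0.2), (2, 0.15), (3, 0.8)]
result = age_stratified_sfs(te_sites, n=4, bin_edges=[0.0, 0.5, 1.0])
print(result)  # {0: [2, 1, 0], 1: [0, 0, 1]}
```

Under this assumed scheme, TE and neutral-polymorphism spectra would be compared within the same age bin, so that a recent burst, which adds many young low-frequency insertions, is not mistaken for an excess of rare variants caused by purifying selection.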
Award ID(s):
1907343 1934384
PAR ID:
10321425
Author(s) / Creator(s):
; ; ;
Editor(s):
Betran, Esther
Date Published:
Journal Name:
Genome Biology and Evolution
Volume:
14
Issue:
2
ISSN:
1759-6653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Suh, Alexander; Chapman, Tracey (Ed.)
    Abstract It is unclear how mobile DNA sequences (transposable elements, hereafter TEs) invade eukaryotic genomes and reach stable copy numbers, as transposition can decrease host fitness. This challenge is particularly stark early in the invasion of a TE family at which point hosts may lack the specialized machinery to repress the spread of these TEs. One possibility (in addition to the evolution of host regulation of TEs) is that TE families may evolve to preferentially insert into chromosomal regions that are less likely to impact host fitness. This may allow the mean TE copy number to grow while minimizing the risk for host population extinction. To test this, we constructed simulations to explore how the transposition probability and insertion preference of a TE family influence the evolution of mean TE copy number and host population size, allowing for extinction. We find that the effect of a TE family’s insertion preference depends on a host’s ability to regulate this TE family. Without host repression, a neutral insertion preference increases the frequency of and decreases the time to population extinction. With host repression, a preference for neutral insertions minimizes the cumulative deleterious load, increases population fitness, and, ultimately, avoids triggering an extinction vortex. 
  2. Background Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive and deep sequencing technology has made it affordable to apply population genetic methods to whole genomes with methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE polymorphisms, particularly transposition events or non-reference insertion sites, can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools. Methods We have developed the tool RelocaTE2 for identification of TE insertion sites at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision. Results and Discussion The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a high level of sensitivity and specificity, particularly when the sequence coverage is not shallow. In comparison to other tools tested, RelocaTE2 achieves the best balance between sensitivity and specificity. In particular, RelocaTE2 performs best in prediction of TSDs for TE insertions. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with less than 0.1% false positive rate using 10-fold genome coverage resequencing data. RelocaTE2 provides a robust solution to identify TE insertion sites and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing.
  3.
    Abstract Transposable elements (TEs) are ubiquitous DNA segments capable of moving from one site to another within host genomes. The extant distributions of TEs in eukaryotic genomes have been shaped by both bona fide TE integration preferences in eukaryotic genomes and by selection following integration. Here, we compare TE target site distribution in host genomes using multiple de novo transposon insertion datasets in both plants and animals and compare them in the context of genome-wide transcriptional landscapes. We showcase two distinct types of transcription-associated TE targeting strategies that suggest a process of convergent evolution among eukaryotic TE families. The integration of two precision-targeting elements is specifically associated with initiation of RNA Polymerase II transcription of highly expressed genes, suggesting the existence of novel mechanisms of precision TE targeting in addition to passive targeting of open chromatin. We also highlight two features that can facilitate TE survival and rapid proliferation: tissue-specific transposition and minimization of negative impacts on nearby gene function due to precision targeting.
  4. Andrews, B J (Ed.)
    Abstract Intact transposable elements (TEs) account for 65% of the maize genome and can impact gene function and regulation. Although TEs comprise the majority of the maize genome and affect important phenotypes, genome-wide patterns of TE polymorphisms in maize have only been studied in a handful of maize genotypes, due to the challenging nature of assessing highly repetitive sequences. We implemented a method to use short-read sequencing data from 509 diverse inbred lines to classify the presence/absence of 445,418 nonredundant TEs that were previously annotated in four genome assemblies including B73, Mo17, PH207, and W22. Different orders of TEs (i.e., LTRs, Helitrons, and TIRs) had different frequency distributions within the population. LTRs with lower LTR similarity were generally more frequent in the population than LTRs with higher LTR similarity, though high-frequency insertions with very high LTR similarity were observed. LTR similarity and frequency estimates of nested elements and the outer elements in which they insert revealed that most nesting events occurred very near the timing of the outer element insertion. TEs within genes were at higher frequency than those that were outside of genes, and this is particularly true for those not inserted into introns. Many TE insertional polymorphisms observed in this population were tagged by SNP markers. However, 19.9% of the TE polymorphisms were not well tagged by SNPs (R² < 0.5) and potentially represent information that has not been well captured in previous SNP-based marker-trait association studies. This study provides a population-scale, genome-wide assessment of TE variation in maize and valuable insight into that variation and the factors that contribute to it.
  5. Genomes of all characterized higher eukaryotes harbor examples of transposable element (TE) bursts, the rapid amplification of TE copies throughout a genome. Despite their prevalence, understanding how bursts diversify genomes requires the characterization of actively transposing TEs before insertion sites and structural rearrangements have been obscured by selection acting over evolutionary time. In this study, rice recombinant inbred lines (RILs), generated by crossing a bursting accession and the reference Nipponbare accession, were exploited to characterize the spread of the very active Ping/mPing family through a small population and the resulting impact on genome diversity. Comparative sequence analysis of 272 individuals led to the identification of over 14,000 new insertions of the mPing miniature inverted-repeat transposable element (MITE), with no evidence for silencing of the transposase-encoding Ping element. In addition to new insertions, Ping-encoded transposase was found to preferentially catalyze the excision of mPing loci tightly linked to a second mPing insertion. Similarly, structural variations, including deletion of rice exons or regulatory regions, were enriched for those with break points at one or both ends of linked mPing elements. Taken together, these results indicate that structural variations are generated during a TE burst as transposase catalyzes both the high copy numbers needed to distribute linked elements throughout the genome and the DNA cuts at the TE ends known to dramatically increase the frequency of recombination.