

Title: Controlling for Variable Transposition Rate with an Age-Adjusted Site Frequency Spectrum
Abstract Recognition of the important role of transposable elements (TEs) in eukaryotic genomes quickly led to a burgeoning literature modeling and estimating the effects of selection on TEs. Much of the empirical work on selection has focused on analyzing the site frequency spectrum (SFS) of TEs. But TE evolution differs from standard models in a number of ways that can impact the power and interpretation of the SFS. For example, rather than mutating under a clock-like model, transposition often occurs in bursts which can inflate particular frequency categories compared with expectations under a standard neutral model. If a TE burst has been recent, the excess of low-frequency polymorphisms can mimic the effect of purifying selection. Here, we investigate how transposition bursts affect the frequency distribution of TEs and the correlation between age and allele frequency. Using information on the TE age distribution, we propose an age-adjusted SFS to compare TEs and neutral polymorphisms to more effectively evaluate whether TEs are under selective constraints. We show that our approach can minimize instances of false inference of selective constraint, remains robust to simple demographic changes, and allows for a correct identification of even weak selection affecting TEs which experienced a transposition burst. The results presented here will help researchers working on TEs to more reliably identify the effects of selection on TEs without having to rely on the assumption of a constant transposition rate.
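To make the idea in the abstract concrete, the following is a minimal sketch of how an SFS can be tabulated and then stratified by insertion age, so that TEs and neutral polymorphisms are compared only within the same age class. It is not the authors' procedure: the function names, the age-bin edges, and the toy data are all assumptions made for illustration, and the published age-adjusted SFS may be constructed differently.

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n):
    """Unfolded SFS for a sample of n chromosomes.

    Entry i-1 is the number of sites whose derived allele (or TE insertion)
    is carried by exactly i chromosomes, for 1 <= i <= n-1. Sites that are
    absent (0) or fixed (n) in the sample are ignored.
    """
    tally = Counter(derived_counts)
    return [tally.get(i, 0) for i in range(1, n)]


def age_stratified_sfs(derived_counts, ages, age_edges, n):
    """One SFS per age bin (hypothetical stratification, not the paper's method).

    Bin k holds sites with age_edges[k] <= age < age_edges[k+1].
    Sites whose age falls outside all bins are dropped.
    """
    n_bins = len(age_edges) - 1
    binned = [[] for _ in range(n_bins)]
    for count, age in zip(derived_counts, ages):
        for k in range(n_bins):
            if age_edges[k] <= age < age_edges[k + 1]:
                binned[k].append(count)
                break
    return [site_frequency_spectrum(b, n) for b in binned]


def mean_frequency(sfs, n):
    """Mean derived-allele frequency implied by an SFS, or None if it is empty."""
    total = sum(sfs)
    if total == 0:
        return None
    return sum((i + 1) * c for i, c in enumerate(sfs)) / (n * total)


if __name__ == "__main__":
    n = 10  # sampled chromosomes (toy value)
    # Toy data: a recent TE burst produces many young, rare insertions.
    te_counts = [1, 1, 1, 2, 1, 5, 7]
    te_ages = [0.1, 0.2, 0.1, 0.3, 0.2, 1.5, 1.8]
    snp_counts = [1, 2, 5, 6, 8, 3]
    snp_ages = [0.2, 0.4, 1.2, 1.6, 1.9, 0.9]
    edges = [0.0, 1.0, 2.0]  # arbitrary age units and bin edges

    te_by_age = age_stratified_sfs(te_counts, te_ages, edges, n)
    snp_by_age = age_stratified_sfs(snp_counts, snp_ages, edges, n)
    for k, (te, snp) in enumerate(zip(te_by_age, snp_by_age)):
        print(f"age bin {k}: TE mean freq = {mean_frequency(te, n)}, "
              f"SNP mean freq = {mean_frequency(snp, n)}")
```

Comparing within age bins rather than across the pooled data is the design point here: a pooled TE spectrum skewed toward rare variants could reflect either purifying selection or simply a young age distribution, whereas a within-bin comparison holds age roughly constant.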
Award ID(s):
1907343 1934384
NSF-PAR ID:
10321425
Author(s) / Creator(s):
; ; ;
Editor(s):
Betran, Esther
Date Published:
Journal Name:
Genome Biology and Evolution
Volume:
14
Issue:
2
ISSN:
1759-6653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Transposable elements (TEs) are mobile genetic parasites that frequently invade new host genomes through horizontal transfer. Invading TEs often exhibit a burst of transposition, followed by reduced transposition rates as repression evolves in the host. We recreated the horizontal transfer of P-element DNA transposons into a Drosophila melanogaster host and followed the expansion of TE copies and evolution of host repression in replicate laboratory populations reared at different temperatures. We observed that while populations maintained at high temperatures rapidly go extinct after TE invasion, those maintained at lower temperatures persist, allowing for TE spread and the evolution of host repression. We also surprisingly discovered that invaded populations experienced recurrent insertion of P-elements into a specific long non-coding RNA, lncRNA:CR43651, and that these insertion alleles are segregating at unusually high frequency in experimental populations, indicative of positive selection. We propose that, in addition to driving the evolution of repression, transpositional bursts of invading TEs can drive molecular adaptation.

     
  2. Background Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive and deep sequencing technology has made it affordable to apply population genetic methods to whole genomes with methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE polymorphisms, particularly transposition events or non-reference insertion sites, can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools. Methods We have developed the tool RelocaTE2 for identification of TE insertion sites at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision. Results and Discussion The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a high level of sensitivity and specificity, particularly when the sequence coverage is not shallow. In comparison to other tools tested, RelocaTE2 achieves the best balance between sensitivity and specificity. In particular, RelocaTE2 performs best in prediction of TSDs for TE insertions. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with a false positive rate below 0.1% using 10-fold genome coverage resequencing data. RelocaTE2 provides a robust solution to identify TE insertion sites and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing.
  3. Suh, Alexander ; Chapman, Tracey (Ed.)
    Abstract It is unclear how mobile DNA sequences (transposable elements, hereafter TEs) invade eukaryotic genomes and reach stable copy numbers, as transposition can decrease host fitness. This challenge is particularly stark early in the invasion of a TE family at which point hosts may lack the specialized machinery to repress the spread of these TEs. One possibility (in addition to the evolution of host regulation of TEs) is that TE families may evolve to preferentially insert into chromosomal regions that are less likely to impact host fitness. This may allow the mean TE copy number to grow while minimizing the risk for host population extinction. To test this, we constructed simulations to explore how the transposition probability and insertion preference of a TE family influence the evolution of mean TE copy number and host population size, allowing for extinction. We find that the effect of a TE family’s insertion preference depends on a host’s ability to regulate this TE family. Without host repression, a neutral insertion preference increases the frequency of and decreases the time to population extinction. With host repression, a preference for neutral insertions minimizes the cumulative deleterious load, increases population fitness, and, ultimately, avoids triggering an extinction vortex. 
  4. Andrews, B J (Ed.)
    Abstract Intact transposable elements (TEs) account for 65% of the maize genome and can impact gene function and regulation. Although TEs comprise the majority of the maize genome and affect important phenotypes, genome-wide patterns of TE polymorphisms in maize have only been studied in a handful of maize genotypes, due to the challenging nature of assessing highly repetitive sequences. We implemented a method to use short-read sequencing data from 509 diverse inbred lines to classify the presence/absence of 445,418 nonredundant TEs that were previously annotated in four genome assemblies including B73, Mo17, PH207, and W22. Different orders of TEs (i.e., LTRs, Helitrons, and TIRs) had different frequency distributions within the population. LTRs with lower LTR similarity were generally more frequent in the population than LTRs with higher LTR similarity, though high-frequency insertions with very high LTR similarity were observed. LTR similarity and frequency estimates of nested elements and the outer elements in which they insert revealed that most nesting events occurred very near the timing of the outer element insertion. TEs within genes were at higher frequency than those outside of genes, and this was particularly true for those not inserted into introns. Many TE insertional polymorphisms observed in this population were tagged by SNP markers. However, 19.9% of the TE polymorphisms were not well tagged by SNPs (R² < 0.5) and potentially represent information that has not been well captured in previous SNP-based marker-trait association studies. This study provides a population-scale, genome-wide assessment of TE variation in maize and valuable insight into the factors that contribute to this variation.
  5. Abstract Transposable elements (TEs) are ubiquitous DNA segments capable of moving from one site to another within host genomes. The extant distributions of TEs in eukaryotic genomes have been shaped by both bona fide TE integration preferences in eukaryotic genomes and by selection following integration. Here, we compare TE target site distribution in host genomes using multiple de novo transposon insertion datasets in both plants and animals and compare them in the context of genome-wide transcriptional landscapes. We showcase two distinct types of transcription-associated TE targeting strategies that suggest a process of convergent evolution among eukaryotic TE families. The integration of two precision-targeting elements is specifically associated with initiation of RNA Polymerase II transcription of highly expressed genes, suggesting the existence of novel mechanisms of precision TE targeting in addition to passive targeting of open chromatin. We also highlight two features that can facilitate TE survival and rapid proliferation: tissue-specific transposition and minimization of negative impacts on nearby gene function due to precision targeting.