Summary From a single transgenic line harboring five Tnt1 transposon insertions, we generated a near‐saturated insertion population in Medicago truncatula. Using thermal asymmetric interlaced‐polymerase chain reaction followed by sequencing, we recovered 388 888 flanking sequence tags (FSTs) from 21 741 insertion lines in this population. FST recovery from 14 Tnt1 lines using the whole‐genome sequencing (WGS) and/or Tnt1‐capture sequencing approaches suggests an average of 80 insertions per line, which is more than the previous estimation of 25 insertions. Analysis of the distribution pattern and preference of Tnt1 insertions showed that Tnt1 is overall randomly distributed throughout the M. truncatula genome. At the chromosomal level, Tnt1 insertions occurred on both arms of all chromosomes, with insertion frequency negatively correlated with the GC content. Based on 174 546 filtered FSTs that show exact insertion locations in the M. truncatula genome version 4.0 (Mt4.0), 0.44 Tnt1 insertions occurred per kb, and 19 583 genes contained Tnt1 with an average of 3.43 insertions per gene. Pathway and gene ontology analyses revealed that Tnt1‐inserted genes are significantly enriched in processes associated with ‘stress’, ‘transport’, ‘signaling’ and ‘stimulus response’. Surprisingly, gene groups with higher methylation frequency were more frequently targeted for insertion. Analysis of 19 583 Tnt1‐inserted genes revealed that 59% (1265) of 2144 transcription factors, 63% (765) of 1216 receptor kinases and 56% (343) of 616 nucleotide‐binding site‐leucine‐rich repeat genes harbored at least one Tnt1 insertion, compared with the overall 38% of Tnt1‐inserted genes out of 50 894 annotated genes in the genome.
Leveraging whole‐genome sequencing to estimate telomere length in plants
Abstract Changes in telomere length are increasingly used to indicate species' response to environmental stress across diverse taxa. Despite this broad use, few studies have explored telomere length in plants. Thus, evaluation of new approaches for measuring telomeres in plants is needed. Rapid advances in sequencing approaches and bioinformatic tools now allow estimation of telomere content from whole‐genome sequencing (WGS) data, a proxy for telomere length. While telomere content has been quantified extensively using quantitative polymerase chain reaction (qPCR) and WGS in humans, no study to date has compared the effectiveness of WGS in estimating telomere length in plants relative to qPCR approaches. In this study, we use 100 Populus clones re‐sequenced using short‐read Illumina sequencing to quantify telomere length, comparing three different bioinformatic approaches (Computel, K‐seek and TRIP) in addition to qPCR. Overall, telomere length estimates varied across different bioinformatic approaches, but were highly correlated across methods for individual genotypes. A positive correlation was observed between WGS estimates and qPCR; however, Computel estimates exhibited the greatest correlation. Computel incorporates genome coverage into telomere length calculations, suggesting that genome coverage is likely important to telomere length quantification when using WGS data. Overall, telomere estimates from WGS provided greater precision and accuracy of telomere length estimates relative to qPCR. The findings suggest WGS is a promising approach for assessing telomere length and, as the field of telomere ecology evolves, may provide added value to assaying response to biotic and abiotic environments for plants needed to accelerate plant breeding and conservation management.
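As a rough illustration of why genome coverage matters when telomere content is read off WGS data, the sketch below counts bases in reads that carry a run of telomeric repeats and scales that count by coverage and by the number of chromosome ends. It is a simplified sketch, not the algorithm used by Computel, K‐seek or TRIP. The motif `TTTAGGG` (the Arabidopsis-type plant repeat), the threshold of four consecutive repeats, and all function and parameter names are assumptions introduced here for illustration.

```python
# Simplified sketch: coverage-normalised telomere length from short reads.
# Assumptions (not from the abstract): motif TTTAGGG, a read counts as
# telomeric if it holds >= min_repeats consecutive copies on either strand.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_telomeric(read, motif="TTTAGGG", min_repeats=4):
    """True if the read contains min_repeats tandem copies of the motif
    (or of its reverse complement)."""
    read = read.upper()
    forward = motif * min_repeats
    reverse = reverse_complement(motif) * min_repeats
    return forward in read or reverse in read


def mean_telomere_length(reads, genome_coverage, n_chrom_ends,
                         motif="TTTAGGG", min_repeats=4):
    """Telomeric bases divided by (coverage x chromosome ends).

    genome_coverage: average sequencing depth of the sample.
    n_chrom_ends: chromosome ends per haploid genome (2 x haploid
    chromosome number); supplied by the caller.
    """
    telomeric_bases = sum(
        len(read) for read in reads if is_telomeric(read, motif, min_repeats)
    )
    return telomeric_bases / (genome_coverage * n_chrom_ends)
```

Dividing by `genome_coverage` is the step that keeps a deeply sequenced sample from looking as if it has longer telomeres than a shallowly sequenced one; tools that report raw repeat counts would need an equivalent normalisation before samples are compared.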
- Award ID(s): 1856450
- PAR ID: 10486111
- Publisher / Repository: Wiley-Blackwell
- Date Published:
- Journal Name: Molecular Ecology Resources
- Volume: 24
- Issue: 2
- ISSN: 1755-098X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The diversity of avian visual phenotypes provides a framework for studying mechanisms of trait diversification generally, and the evolution of vertebrate vision, specifically. Previous research has focused on opsins, but to fully understand visual adaptation, we must study the complete phototransduction cascade (PTC). Here, we developed a probe set that captures exonic regions of 46 genes representing the PTC and other light responses. For a subset of species, we directly compared gene capture between our probe set and low-coverage whole genome sequencing (WGS), and we discuss considerations for choosing between these methods. Finally, we developed a unique strategy to avoid chimeric assembly by using “decoy” reference sequences. We successfully captured an average of 64% of our targeted exome in 46 species across 14 orders using the probe set and had similar recovery using the WGS data. Compared to WGS or transcriptomes, our probe set: (1) reduces sequencing requirements by efficiently capturing vision genes, (2) employs a simpler bioinformatic pipeline by limiting required assembly and negating annotation, and (3) eliminates the need for fresh tissues, enabling researchers to leverage existing museum collections. We then utilized our vision exome data to identify positively selected genes in two evolutionary scenarios: evolution of night vision in nocturnal birds and evolution of high-speed vision specific to manakins (Pipridae). We found parallel positive selection of SLC24A1 in both scenarios, implicating the alteration of rod response kinetics, which could improve color discrimination in dim light conditions and/or facilitate higher temporal resolution.
-
Abstract Low‐coverage whole‐genome sequencing (WGS) is increasingly used for the study of evolution and ecology in both model and non‐model organisms; however, effective application of low‐coverage WGS data requires the implementation of probabilistic frameworks to account for the uncertainties in genotype likelihoods. Here, we present a probabilistic framework for using genotype likelihoods for standard population assignment applications. Additionally, we derive the Fisher information for allele frequency from genotype likelihoods and use that to describe a novel metric, the effective sample size, which figures heavily in assignment accuracy. We make these developments available for application through WGSassign, an open‐source software package that is computationally efficient for working with whole‐genome data. Using simulated and empirical data sets, we demonstrate the behaviour of our assignment method across a range of population structures, sample sizes and read depths. Through these results, we show that WGSassign can provide highly accurate assignment, even for samples with low average read depths (<0.01X) and among weakly differentiated populations. Our simulation results highlight the importance of equalizing the effective sample sizes among source populations in order to achieve accurate population assignment with low‐coverage WGS data. We further provide study design recommendations for population assignment studies and discuss the broad utility of effective sample size for studies using low‐coverage WGS data.
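The idea of assigning an individual to a source population directly from genotype likelihoods can be sketched as follows. This is a minimal illustration under stated assumptions, not WGSassign's implementation: it assumes biallelic loci, Hardy-Weinberg genotype proportions within each source population, independent loci, and known population allele frequencies. The function names, the clamping constant `eps`, and the data layout are hypothetical choices made for this example, and the effective sample size metric described in the abstract is not computed here.

```python
import math


def hwe_genotype_probs(p):
    """Hardy-Weinberg probabilities of carrying 0, 1 or 2 copies of an
    allele whose population frequency is p."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)


def assignment_loglik(genotype_likelihoods, allele_freqs, eps=1e-6):
    """Log-likelihood that an individual comes from one population.

    genotype_likelihoods: one (L0, L1, L2) tuple per locus.
    allele_freqs: the population's allele frequency at each locus.
    Frequencies are clamped to [eps, 1 - eps] so that fixed alleles
    cannot produce log(0); eps is an illustrative choice.
    """
    total = 0.0
    for gl, p in zip(genotype_likelihoods, allele_freqs):
        p = min(max(p, eps), 1.0 - eps)
        priors = hwe_genotype_probs(p)
        # Marginalise over the unknown genotype at this locus.
        total += math.log(sum(l * pr for l, pr in zip(gl, priors)))
    return total


def assign(genotype_likelihoods, pop_freqs):
    """Return (best population, per-population log-likelihoods)."""
    scores = {
        pop: assignment_loglik(genotype_likelihoods, freqs)
        for pop, freqs in pop_freqs.items()
    }
    return max(scores, key=scores.get), scores
```

Summing over the three possible genotypes at each locus, rather than calling a single genotype first, is what lets the uncertainty of very low read depths carry through to the assignment score.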
-
Abstract Museum specimens collected prior to cryogenic tissue storage are increasingly being used as genetic resources, and though high‐throughput sequencing is becoming more cost‐efficient, whole genome sequencing (WGS) of historical DNA (hDNA) remains inefficient and costly due to its short fragment sizes and high loads of exogenous DNA, among other factors. It is also unclear how sequencing efficiency is influenced by DNA sources. We aimed to identify the most efficient method and DNA source for collecting WGS data from avian museum specimens. We analyzed low‐coverage WGS from 60 DNA libraries prepared from four American Robin (Turdus migratorius) and four Abyssinian Thrush (Turdus abyssinicus) specimens collected in the 1920s. We compared DNA source (toepad versus incision‐line skin clip) and three library preparation methods: (1) double‐stranded DNA (dsDNA), single tube (KAPA); (2) single‐stranded DNA (ssDNA), multi‐tube (IDT); and (3) ssDNA, single tube (Claret Bioscience). We found that the ssDNA, multi‐tube method resulted in significantly greater endogenous DNA content, average read length, and sequencing efficiency than the other tested methods. We also tested whether a predigestion step reduced exogenous DNA in libraries from one specimen per species and found promising results that warrant further study. The ~10% increase in average sequencing efficiency of the best‐performing method over a commonly implemented dsDNA library preparation method has the potential to significantly increase WGS coverage of hDNA from bird specimens. Future work should evaluate the threshold for specimen age at which these results hold and how the combination of library preparation method and DNA source influences WGS in other taxa.
-
ABSTRACT Metagenomics is a powerful tool for characterising viruses, with broad applications across diverse disciplines, from understanding the ecology and evolutionary history of viruses to identifying causative agents of emerging outbreaks with unknown aetiology. Additionally, metagenomic data contains valuable information about the amount of virus present within samples. However, we have yet to leverage metagenomics to assess viral load, which is a key epidemiological parameter. To effectively use sequencing outputs to inform transmission, we need to understand the relationship between read depth and viral load across a diverse set of viruses. Here, using target enrichment sequencing, we investigated the detection and recovery of virus genomes by spiking known concentrations of DNA and RNA viruses into wild rodent faecal samples. In total, 15 experimental replicates were sequenced with target enrichment sequencing and compared to shotgun sequencing of the same background samples. Target enriched sequencing recovered all spike-in viruses at every concentration (10^2, 10^3, and 10^5 ± 1 log genome copies) and showed a log-linear relationship between spike-in concentration and mean read depth. Background viruses (including Kobuvirus and Cardiovirus) were recovered consistently across all biological and technical replicates, but genome coverage was variable between virus genera and likely reflected the composition of target enrichment probe panel. Overall, our study highlights the strengths and weaknesses of using commercially available panels to quantify and characterise wildlife viromes, and underscores the importance of probe panel design for accurately interpreting coverage and read depth. To advance the use of metagenomics for understanding virus transmission, further research will be needed to elucidate how sequencing strategy (e.g. library depth, pooling), virome composition, and probe design influence viral read counts and genome coverage.
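A log-linear relationship between spike-in concentration and mean read depth can be summarised with an ordinary least-squares fit on log-transformed values, which in turn gives a rough calibration from read depth back to viral load. The sketch below assumes both variables are placed on a log10 scale, which is one reading of "log-linear" and may not match the authors' model. The function names are hypothetical, and the example values in the usage note are made up for illustration; they are not data from the study.

```python
import math


def fit_log_linear(concentrations, read_depths):
    """Least-squares fit of log10(read depth) on log10(concentration).

    Returns (slope, intercept). All inputs must be positive.
    """
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(d) for d in read_depths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_concentration(read_depth, slope, intercept):
    """Invert the fitted line to estimate concentration from read depth."""
    return 10 ** ((math.log10(read_depth) - intercept) / slope)
```

For instance, calling `fit_log_linear([1e2, 1e3, 1e5], [10, 100, 10000])` on these invented numbers yields a slope near 1 and an intercept near -1, and `predict_concentration` then maps a new read depth onto the same scale. Any such calibration would only hold for the probe panel and sample background it was fitted on, in line with the abstract's caution about probe design.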
