Title: Strategies for reducing per‐sample costs in target capture sequencing for phylogenomics and population genomics in plants
The reduced cost of high‐throughput sequencing and the development of gene sets with wide phylogenetic applicability has led to the rise of sequence capture methods as a plausible platform for both phylogenomics and population genomics in plants. An important consideration in large targeted sequencing projects is the per‐sample cost, which can be inflated when using off‐the‐shelf kits or reagents not purchased in bulk. Here, we discuss methods to reduce per‐sample costs in high‐throughput targeted sequencing projects. We review the minimal equipment and consumable requirements for targeted sequencing while comparing several alternatives to reduce bulk costs in DNA extraction, library preparation, target enrichment, and sequencing. We consider how each of the workflow alterations may be affected by DNA quality (e.g., fresh vs. herbarium tissue), genome size, and the phylogenetic scale of the project. We provide a cost calculator for researchers considering targeted sequencing to use when designing projects, and identify challenges for future development of low‐cost sequencing in non‐model plant systems.
Award ID(s):
1753800 1711391
PAR ID:
10455185
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Applications in Plant Sciences
Volume:
8
Issue:
4
ISSN:
2168-0450
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Premise: The ability to sequence genome‐scale data from herbarium specimens would allow for the economical development of data sets with broad taxonomic and geographic sampling that would otherwise not be possible. Here, we evaluate the utility of a basic double‐digest restriction site–associated DNA sequencing (ddRADseq) protocol using DNAs from four genera extracted from both silica‐dried and herbarium tissue. Methods: DNAs from Draba, Boechera, Solidago, and Ilex were processed with a ddRADseq protocol. The effects of DNA degradation, taxon, and specimen age were assessed. Results: Although taxon, preservation method, and specimen age affected data recovery, large phylogenetically informative data sets were obtained from the majority of samples. Discussion: These results suggest that herbarium samples can be incorporated into ddRADseq project designs, and that specimen age can be used as a rapid on‐site guide for sample choice. The detailed protocol we provide will allow users to pursue herbarium‐based ddRADseq projects that minimize the expenses associated with fieldwork and sample evaluation.
  2. Abstract Next‐generation sequencing technologies now allow researchers of non‐model systems to perform genome‐based studies without the requirement of a (often unavailable) closely related genomic reference. We evaluated the role of restriction endonuclease (RE) selection in double‐digest restriction‐site‐associated DNA sequencing (ddRADseq) by generating reduced representation genome‐wide data using four different RE combinations. Our expectation was that RE selections targeting longer, more complex restriction sites would recover fewer loci than RE with shorter, less complex sites. We sequenced a diverse sample of non‐model arachnids, including five congeneric pairs of harvestmen (Opiliones) and four pairs of spiders (Araneae). Sample pairs consisted of either conspecifics or closely related congeneric taxa, and in total 26 sample pair analyses were tested. Sequence demultiplexing, read clustering and variant calling were performed in the pyRAD program. The 6‐base pair cutter EcoRI combined with methylated site‐specific 4‐base pair cutter MspI produced, on average, the greatest numbers of intra‐individual loci and shared loci per sample pair. As expected, the number of shared loci recovered for a sample pair covaried with the degree of genetic divergence, estimated with cytochrome oxidase I sequences, although this relationship was non‐linear. Our comparative results will prove useful in guiding protocol selection for ddRADseq experiments on many arachnid taxa where reference genomes, even from closely related species, are unavailable.
  3. Abstract The development of high‐throughput sequencing technologies is dramatically increasing the use of single nucleotide polymorphisms (SNPs) across the field of genetics, but most parentage studies of wild populations still rely on microsatellites. We developed a bioinformatic pipeline for identifying SNP panels that are informative for parentage analysis from restriction site‐associated DNA sequencing (RADseq) data. This pipeline includes options for analysis with or without a reference genome, and provides methods to maximize genotyping accuracy and select sets of unlinked loci that have high statistical power. We test this pipeline on small populations of Mexican gray wolf and bighorn sheep, for which parentage analyses are expected to be challenging due to low genetic diversity and the presence of many closely related individuals. We compare the results of parentage analysis across SNP panels generated with or without the use of a reference genome, and between SNPs and microsatellites. For Mexican gray wolf, we conducted parentage analyses for 30 pups from a single cohort where samples were available from 64% of possible mothers and 53% of possible fathers, and the accuracy of parentage assignments could be estimated because true identities of parents were known a priori based on field data. For bighorn sheep, we conducted maternity analyses for 39 lambs from five cohorts where 77% of possible mothers were sampled, but true identities of parents were unknown. Analyses with and without a reference genome produced SNP panels with ≥95% parentage assignment accuracy for Mexican gray wolf, outperforming microsatellites at 78% accuracy. Maternity assignments were completely consistent across all SNP panels for the bighorn sheep, and were 74.4% consistent with assignments from microsatellites. Accuracy and consistency of parentage analysis were not reduced when using as few as 284 SNPs for Mexican gray wolf and 142 SNPs for bighorn sheep, indicating our pipeline can be used to develop SNP genotyping assays for parentage analysis with relatively small numbers of loci.
  4. Abstract Members of the order Isochrysidales are unique among haptophyte lineages in being the exclusive producers of alkenones, long‐chain ketones that are commonly used for paleotemperature reconstructions. Alkenone‐producing haptophytes are divided into three major groups based largely on molecular ecological data: Group I is found in freshwater lakes, Group II commonly occurs in brackish and coastal marine environments, and Group III consists of open ocean species. Each group has distinct alkenone distributions; however, only Groups II and III Isochrysidales currently have cultured representatives. The uncultured Group I Isochrysidales are distinguished geochemically by the presence of tri‐unsaturated alkenone isomers (C37:3bMe, C38:3bEt, C38:3bMe, C39:3bEt) present in water column and sediment samples, yet their genetic diversity, morphology, and environmental controls are largely unknown. Using small‐subunit (SSU) ribosomal RNA (rRNA) marker gene amplicon high‐throughput sequencing of environmental water column and sediment samples, we show that Group I is monophyletic with high phylogenetic diversity and contains a well‐supported clade separating the previously described “EV” clade from the “Greenland” clade. We infer the first partial large‐subunit (LSU) rRNA gene Group I sequence phylogeny, which uncovered additional well‐supported clades embedded within Group I. Relative to Group II, Group I revealed higher levels of genetic diversity despite conservation of alkenone signatures and a closer evolutionary relationship with Group III. In Group I, the presence of the tri‐unsaturated alkenone isomers appears to be conserved, which is not the case for Group II. This suggests differing environmental influences on Group I and II and perhaps uncovers evolutionary constraints on alkenone biosynthesis.
  5. Abstract Single‐nucleotide polymorphisms (SNPs) are preferred over microsatellite markers in many evolutionary studies, but have only recently been applied to studies of parentage. Evaluations of SNPs and microsatellites for assigning parentage have mostly focused on special cases that require a relatively large number of heterozygous loci, such as species with low genetic diversity or with complex social structures. We developed 120 SNP markers from a transcriptome assembled using RNA‐sequencing of a songbird with the most common avian mating system—social monogamy. We compared the effectiveness of 97 novel SNPs and six previously described microsatellites for assigning paternity in the black‐throated blue warbler, Setophaga caerulescens. We show that the full panel of 97 SNPs (mean Ho = 0.19) was as powerful for assigning paternity as the panel of multiallelic microsatellites (mean Ho = 0.86). Paternity assignments using the two marker types were in agreement for 92% of the offspring. Filtering individual samples by a 50% call rate and SNPs by a 75% call rate maximized the number of offspring assigned with 95% confidence using SNPs. We also found that the 40 most heterozygous SNPs (mean Ho = 0.37) had similar power to assign paternity as the full panel of 97 SNPs. These findings demonstrate that a relatively small number of variable SNPs can be effective for parentage analyses in a socially monogamous species. We suggest that the development of SNP markers is advantageous for studies that require high‐throughput genotyping or that plan to address a range of ecological and evolutionary questions.