Title: A bioinformatic pipeline for identifying informative SNP panels for parentage assignment from RAD seq data
Abstract The development of high‐throughput sequencing technologies is dramatically increasing the use of single nucleotide polymorphisms (SNPs) across the field of genetics, but most parentage studies of wild populations still rely on microsatellites. We developed a bioinformatic pipeline for identifying SNP panels that are informative for parentage analysis from restriction site‐associated DNA sequencing (RADseq) data. This pipeline includes options for analysis with or without a reference genome, and provides methods to maximize genotyping accuracy and select sets of unlinked loci that have high statistical power. We test this pipeline on small populations of Mexican gray wolf and bighorn sheep, for which parentage analyses are expected to be challenging due to low genetic diversity and the presence of many closely related individuals. We compare the results of parentage analysis across SNP panels generated with or without the use of a reference genome, and between SNPs and microsatellites. For Mexican gray wolf, we conducted parentage analyses for 30 pups from a single cohort where samples were available from 64% of possible mothers and 53% of possible fathers, and the accuracy of parentage assignments could be estimated because true identities of parents were known a priori based on field data. For bighorn sheep, we conducted maternity analyses for 39 lambs from five cohorts where 77% of possible mothers were sampled, but true identities of parents were unknown. Analyses with and without a reference genome produced SNP panels with ≥95% parentage assignment accuracy for Mexican gray wolf, outperforming microsatellites at 78% accuracy. Maternity assignments were completely consistent across all SNP panels for the bighorn sheep, and were 74.4% consistent with assignments from microsatellites. Accuracy and consistency of parentage analysis were not reduced when using as few as 284 SNPs for Mexican gray wolf and 142 SNPs for bighorn sheep, indicating our pipeline can be used to develop SNP genotyping assays for parentage analysis with relatively small numbers of loci.
Award ID(s):
1716698
PAR ID:
10063570
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Molecular Ecology Resources
Volume:
18
Issue:
6
ISSN:
1755-098X
Page Range / eLocation ID:
p. 1263-1281
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Single‐nucleotide polymorphisms (SNPs) are preferred over microsatellite markers in many evolutionary studies, but have only recently been applied to studies of parentage. Evaluations of SNPs and microsatellites for assigning parentage have mostly focused on special cases that require a relatively large number of heterozygous loci, such as species with low genetic diversity or with complex social structures. We developed 120 SNP markers from a transcriptome assembled using RNA‐sequencing of a songbird with the most common avian mating system—social monogamy. We compared the effectiveness of 97 novel SNPs and six previously described microsatellites for assigning paternity in the black‐throated blue warbler, Setophaga caerulescens. We show that the full panel of 97 SNPs (mean Ho = 0.19) was as powerful for assigning paternity as the panel of multiallelic microsatellites (mean Ho = 0.86). Paternity assignments using the two marker types were in agreement for 92% of the offspring. Filtering individual samples by a 50% call rate and SNPs by a 75% call rate maximized the number of offspring assigned with 95% confidence using SNPs. We also found that the 40 most heterozygous SNPs (mean Ho = 0.37) had similar power to assign paternity as the full panel of 97 SNPs. These findings demonstrate that a relatively small number of variable SNPs can be effective for parentage analyses in a socially monogamous species. We suggest that the development of SNP markers is advantageous for studies that require high‐throughput genotyping or that plan to address a range of ecological and evolutionary questions.
  2. Abstract Pleistocene diversity was much higher than today, for example there were three distinct wolf morphotypes (dire, gray, Beringian) in North America versus one today (gray). Previous fossil evidence suggested that these three groups overlapped ecologically, but split the landscape geographically. The Natural Trap Cave (NTC) fossil site in Wyoming, USA is an ideally placed late Pleistocene site to study the geographical movement of species from northern to middle North America before, during, and after the last glacial maximum. Until now, it has been unclear what type of wolf was present at NTC. We analyzed morphometrics of three wolf groups (dire, extant North American gray, Alaskan Beringian) to determine which wolves were present at NTC and what this indicates about wolf diversity and migration in Pleistocene North America. Results show NTC wolves group with Alaskan Beringian wolves. This provides the first morphological evidence for Beringian wolves in mid‐continental North America. Their location at NTC and their radiocarbon ages suggest that they followed a temporary channel through the glaciers. Results suggest high levels of competition and diversity in Pleistocene North American wolves. The presence of mid‐continental Beringian morphotypes adds important data for untangling the history of immigration and evolution of Canis in North America.
  3. Abstract Exome capture is an effective tool for surveying the genome for loci under selection. However, traditional methods require annotated genomic resources. Here, we present a method for creating cDNA probes from expressed mRNA, which are then used to enrich and capture genomic DNA for exon regions. This approach, called “EecSeq,” eliminates the need for costly probe design and synthesis. We tested EecSeq in the eastern oyster, Crassostrea virginica, using a controlled exposure experiment. Four adult oysters were heat shocked at 36°C for 1 hr along with four control oysters kept at 14°C. Stranded mRNA libraries were prepared for two individuals from each treatment and pooled. Half of the combined library was used for probe synthesis, and half was sequenced to evaluate capture efficiency. Genomic DNA was extracted from all individuals, enriched via captured probes, and sequenced directly. We found that EecSeq had an average capture sensitivity of 86.8% across all known exons and had over 99.4% sensitivity for exons with detectable levels of expression in the mRNA library. For all mapped reads, over 47.9% mapped to exons and 37.0% mapped to expressed targets, which is similar to previously published exon capture studies. EecSeq displayed relatively even coverage within exons (i.e., minor “edge effects”) and even coverage across exon GC content. We discovered 5,951 SNPs with a minimum average coverage of 80×, with 3,508 SNPs appearing in exonic regions. We show that EecSeq provides comparable, if not superior, specificity and capture efficiency compared to costly, traditional methods.
  4. Abstract Next‐generation sequencing technologies now allow researchers of non‐model systems to perform genome‐based studies without the requirement of a (often unavailable) closely related genomic reference. We evaluated the role of restriction endonuclease (RE) selection in double‐digest restriction‐site‐associated DNA sequencing (ddRADseq) by generating reduced representation genome‐wide data using four different RE combinations. Our expectation was that RE selections targeting longer, more complex restriction sites would recover fewer loci than RE with shorter, less complex sites. We sequenced a diverse sample of non‐model arachnids, including five congeneric pairs of harvestmen (Opiliones) and four pairs of spiders (Araneae). Sample pairs consisted of either conspecifics or closely related congeneric taxa, and in total 26 sample pair analyses were tested. Sequence demultiplexing, read clustering and variant calling were performed in the pyRAD program. The 6‐base pair cutter EcoRI combined with methylated site‐specific 4‐base pair cutter MspI produced, on average, the greatest numbers of intra‐individual loci and shared loci per sample pair. As expected, the number of shared loci recovered for a sample pair covaried with the degree of genetic divergence, estimated with cytochrome oxidase I sequences, although this relationship was non‐linear. Our comparative results will prove useful in guiding protocol selection for ddRADseq experiments on many arachnid taxa where reference genomes, even from closely related species, are unavailable.
  5. Abstract NASA's Genesis Mission returned solar wind (SW) to the Earth for analysis to derive the composition of the solar photosphere from solar material. SW analyses control the precision of the derived solar compositions, but their ultimate accuracy is limited by the theoretical or empirical models of fractionation due to SW formation. Mg isotopes are “ground truth” for these models since, except for CAIs, planetary materials have a uniform Mg isotopic composition (within ≤1‰) so any significant isotopic fractionation of SW Mg is primarily that of SW formation and subsequent acceleration through the corona. This study analyzed Mg isotopes in a bulk SW diamond‐like carbon (DLC) film on silicon collector returned by the Genesis Mission. A novel data reduction technique was required to account for variable ion yield and instrumental mass fractionation (IMF) in the DLC. The resulting SW Mg fractionation relative to the DSM‐3 laboratory standard was (−14.4‰, −30.2‰) ± (4.1‰, 5.5‰), where the uncertainty is 2σSE of the data combined with a 2.5‰ (total) error in the IMF determination. Two of the SW fractionation models considered generally agreed with our data. Their possible ramifications are discussed for O isotopes based on the CAI nebular composition of McKeegan et al. (2011).