Exon markers have a long history of use in phylogenetics of ray‐finned fishes, the most diverse clade of vertebrates with more than 35,000 species. As the number of published genomes increases, it has become easier to test exons and other genetic markers for signals of ancient duplication events and filter out paralogues that can mislead phylogenetic analysis. We present seven new probe sets for current target‐capture phylogenomic protocols that capture 1,104 exons explicitly filtered for paralogues using gene trees. These seven probe sets span the diversity of teleost fishes, including four sets that target five hyperdiverse percomorph clades which together comprise ca. 17,000 species (Carangaria, Ovalentaria, Eupercaria, and Syngnatharia + Pelagiaria combined). We additionally included probes to capture legacy nuclear exons and mitochondrial markers that have been commonly used in fish phylogenetics (despite some exons being flagged for paralogues) to facilitate integration of old and new molecular phylogenetic matrices. We tested these probes experimentally for 56 fish species (eight species per probe set) and merged new exon‐capture sequence data into an existing data matrix of 1,104 exons and 300 ray‐finned fish species. We provide an optimized bioinformatics pipeline to assemble exon capture data from raw reads to alignments for downstream analysis.
Custom sequence capture experiments are becoming an efficient approach for gathering large sets of orthologous markers in nonmodel organisms. Transcriptome‐based exon capture utilizes transcript sequences to design capture probes, typically using a reference genome to identify intron–exon boundaries to exclude shorter exons (<200 bp). Here, we test directly using transcript sequences for probe design, which are often composed of multiple exons of varying lengths. Using 1260 orthologous transcripts, we conducted sequence captures across multiple phylogenetic scales for frogs, including outgroups ~100 Myr divergent from the ingroup. We recovered a large phylogenomic data set consisting of sequence alignments for 1047 of the 1260 transcriptome‐based loci (~561 000 bp) and a large quantity of highly variable regions flanking the exons in transcripts (~70 000 bp), the latter improving substantially by only including ingroup species (~797 000 bp). We recovered both shorter (<100 bp) and longer exons (>200 bp), with no major reduction in coverage towards the ends of exons. We observed significant differences in the performance of blocking oligos for target enrichment and nontarget depletion during captures, and differences in
- Publication Date:
- NSF-PAR ID: 10243953
- Journal Name: Molecular Ecology Resources
- Volume: 16
- Issue: 5
- Page Range or eLocation-ID: p. 1069-1083
- ISSN: 1755-098X
- Publisher: Wiley-Blackwell
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Aim The Lesser Sunda Islands are situated between the Sunda and Sahul Shelves, with a linear arrangement that has functioned as a two‐way filter for taxa dispersing between the Asian and Australo‐Papuan biogeographical realms. Distributional patterns of many terrestrial vertebrates suggest a stepping‐stone model of island colonization. Here we investigate the timing and sequence of island colonization in Asian‐origin fanged frogs from the volcanic Sunda Arc islands with the goal of testing the stepping‐stone model of island colonization.
Location The Indonesian islands of Java, Lombok, Sumbawa, Flores and Lembata.
Taxon Limnonectes dammermani and L. kadarsani (Family: Dicroglossidae)
Methods Mitochondrial DNA was sequenced from 153 frogs to identify major lineages and to select samples for an exon‐capture experiment. We designed probes to capture sequence data from 974 exonic loci (1,235,981 bp) from 48 frogs including the outgroup species, L. microdiscus. The resulting data were analysed using phylogenetic, population genetic and biogeographical model testing methods.
Results The mtDNA phylogeny finds L. kadarsani paraphyletic with respect to L. dammermani, with a pectinate topology consistent with the stepping‐stone model. Phylogenomic analyses of 974 exons recovered the two species as monophyletic sister taxa that diverged ~7.6 Ma with no detectable contemporary gene flow, suggesting introgression of the L. dammermani mitochondrion into L. kadarsani on Lombok resulting from an isolated ancient hybridization event ~4 Ma.
Main conclusions These results suggest that the currently accepted stepping‐stone model of island colonization might not best explain the current patterns of diversity in the archipelago. The high degree of genetic structure, large divergence times, and absent or low levels of migration between lineages suggest that L. kadarsani represents five distinct species.
-
Abstract Marker selection has emerged as an important component of phylogenomic study design due to rising concerns of the effects of gene tree estimation error, model misspecification, and data-type differences. Researchers must balance various trade-offs associated with locus length and evolutionary rate among other factors. The most commonly used reduced representation data sets for phylogenomics are ultraconserved elements (UCEs) and Anchored Hybrid Enrichment (AHE). Here, we introduce Rapidly Evolving Long Exon Capture (RELEC), a new set of loci that targets single exons that are both rapidly evolving (evolutionary rate faster than RAG1) and relatively long in length (>1,500 bp), while at the same time avoiding paralogy issues across amniotes. We compare the RELEC data set to UCEs and AHE in squamate reptiles by aligning and analyzing orthologous sequences from 17 squamate genomes, composed of 10 snakes and 7 lizards. The RELEC data set (179 loci) outperforms AHE and UCEs by maximizing per-locus genetic variation while maintaining presence and orthology across a range of evolutionary scales. RELEC markers show higher phylogenetic informativeness than UCE and AHE loci, and RELEC gene trees show greater similarity to the species tree than AHE or UCE gene trees. Furthermore, with fewer loci, RELEC remains computationally tractable
-
Abstract Phylogenomic analysis of large genome-wide sequence data sets can resolve phylogenetic tree topologies for large species groups, help test the accuracy of and improve resolution for earlier multi-locus studies and reveal the level of agreement or concordance within partitions of the genome for various tree topologies. Here we used a target-capture approach to sequence 1088 single-copy exons for more than 200 labrid fishes together with more than 100 outgroup taxa to generate a new data-rich phylogeny for the family Labridae. Our time-calibrated phylogenetic analysis of exon-capture data pushes the root node age of the family Labridae back into the Cretaceous, to about 79 Ma. The monotypic Centrogenys vaigiensis and the order Uranoscopiformes (stargazers) are identified as the sister lineages of Labridae. The phylogenetic relationships among major labrid subfamilies and within these clades were largely congruent with prior analyses of select mitochondrial and nuclear datasets. However, the position of the tribe Cirrhilabrini (fairy and flame wrasses) showed discordance, resolving either as the sister to a crown julidine clade or alternatively sister to a group formed by the labrines, cheilines and scarines. Exploration of this pattern using multiple approaches leads to slightly higher support for this latter hypothesis, highlighting
-
Abstract Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein‐coding and nearby 5′ and 3′ untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR‐based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease‐regulating functions (e.g. Ovar‐DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive