Title: Metagenomic clustering reveals microbial contamination as an essential consideration in ultraconserved element design for phylogenomics with insect museum specimens
Abstract Phylogenomics via ultraconserved elements (UCEs) has led to improved phylogenetic reconstructions across the tree of life. However, inadvertently incorporating non‐targeted DNA into the UCE marker design will lead to misinformation being incorporated into subsequent analyses. To date, the effectiveness of basic metagenomic filtering strategies has not been assessed in arthropods. Designing markers from museum specimens requires careful consideration of methods due to the high levels of microbial contamination typically found in such specimens. We investigate if contaminant sequences are carried forward into a UCE marker set we developed from insect museum specimens using a standard bioinformatics pipeline. We find that the methods currently employed by most researchers do not exclude contamination from the final set of targets. Lastly, we highlight several paths forward for reducing contamination in UCE marker design.
Award ID(s):
1856402
PAR ID:
10371073
Author(s) / Creator(s):
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Ecology and Evolution
Volume:
12
Issue:
3
ISSN:
2045-7758
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Marvaldi, Adriana (Ed.)
    Abstract Tailoring ultraconserved element (UCE) probe set design to focal taxa has been demonstrated to improve locus recovery and phylogenomic inference. However, beyond conducting expensive in vitro testing, it remains unclear how best to determine whether an existing UCE probe set is likely to suffice for phylogenomic inference or whether tailored probe design will be desirable. Here we investigate the utility of 8 different UCE probe sets for the in silico phylogenomic inference of scarabaeoid beetles. Probe sets tested differed in terms of (i) how phylogenetically distant from Scarabaeoidea taxa those used during probe design are, (ii) breadth of phylogenetic inference probe set was designed for, and (iii) method of probe design. As part of this study, 2 new UCE probe sets are produced for the beetle family Scarabaeidae and superfamily Hydrophiloidea. We confirm that probe set utility decreases with increasing phylogenetic distance from target taxa. In addition, narrowing the phylogenetic breadth of probe design decreases the phylogenetic capture range. We also confirm previous findings regarding ways to optimize UCE probe design. Finally, we make suggestions regarding assessment of need for de novo probe design. 
  2. Abstract Numerous genomic methods developed over the past two decades have enabled the discovery and extraction of orthologous loci to help resolve phylogenetic relationships across various taxa and scales. Genome skimming (or low‐coverage genome sequencing) is a promising method to not only extract high‐copy loci but also 100s to 1000s of phylogenetically informative nuclear loci (e.g., ultraconserved elements [UCEs] and exons) from contemporary and museum samples. The subphylum Anthozoa, including important ecosystem engineers (e.g., stony corals, black corals, anemones, and octocorals) in the marine environment, is in critical need of phylogenetic resolution and thus might benefit from a genome‐skimming approach. We conducted genome skimming on 242 anthozoan corals collected from 1886 to 2022. Using existing target‐capture baitsets, we bioinformatically obtained UCEs and exons from the genome‐skimming data and incorporated them with data from previously published target‐capture studies. The mean number of UCE and exon loci extracted from the genome skimming data was 1837 ± 662 SD for octocorals and 1379 ± 476 SD loci for hexacorals. Phylogenetic relationships were well resolved within each class. A mean of 1422 ± 720 loci was obtained from the historical specimens, with 1253 loci recovered from the oldest specimen collected in 1886. We also obtained partial to whole mitogenomes and nuclear rRNA genes from >95% of samples. Bioinformatically pulling UCEs, exons, mitochondrial genomes, and nuclear rRNA genes from genome skimming data is a viable and low‐cost option for phylogenetic studies. This approach can be used to review and support taxonomic revisions and reconstruct evolutionary histories, including historical museum and type specimens. 
  3. Abstract Next‐generation sequencing has greatly expanded the utility and value of museum collections by revealing specimens as genomic resources. As the field of museum genomics grows, so does the need for extraction methods that maximize DNA yields. For avian museum specimens, the established method of extracting DNA from toe pads works well for most specimens. However, for some specimens, especially those of birds that are very small or very large, toe pads can be a poor source of DNA. In this study, we apply two DNA extraction methods (phenol–chloroform and silica column) to three different sources of DNA (toe pad, skin punch and bone) from 10 historical avian museum specimens. We show that a modified phenol–chloroform protocol yielded significantly more DNA than a silica column protocol (e.g., Qiagen DNeasy Blood & Tissue Kit) across all tissue types. However, extractions using the silica column protocol contained longer fragments on average than those using the phenol–chloroform protocol, probably as a result of loss of small fragments through the silica column. While toe pads yielded more DNA than skin punches and bone fragments, skin punches proved to be a reliable alternative source of DNA and might be especially appealing when toe pad extractions are impractical. Overall, we found that historical bird museum specimens contain substantial amounts of DNA for genomic studies under most extraction scenarios, but that a phenol–chloroform protocol consistently provides the high quantities of DNA required for most current genomic protocols. 
  4. Abstract Marker selection has emerged as an important component of phylogenomic study design due to rising concerns of the effects of gene tree estimation error, model misspecification, and data-type differences. Researchers must balance various trade-offs associated with locus length and evolutionary rate among other factors. The most commonly used reduced representation data sets for phylogenomics are ultraconserved elements (UCEs) and Anchored Hybrid Enrichment (AHE). Here, we introduce Rapidly Evolving Long Exon Capture (RELEC), a new set of loci that targets single exons that are both rapidly evolving (evolutionary rate faster than RAG1) and relatively long in length (>1,500 bp), while at the same time avoiding paralogy issues across amniotes. We compare the RELEC data set to UCEs and AHE in squamate reptiles by aligning and analyzing orthologous sequences from 17 squamate genomes, composed of 10 snakes and 7 lizards. The RELEC data set (179 loci) outperforms AHE and UCEs by maximizing per-locus genetic variation while maintaining presence and orthology across a range of evolutionary scales. RELEC markers show higher phylogenetic informativeness than UCE and AHE loci, and RELEC gene trees show greater similarity to the species tree than AHE or UCE gene trees. Furthermore, with fewer loci, RELEC remains computationally tractable for full Bayesian coalescent species tree analyses. We contrast RELEC to and discuss important aspects of comparable methods, and demonstrate how RELEC may be the most effective set of loci for resolving difficult nodes and rapid radiations. We provide several resources for capturing or extracting RELEC loci from other amniote groups. 
  5. This exercise is intended to provide students with a real-world example of how museum specimens can provide additional context to challenges and considerations in the broader field of applied conservation genomics. After a brief introduction to the study system—the conservation status of Iberian lynx (Lynx pardinus)—students will be asked to review an open-access, peer-reviewed publication that features samples from varied museum, archaeological, and paleontological contexts. Through reading the guide and discussion questions, students will reflect upon study design, concepts, and challenges presented at the intersection between the fields of conservation genomics and museum-based studies. The exercise ends with students breaking down the research into the main components (e.g., research question, independent and dependent variables, hypotheses, predictions) to set a structure for critically reading other scientific studies or designing their own research question. 