


This content will become publicly available on July 1, 2024

Title: To design, or not to design? Comparison of beetle ultraconserved element probe set utility based on phylogenetic distance, breadth, and method of probe design

Tailoring ultraconserved element (UCE) probe set design to focal taxa has been demonstrated to improve locus recovery and phylogenomic inference. However, beyond conducting expensive in vitro testing, it remains unclear how best to determine whether an existing UCE probe set is likely to suffice for phylogenomic inference or whether tailored probe design will be desirable. Here we investigate the utility of 8 different UCE probe sets for the in silico phylogenomic inference of scarabaeoid beetles. The probe sets tested differed in terms of (i) the phylogenetic distance between Scarabaeoidea and the taxa used during probe design, (ii) the breadth of phylogenetic inference the probe set was designed for, and (iii) the method of probe design. As part of this study, 2 new UCE probe sets are produced, one for the beetle family Scarabaeidae and one for the superfamily Hydrophiloidea. We confirm that probe set utility decreases with increasing phylogenetic distance from target taxa. In addition, narrowing the phylogenetic breadth of probe design decreases the phylogenetic capture range. We also confirm previous findings regarding ways to optimize UCE probe design. Finally, we make suggestions regarding how to assess the need for de novo probe design.

Author(s) / Creator(s):
; ; ; ;
Marvaldi, Adriana
Publisher / Repository:
Oxford Academics
Journal Name:
Insect Systematics and Diversity
Medium: X
Sponsoring Org:
National Science Foundation
More Like this

  1. Adephaga is the second largest suborder of beetles (Coleoptera) and they serve as important arthropod predators in both aquatic and terrestrial ecosystems. The suborder is divided into Geadephaga comprising terrestrial families and Hydradephaga for aquatic lineages. Despite numerous studies, phylogenetic relationships among the adephagan families and monophyly of the Hydradephaga itself remain in question. Here we conduct a comprehensive phylogenomic analysis of the suborder using ultraconserved elements (UCEs). This study presents the first in vitro test of a newly developed UCE probe set customized for use within Adephaga that includes both probes tailored specifically for the suborder, alongside generalized Coleoptera probes previously found to work in adephagan taxa. We assess the utility of the entire probe set, as well as comparing the tailored and generalized probes alone for reconstructing evolutionary relationships. Our analyses recovered strong support for the paraphyly of Hydradephaga with whirligig beetles (Gyrinidae) placed as sister to all other adephagan families. Geadephaga was strongly supported as monophyletic and placed sister to a clade composed of Haliplidae + Dytiscoidea. Monophyly of Dytiscoidea was strongly supported with relationships among the dytiscoid families resolved and strongly supported. Relationships among the subfamilies of Dytiscidae were strongly supported but largely incongruent with prior phylogenetic estimates for the family. The results of our UCE probe comparison showed that tailored probes alone outperformed generalized probes alone, as well as the full combined probe set (containing both types of probes), under decreased taxon sampling. When taxon sampling was increased, the full combined probe set outperformed both tailored probes and generalized probes alone. This study provides further evidence that UCE probe sets customized for a focal group result in a greater number of recovered loci and substantially improve phylogenomic analysis.

  2. This repository contains materials and designed UCE probe sets for the manuscript entitled "To design or not to design? Comparison of beetle ultraconserved element probe set utility based on phylogenetic distance, breadth, and method of probe design".

  3. Abstract

    Next‐generation sequencing technologies (NGS) allow systematists to amass a wealth of genomic data from non‐model species for phylogenetic resolution at various temporal scales. However, phylogenetic inference for many lineages dominated by non‐model species has not yet benefited from NGS, which can complement Sanger sequencing studies. One such lineage, whose phylogenetic relationships remain uncertain, is the diverse, agriculturally important and charismatic Coreoidea (Hemiptera: Heteroptera). Given the lack of consensus on higher‐level relationships and the importance of a robust phylogeny for evolutionary hypothesis testing, we use a large data set comprised of hundreds of ultraconserved element (UCE) loci to infer the phylogeny of Coreoidea (excluding Stenocephalidae and Hyocephalidae), with emphasis on the families Coreidae and Alydidae. We generated three data sets by including alignments that contained loci sampled for at least 50%, 60%, or 70% of the total taxa, and inferred phylogeny using maximum likelihood and summary coalescent methods. Twenty‐six external morphological features used in relatively comprehensive phylogenetic analyses of coreoids were also re‐evaluated within our molecular phylogenetic framework. We recovered 439–970 loci per species (16%–36% of loci targeted) and combined this with previously generated UCE data for 12 taxa. All data sets, regardless of analytical approach, yielded topologically similar and strongly supported trees, with the exception of outgroup relationships and the position of Hydarinae. We recovered a monophyletic Coreoidea, with Rhopalidae highly supported as the sister group to Alydidae + Coreidae. Neither Alydidae nor Coreidae were monophyletic; the coreid subfamilies Hydarinae and Pseudophloeinae were recovered as more closely related to Alydidae than to other coreid subfamilies. Coreinae were paraphyletic with respect to Meropachyinae. Most morphological traits were homoplastic with several clades defined by few, if any, synapomorphies. Our results demonstrate the utility of phylogenomic approaches in generating robust hypotheses for taxa with long‐standing phylogenetic problems and highlight that novel insights may come from such approaches.

  4. Premise

    Cornales is an order of flowering plants containing ecologically and horticulturally important families, including Cornaceae (dogwoods) and Hydrangeaceae (hydrangeas), among others. While many relationships in Cornales are strongly supported by previous studies, some uncertainty remains with regards to the placement of Hydrostachyaceae and to relationships among families in Cornales and within Cornaceae. Here we analyzed hundreds of nuclear loci to test published phylogenetic hypotheses and estimated a robust species tree for Cornales.


    Using the Angiosperms353 probe set and existing data sets, we generated phylogenomic data for 158 samples, representing all families in the Cornales, with intensive sampling in the Cornaceae.


    We curated an average of 312 genes per sample, constructed maximum likelihood gene trees, and inferred a species tree using the summary approach implemented in ASTRAL‐III, a method statistically consistent with the multispecies coalescent model.


    The species tree we constructed generally shows high support values and a high degree of concordance among individual nuclear gene trees. Relationships among families are largely congruent with previous molecular studies, except for the placement of the nyssoids and the Grubbiaceae‐Curtisiaceae clades. Furthermore, we were able to place Hydrostachyaceae within Cornales, and within Cornaceae, the monophyly of known morphogroups was well supported. However, patterns of gene tree discordance suggest potential ancient reticulation, gene flow, and/or ILS in the Hydrostachyaceae lineage and the early diversification of Cornus. Our findings reveal new insights into the diversification process across Cornales and demonstrate the utility of the Angiosperms353 probe set.


  5. New sequencing technologies facilitate the generation of large‐scale molecular data sets for constructing the plant tree of life. We describe a new probe set for target enrichment sequencing to generate nuclear sequence data to build phylogenetic trees with any flagellate land plants, including hornworts, liverworts, mosses, lycophytes, ferns, and all gymnosperms.


    We leveraged existing transcriptome and genome sequence data to design the GoFlag 451 probes, a set of 56,989 probes for target enrichment sequencing of 451 exons that are found in 248 single‐copy or low‐copy nuclear genes across flagellate plant lineages.


    Our results indicate that target enrichment using the GoFlag 451 probe set can provide large nuclear data sets that can be used to resolve relationships among both distantly and closely related taxa across the flagellate land plants. We also describe the GoFlag 408 probes, an optimized probe set covering 408 of the 451 exons from the GoFlag 451 probe set that is commercialized by RAPiD Genomics.


    A target enrichment approach using the new probe set provides a relatively low‐cost solution to obtain large‐scale nuclear sequence data for inferring phylogenetic relationships across flagellate land plants.
