Title: Phylogenomics and the evolution of hemipteroid insects

Hemipteroid insects (Paraneoptera), with over 10% of all known insect diversity, are a major component of terrestrial and aquatic ecosystems. Previous phylogenetic analyses have not consistently resolved the relationships among major hemipteroid lineages. We provide maximum likelihood-based phylogenomic analyses of a taxonomically comprehensive dataset comprising sequences of 2,395 single-copy, protein-coding genes for 193 samples of hemipteroid insects and outgroups. These analyses yield a well-supported phylogeny for hemipteroid insects. Monophyly of each of the three hemipteroid orders (Psocodea, Thysanoptera, and Hemiptera) is strongly supported, as are most relationships among suborders and families. Thysanoptera (thrips) is strongly supported as sister to Hemiptera. However, as in a recent large-scale analysis sampling all insect orders, trees from our data matrices support Psocodea (bark lice and parasitic lice) as the sister group to the holometabolous insects (those with complete metamorphosis). In contrast, four-cluster likelihood mapping of these data does not support this result. A molecular dating analysis using 23 fossil calibration points suggests hemipteroid insects began diversifying before the Carboniferous, over 365 million years ago. We also explore implications for understanding the timing of diversification, the evolution of morphological traits, and the evolution of mitochondrial genome organization. These results provide a phylogenetic framework for future studies of the group.

Journal Name:
Proceedings of the National Academy of Sciences
Page Range or eLocation-ID:
p. 12775-12780
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Relationships among the major lineages of Mollusca have long been debated. Morphological studies have considered the rarely collected Monoplacophora (Tryblidia) to have several plesiomorphic molluscan traits. The phylogenetic position of this group is contentious as morphologists have generally placed this clade as the sister taxon of the rest of Conchifera whereas earlier molecular studies supported a clade of Monoplacophora + Polyplacophora (Serialia) and phylogenomic studies have generally recovered a clade of Monoplacophora + Cephalopoda. Phylogenomic studies have also strongly supported a clade including Gastropoda, Bivalvia, and Scaphopoda, but relationships among these taxa have been inconsistent. In order to resolve conchiferan relationships and improve understanding of early molluscan evolution, we carefully curated a high-quality data matrix and conducted phylogenomic analyses with broad taxon sampling including newly sequenced genomic data from the monoplacophoran Laevipilina antarctica. Whereas a partitioned maximum likelihood (ML) analysis using site-homogeneous models recovered Monoplacophora sister to Cephalopoda with moderate support, both ML and Bayesian inference (BI) analyses using mixture models recovered Monoplacophora sister to all other conchiferans with strong support. A supertree approach also recovered Monoplacophora as the sister taxon of a clade composed of the rest of Conchifera. Gastropoda was recovered as the sister taxon of Scaphopoda in most analyses, which was strongly supported when mixture models were used. A molecular clock based on our BI topology dates diversification of Mollusca to ~546 MYA (+/− 6 MYA) and Conchifera to ~540 MYA (+/− 9 MYA), generally consistent with previous work employing nuclear housekeeping genes. These results provide important resolution of conchiferan mollusc phylogeny and offer new insights into ancestral character states of major mollusc clades.

  2. Yoshizawa, Kazunori (Ed.)
    Abstract The order Psocodea includes the two historically recognized groups Psocoptera (free-living bark lice) and Phthiraptera (parasitic lice) that were once considered separate orders. Psocodea is divided in three suborders: Trogiomorpha, Troctomorpha, and Psocomorpha, the latter being the largest within the free-living groups. Despite the increasing number of transcriptomes and whole genome sequence (WGS) data available for this group, the relationships among the six known infraorders within Psocomorpha remain unclear. Here, we evaluated the utility of a bait set designed specifically for parasitic lice belonging to suborder Troctomorpha to extract UCE loci from transcriptome and WGS data of 55 bark louse species and explored the phylogenetic relationships within Psocomorpha using these UCE loci markers. Taxon sampling was heavily focused on the families Lachesillidae and Elipsocidae, whose relationships have been problematic in prior phylogenetic studies. We successfully recovered a total of 2,622 UCE loci, with a 40% completeness matrix containing 2,081 UCE loci and an 80% completeness matrix containing 178 UCE loci. The average number of UCE loci recovered for the 55 species was 1,401. The WGS data sets produced a larger number of UCE loci (1,495) on average than the transcriptome data sets (972). Phylogenetic relationships reconstructed with Maximum Likelihood and coalescent-based analysis were concordant regarding the paraphyly of Lachesillidae and Elipsocidae. Branch support values were generally lower in analyses that used a fewer number of loci, even though they had higher matrix completeness.
  3. Buckley, Thomas (Ed.)
    Abstract The insect order Psocodea is a diverse lineage comprising both parasitic (Phthiraptera) and nonparasitic members (Psocoptera). The extreme age and ecological diversity of the group may be associated with major genomic changes, such as base compositional biases expected to affect phylogenetic inference. Divergent morphology between parasitic and nonparasitic members has also obscured the origins of parasitism within the order. We conducted a phylogenomic analysis on the order Psocodea utilizing both transcriptome and genome sequencing to obtain a data set of 2370 orthologous genes. All phylogenomic analyses, including both concatenated and coalescent methods suggest a single origin of parasitism within the order Psocodea, resolving conflicting results from previous studies. This phylogeny allows us to propose a stable ordinal level classification scheme that retains significant taxonomic names present in historical scientific literature and reflects the evolution of the group as a whole. A dating analysis, with internal nodes calibrated by fossil evidence, suggests an origin of parasitism that predates the K-Pg boundary. Nucleotide compositional biases are detected in third and first codon positions and result in the anomalous placement of the Amphientometae as sister to Psocomorpha when all nucleotide sites are analyzed. Likelihood-mapping and quartet sampling methods demonstrate that base compositional biases can also have an effect on quartet-based methods. [Illumina; Phthiraptera; Psocoptera; quartet sampling; recoding methods.]
  4. Abstract Background The most species-rich radiation of animal life in the 66 million years following the Cretaceous extinction event is that of schizophoran flies: a third of fly diversity including Drosophila fruit fly model organisms, house flies, forensic blow flies, agricultural pest flies, and many other well and poorly known true flies. Rapid diversification has hindered previous attempts to elucidate the phylogenetic relationships among major schizophoran clades. A robust phylogenetic hypothesis for the major lineages containing these 55,000 described species would be critical to understand the processes that contributed to the diversity of these flies. We use protein encoding sequence data from transcriptomes, including 3145 genes from 70 species, representing all superfamilies, to improve the resolution of this previously intractable phylogenetic challenge. Results Our results support a paraphyletic acalyptrate grade including a monophyletic Calyptratae and the monophyly of half of the acalyptrate superfamilies. The primary branching framework of Schizophora is well supported for the first time, revealing the primarily parasitic Pipunculidae and Sciomyzoidea stat. rev. as successive sister groups to the remaining Schizophora. Ephydroidea, Drosophila’s superfamily, is the sister group of Calyptratae. Sphaeroceroidea has modest support as the sister to all non-sciomyzoid Schizophora. We define two novel lineages corroborated by morphological traits, the ‘Modified Oviscapt Clade’ containing Tephritoidea, Nerioidea, and other families, and the ‘Cleft Pedicel Clade’ containing Calyptratae, Ephydroidea, and other families. Support values remain low among a challenging subset of lineages, including Diopsidae. The placement of these families remained uncertain in both concatenated maximum likelihood and multispecies coalescent approaches. Rogue taxon removal was effective in increasing support values compared with strategies that maximise gene coverage or minimise missing data. Conclusions Dividing most acalyptrate fly groups into four major lineages is supported consistently across analyses. Understanding the fundamental branching patterns of schizophoran flies provides a foundation for future comparative research on the genetics, ecology, and biocontrol.
  5. Abstract

    Contamination of a genetic sample with DNA from one or more nontarget species is a continuing concern of molecular phylogenetic studies, both Sanger sequencing studies and next-generation sequencing studies. We developed an automated pipeline for identifying and excluding likely cross-contaminated loci based on the detection of bimodal distributions of patristic distances across gene trees. When contamination occurs between samples within a data set, a comparison between a contaminated sample and its contaminant taxon will yield bimodal distributions with one peak close to zero patristic distance. This new method does not rely on a priori knowledge of taxon relatedness nor does it determine the cause(s) of the contamination. Exclusion of putatively contaminated loci from a data set generated for the insect family Cicadidae showed that these sequences were affecting some topological patterns and branch supports, although the effects were sometimes subtle, with some contamination-influenced relationships exhibiting strong bootstrap support. Long tip branches and outlier values for one anchored phylogenomic pipeline statistic (AvgNHomologs) were correlated with the presence of contamination. While the anchored hybrid enrichment markers used here, which target hemipteroid taxa, proved effective in resolving deep and shallow level Cicadidae relationships in aggregate, individual markers contained inadequate phylogenetic signal, in part probably due to short length. The cleaned data set, consisting of 429 loci, from 90 genera representing 44 of 56 current Cicadidae tribes, supported three of the four sampled Cicadidae subfamilies in concatenated-matrix maximum likelihood (ML) and multispecies coalescent-based species tree analyses, with the fourth subfamily weakly supported in the ML trees. No well-supported patterns from previous family-level Sanger sequencing studies of Cicadidae phylogeny were contradicted. One taxon (Aragualna plenalinea) did not fall with its current subfamily in the genetic tree, and this genus and its tribe Aragualnini is reclassified to Tibicininae following morphological re-examination. Only subtle differences were observed in trees after the removal of loci for which divergent base frequencies were detected. Greater success may be achieved by increased taxon sampling and developing a probe set targeting a more recent common ancestor and longer loci. Searches for contamination are an essential step in phylogenomic analyses of all kinds and our pipeline is an effective solution. [Auchenorrhyncha; base-composition bias; Cicadidae; Cicadoidea; Hemiptera; phylogenetic conflict.]
