Title: A pooled‐sample draft genome assembly provides insights into host plant‐specific transcriptional responses of a Solanaceae‐specializing pest, Tupiocoris notatus (Hemiptera: Miridae)
Abstract The assembly of genomes from pooled samples of genetically heterogenous samples of conspecifics remains challenging. In this study, we show that high‐quality genome assemblies can be produced from samples of multiple wild‐caught individuals. We sequenced DNA extracted from a pooled sample of conspecific herbivorous insects (Hemiptera: Miridae: Tupiocoris notatus) acquired from a greenhouse infestation in Tucson, Arizona (in the range of 30–100 individuals; 0.5 mL tissue by volume) using PacBio highly accurate long reads (HiFi). The initial assembly contained multiple haplotigs (>85% BUSCOs duplicated), but duplicate contigs could be easily purged to reveal a highly complete assembly (95.6% BUSCO, 4.4% duplicated) that is highly contiguous by short‐read assembly standards (N50 = 675 kb; Largest contig = 4.3 Mb). We then used our assembly as the basis for a genome‐guided differential expression study of host plant‐specific transcriptional responses. We found thousands of genes (N = 4982) to be differentially expressed between our new data from individuals feeding on Datura wrightii (Solanaceae) and existing RNA‐seq data from Nicotiana attenuata (Solanaceae)‐fed individuals. We identified many of these genes as previously documented detoxification genes such as glutathione‐S‐transferases, cytochrome P450s, and UDP‐glucosyltransferases. Together our results show that long‐read sequencing of pooled samples can provide a cost‐effective genome assembly option for small insects and can provide insights into the genetic mechanisms underlying interactions between plants and herbivorous pests.
Award ID(s):
2010772 2022055
PAR ID:
10507128
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Wiley
Date Published:
Journal Name:
Ecology and Evolution
Volume:
14
Issue:
3
ISSN:
2045-7758
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Eyre-Walker, Adam (Ed.)
    Abstract Our knowledge of the Major Histocompatibility Complex (MHC) in birds is limited because it often consists of numerous duplicated genes within individuals that are difficult to assemble with short read sequencing technology. Long-read sequencing provides an opportunity to overcome this limitation because it allows the assembly of long regions with repetitive elements. In this study, we used genomes based on long-read sequencing to predict the number and location of MHC loci in a broad range of bird taxa. From the long-read-based genomes of 34 species, we found that there was extremely large variation in the number of MHC loci between species. Overall, there were greater numbers of both class I and II loci in passerines than nonpasserines. The highest numbers of loci (up to 193 class II loci) were found in manakins (Pipridae), which had previously not been studied at the MHC. Our results provide the first direct evidence from passerine genomes of this high level of duplication. We also found different duplication patterns between species. In some species, both MHC class I and II genes were duplicated together, whereas in most species they were duplicated independently. Our study shows that the analysis of long-read-based genomes can dramatically improve our knowledge of MHC structure, although further improvements in chromosome level assembly are needed to understand the evolutionary mechanisms producing the extraordinary interspecific variation in the architecture of the MHC region. 
  2. Introduction Nosema is a diverse genus of unicellular microsporidian parasites of insects and other arthropods. Nosema muscidifuracis infects parasitoid wasp species of Muscidifurax zaraptor and M. raptor (Hymenoptera: Pteromalidae), causing ~50% reduction in longevity and ~90% reduction in fecundity. Methods and Results Here, we report the first assembly of the N. muscidifuracis genome (14,397,169 bp in 28 contigs) of high continuity (contig N50 544.3 Kb) and completeness (BUSCO score 97.0%). A total of 2,782 protein-coding genes were annotated, with 66.2% of the genes having two copies and 24.0% of genes having three copies. These duplicated genes are highly similar, with a sequence identity of 99.3%. The complex pattern suggests extensive gene duplications and rearrangements across the genome. We annotated 57 rDNA loci, which are highly GC-rich (37%) in a GC-poor genome (25% genome average). Nosema-specific qPCR primer sets were designed based on 18S rDNA annotation as a diagnostic tool to determine its titer in host samples. We discovered high Nosema titers in Nosema-cured M. raptor and M. zaraptor using heat treatment in 2017 and 2019, suggesting that the remedy did not completely eliminate the Nosema infection. Cytogenetic analyses revealed heavy infections of N. muscidifuracis within the ovaries of M. raptor and M. zaraptor, consistent with the titer determined by qPCR and suggesting a heritable component of infection and per ovum vertical transmission. Discussion The parasitoids-Nosema system is laboratory tractable and, therefore, can serve as a model to inform future genome manipulations of Nosema-host system for investigations of Nosemosis.
  3. Abstract Nesidiocoris tenuis (Reuter) is an efficient predatory biological control agent used throughout the Mediterranean Basin in tomato crops but regarded as a pest in northern European countries. From the family Miridae, it is an economically important insect yet very little is known in terms of genetic information and no genomic or transcriptomic studies have been published. Here, we use a linked‐read sequencing strategy on a single female N. tenuis. From this, we assembled the 355 Mbp genome and delivered an ab initio, homology‐based and evidence‐based annotation. Along the way, the bacterial “contamination” was removed from the assembly. In addition, bacterial lateral gene transfer (LGT) candidates were detected in the N. tenuis genome. The complete gene set is composed of 24 688 genes; the associated proteins were compared to other hemipterans (Cimex lectularis, Halyomorpha halys and Acyrthosiphon pisum). We visualized the genome using various cytogenetic techniques, such as karyotyping, CGH and GISH, indicating a karyotype of 2n = 32. Additional analyses include the localization of 18S rDNA and unique satellite probes as well as pooled sequencing to assess nucleotide diversity and neutrality of the commercial population. This is one of the first mirid genomes to be released and the first of a mirid biological control agent.
  4. Abstract For any genome-based research, a robust genome assembly is required. De novo assembly strategies have evolved with changes in DNA sequencing technologies and have been through at least three phases: i) short-read only, ii) short- and long-read hybrid, and iii) long-read only assemblies. Each of the phases has its own error model. We hypothesized that hidden scaffolding errors in short-read assembly and erroneous long-read contigs degrade the quality of short- and long-read hybrid assemblies. We assembled the genome of T. borchgrevinki from data generated during each of the three phases and assessed the quality problems we encountered. We developed strategies such as k-mer-assembled region replacement, parameter optimization, and long-read sampling to address the error models. We demonstrated that a k-mer based strategy improved short-read assemblies as measured by BUSCO while mate-pair libraries introduced hidden scaffolding errors and perturbed BUSCO scores. Further, we found that although hybrid assemblies can generate higher contiguity they tend to suffer from lower quality. In addition, we found long-read only assemblies can be optimized for contiguity by sub-sampling length-restricted raw reads. Our results indicate that long-read contig assembly is the current best choice and that assemblies from phase I and phase II were of lower quality.
  5. Abstract The sacred datura plant (Solanales: Solanaceae: Datura wrightii) has been used to study plant–herbivore interactions for decades. The wealth of information that has resulted leads it to have potential as a model system for studying the ecological and evolutionary genomics of these interactions. We present a de novo Datura wrightii genome assembled using PacBio HiFi long-reads. Our assembly is highly complete and contiguous (N50 = 179 Mb, BUSCO Complete = 97.6%). We successfully detected a previously documented ancient whole genome duplication using our assembly and have classified the gene duplication history that generated its coding sequence content. We use it as the basis for a genome-guided differential expression analysis to identify the induced responses of this plant to one of its specialized herbivores (Coleoptera: Chrysomelidae: Lema daturaphila). We find over 3000 differentially expressed genes associated with herbivory and that elevated expression levels of over 200 genes last for several days. We also combined our analyses to determine the role that different gene duplication categories have played in the evolution of Datura-herbivore interactions. We find that tandem duplications have expanded multiple functional groups of herbivore responsive genes with defensive functions, including UGT-glycosyltransferases, oxidoreductase enzymes, and peptidase inhibitors. Overall, our results expand our knowledge of herbivore-induced plant transcriptional responses and the evolutionary history of the underlying herbivore-response genes.