This content will become publicly available on December 1, 2024

Title: Pest status, molecular evolution, and epigenetic factors derived from the genome assembly of Frankliniella fusca, a thysanopteran phytovirus vector
Abstract Background The tobacco thrips (Frankliniella fusca Hinds; family Thripidae; order Thysanoptera) is an important pest that can transmit viruses such as the tomato spotted wilt orthotospovirus to numerous economically important agricultural row crops and vegetables. The structural and functional genomics within the order Thysanoptera has only begun to be explored. Within the >7000 known thysanopteran species, the melon thrips (Thrips palmi Karny) and the western flower thrips (Frankliniella occidentalis Pergande) are the only two with assembled genomes. Results A genome of F. fusca was assembled by long-read sequencing of DNA from an inbred line. The final assembly size was 370 Mb with a single copy ortholog completeness of ~99% with respect to Insecta. The annotated genome of F. fusca was compared with the genome of its congener, F. occidentalis. Results revealed many instances of lineage-specific differences in gene content. Analyses of sequence divergence between the two Frankliniella species' genomes revealed substitution patterns consistent with positive selection in ~5% of the protein-coding genes with 1:1 orthologs. Further, gene content related to its pest status, such as xenobiotic detoxification and response to an ambisense-tripartite RNA virus (orthotospovirus) infection, was compared with F. occidentalis. Several F. fusca genes related to virus infection possessed signatures of positive selection. Estimation of CpG depletion, a mutational consequence of DNA methylation, revealed that F. fusca genes that were downregulated and alternatively spliced in response to virus infection were preferentially targeted by DNA methylation. As in many other insects, DNA methylation was enriched in exons in Frankliniella, but gene copies with homology to DNA methyltransferase 3 were numerous and fragmented. This phenomenon seems to be relatively unique to thrips among insect groups. Conclusions The F. fusca genome assembly provides an important resource for comparative genomic analyses of thysanopterans. This genomic foundation allows for insights into molecular evolution, gene regulation, and loci important to agricultural pest status.
Award ID(s):
1755130
NSF-PAR ID:
10433391
Author(s) / Creator(s):
Date Published:
Journal Name:
BMC Genomics
Volume:
24
Issue:
1
ISSN:
1471-2164
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background The western flower thrips, Frankliniella occidentalis (Pergande), is a globally invasive pest and plant virus vector on a wide array of food, fiber, and ornamental crops. The underlying genetic mechanisms of the processes governing thrips pest and vector biology, feeding behaviors, ecology, and insecticide resistance are largely unknown. To address this gap, we present the F. occidentalis draft genome assembly and official gene set. Results We report on the first genome sequence for any member of the insect order Thysanoptera. Benchmarking Universal Single-Copy Ortholog (BUSCO) assessments of the genome assembly (size = 415.8 Mb, scaffold N50 = 948.9 kb) revealed a relatively complete and well-annotated assembly in comparison to other insect genomes. The genome is unusually GC-rich (50%) compared to other insect genomes to date. The official gene set (OGS v1.0) contains 16,859 genes, of which ~10% were manually verified and corrected by our consortium. We focused on manual annotation, phylogenetic, and expression evidence analyses for gene sets centered on primary themes in the life histories and activities of plant-colonizing insects. Highlights include the following: (1) divergent clades and large expansions in genes associated with environmental sensing (chemosensory receptors) and detoxification (CYP4, CYP6, and CCE enzymes) of substances encountered in agricultural environments; (2) a comprehensive set of salivary gland genes supported by enriched expression; (3) apparent absence of members of the IMD innate immune defense pathway; and (4) developmental- and sex-specific expression analyses of genes associated with progression from larvae to adulthood through neometaboly, a distinct form of maturation differing from either incomplete or complete metamorphosis in the Insecta. Conclusions Analysis of the F. occidentalis genome offers insights into the polyphagous behavior of this insect pest that finds, colonizes, and survives on a widely diverse array of plants. The genomic resources presented here enable a more complete analysis of insect evolution and biology, providing a missing taxon for contemporary insect genomics-based analyses. Our study also offers a genomic benchmark for molecular and evolutionary investigations of other Thysanoptera species.
  2. Abstract

    Leafhoppers comprise over 20,000 plant‐sap feeding species, many of which are important agricultural pests. Most species rely on two ancestral bacterial symbionts, Sulcia and Nasuia, for essential nutrition lacking in their phloem and xylem plant sap diets. To understand how pest leafhopper genomes evolve and are shaped by microbial symbioses, we completed a chromosomal‐level assembly of the aster leafhopper's genome (ALF; Macrosteles quadrilineatus). We compared ALF's genome to three other pest leafhoppers, Nephotettix cincticeps, Homalodisca vitripennis, and Empoasca onukii, which have distinct ecologies and symbiotic relationships. Despite diverging ~155 million years ago, leafhoppers have high levels of chromosomal synteny and gene family conservation. Conserved genes include those involved in plant chemical detoxification, resistance to various insecticides, and defence against environmental stress. Positive selection acting upon these genes further points to ongoing adaptive evolution in response to agricultural environments. In relation to leafhoppers' general dependence on symbionts, species that retain the ancestral symbiont, Sulcia, displayed gene enrichment of metabolic processes in their genomes. Leafhoppers with both Sulcia and its ancient partner, Nasuia, showed genomic enrichment in genes related to microbial population regulation and immune responses. Finally, horizontally transferred genes (HTGs) associated with symbiont support of Sulcia and Nasuia are only observed in leafhoppers that maintain symbionts. In contrast, HTGs involved in non‐symbiotic functions are conserved across all species. The high‐quality ALF genome provides deep insights into how host ecology and symbioses shape genome evolution and a wealth of genetic resources for pest control targets.

     
  3. ABSTRACT Viral infection exerts selection pressure on marine microbes, as virus-induced cell lysis causes 20 to 50% of cell mortality, resulting in fluxes of biomass into oceanic dissolved organic matter. Archaeal and bacterial populations can defend against viral infection using the clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) system, which relies on specific matching between a spacer sequence and a viral gene. If a CRISPR spacer match to any gene within a viral genome is equally effective in preventing lysis, no viral genes should be preferentially matched by CRISPR spacers. However, if there are differences in effectiveness, certain viral genes may demonstrate a greater frequency of CRISPR spacer matches. Indeed, homology search analyses of bacterioplankton CRISPR spacer sequences against virioplankton sequences revealed preferential matching of replication proteins, nucleic acid binding proteins, and viral structural proteins. Positive selection pressure for effective viral defense is one parsimonious explanation for these observations. CRISPR spacers from virioplankton metagenomes preferentially matched methyltransferase and phage integrase genes within virioplankton sequences. These virioplankton CRISPR spacers may assist infected host cells in defending against competing phage. Analyses also revealed that half of the spacer-matched viral genes were unknown, some genes matched several spacers, and some spacers matched multiple genes, a many-to-many relationship. Thus, CRISPR spacer matching may be an evolutionary algorithm, agnostically identifying those genes under stringent selection pressure for sustaining viral infection and lysis. Investigating this subset of viral genes could reveal those genetic mechanisms essential to virus-host interactions and provide new technologies for optimizing CRISPR defense in beneficial microbes. 
IMPORTANCE The CRISPR-Cas system is one means by which bacterial and archaeal populations defend against viral infection which causes 20 to 50% of cell mortality in the ocean. We tested the hypothesis that certain viral genes are preferentially targeted for the initial attack of the CRISPR-Cas system on a viral genome. Using CASC, a pipeline for CRISPR spacer discovery, and metagenome data from oceanic microbes and viruses, we found a clear subset of viral genes with high match frequencies to CRISPR spacers. Moreover, we observed a many-to-many relationship of spacers and viral genes. These high-match viral genes were involved in nucleotide metabolism, DNA methylation, and viral structure. It is possible that CRISPR spacer matching is an evolutionary algorithm pointing to those viral genes most important to sustaining infection and lysis. Studying these genes may advance the understanding of virus-host interactions in nature and provide new technologies for leveraging CRISPR-Cas systems in beneficial microbes. 
  4. SUMMARY

    The Pacific crabapple (Malus fusca) is a wild relative of the commercial apple (Malus × domestica). With a range extending from Alaska to Northern California, M. fusca is extremely hardy and disease resistant. The species represents an untapped genetic resource for the development of new apple cultivars with enhanced stress resistance. However, gene discovery and utilization of M. fusca have been hampered by the lack of genomic resources. Here, we present a high‐quality, haplotype‐resolved, chromosome‐scale genome assembly and annotation for M. fusca. The genome was assembled using high‐fidelity long‐reads and scaffolded using genetic maps and high‐throughput chromatin conformation capture sequencing, resulting in one of the most contiguous apple genomes to date. We annotated the genome using public transcriptomic data from the same species taken from diverse plant structures and developmental stages. Using this assembly, we explored haplotypic structural variation within the genome of M. fusca, identifying thousands of large variants. We further showed high sequence co‐linearity with other domesticated and wild Malus species. Finally, we resolve a known quantitative trait locus associated with resistance to fire blight (Erwinia amylovora). Insights gained from the assembly of a reference‐quality genome of this hardy wild apple relative will be invaluable as a tool to facilitate DNA‐informed introgression breeding.

     
  5. INTRODUCTION Transposable elements (TEs), repeat expansions, and repeat-mediated structural rearrangements play key roles in chromosome structure and species evolution, contribute to human genetic variation, and substantially influence human health through copy number variants, structural variants, insertions, deletions, and alterations to gene transcription and splicing. Despite their formative role in genome stability, repetitive regions have been relegated to gaps and collapsed regions in human genome reference GRCh38 owing to the technological limitations during its development. The lack of linear sequence in these regions, particularly in centromeres, resulted in the inability to fully explore the repeat content of the human genome in the context of both local and regional chromosomal environments. RATIONALE Long-read sequencing supported the complete, telomere-to-telomere (T2T) assembly of the pseudo-haploid human cell line CHM13. This resource affords a genome-scale assessment of all human repetitive sequences, including TEs and previously unknown repeats and satellites, both within and outside of gaps and collapsed regions. Additionally, a complete genome enables the opportunity to explore the epigenetic and transcriptional profiles of these elements that are fundamental to our understanding of chromosome structure, function, and evolution. Comparative analyses reveal modes of repeat divergence, evolution, and expansion or contraction with locus-level resolution. RESULTS We implemented a comprehensive repeat annotation workflow using previously known human repeats and de novo repeat modeling followed by manual curation, including assessing overlaps with gene annotations, segmental duplications, tandem repeats, and annotated repeats. Using this method, we developed an updated catalog of human repetitive sequences and refined previous repeat annotations. 
We discovered 43 previously unknown repeats and repeat variants and characterized 19 complex, composite repetitive structures, which often carry genes, across T2T-CHM13. Using precision nuclear run-on sequencing (PRO-seq) and CpG methylated sites generated from Oxford Nanopore Technologies long-read sequencing data, we assessed RNA polymerase engagement across retroelements genome-wide, revealing correlations between nascent transcription, sequence divergence, CpG density, and methylation. These analyses were extended to evaluate RNA polymerase occupancy for all repeats, including high-density satellite repeats that reside in previously inaccessible centromeric regions of all human chromosomes. Moreover, using both mapping-dependent and mapping-independent approaches across early developmental stages and a complete cell cycle time series, we found that engaged RNA polymerase across satellites is low; in contrast, TE transcription is abundant and serves as a boundary for changes in CpG methylation and centromere substructure. Together, these data reveal the dynamic relationship between transcriptionally active retroelement subclasses and DNA methylation, as well as potential mechanisms for the derivation and evolution of new repeat families and composite elements. Focusing on the emerging T2T-level assembly of the HG002 X chromosome, we reveal that a high level of repeat variation likely exists across the human population, including composite element copy numbers that affect gene copy number. Additionally, we highlight the impact of repeats on the structural diversity of the genome, revealing repeat expansions with extreme copy number differences between humans and primates while also providing high-confidence annotations of retroelement transduction events. 
CONCLUSION The comprehensive repeat annotations and updated repeat models described herein serve as a resource for expanding the compendium of human genome sequences and reveal the impact of specific repeats on the human genome. In developing this resource, we provide a methodological framework for assessing repeat variation within and between human genomes. The exhaustive assessment of the transcriptional landscape of repeats, at both the genome scale and locally, such as within centromeres, sets the stage for functional studies to disentangle the role transcription plays in the mechanisms essential for genome stability and chromosome segregation. Finally, our work demonstrates the need to increase efforts toward achieving T2T-level assemblies for nonhuman primates and other species to fully understand the complexity and impact of repeat-derived genomic innovations that define primate lineages, including humans. Telomere-to-telomere assembly of CHM13 supports repeat annotations and discoveries. The human reference T2T-CHM13 filled gaps and corrected collapsed regions (triangles) in GRCh38. Combining long read–based methylation calls, PRO-seq, and multilevel computational methods, we provide a compendium of human repeats, define retroelement expression and methylation profiles, and delineate locus-specific sites of nascent transcription genome-wide, including previously inaccessible centromeres. SINE, short interspersed element; SVA, SINE–variable number tandem repeat– Alu ; LINE, long interspersed element; LTR, long terminal repeat; TSS, transcription start site; pA, xxxxxxxxxxxxxxxx. 