Title: Chromosome-level genome assembly of Euphorbia peplus, a model system for plant latex, reveals that relative lack of Ty3 transposons contributed to its small genome size
Abstract Euphorbia peplus (petty spurge) is a small, fast-growing plant that is native to Eurasia and has become a naturalized weed in North America and Australia. E. peplus is not only medicinally valuable, serving as a source for the skin cancer drug ingenol mebutate, but also has great potential as a model for latex production owing to its small size, ease of manipulation in the laboratory, and rapid reproductive cycle. To help establish E. peplus as a new model, we generated a 267.2 Mb Hi-C-anchored PacBio HiFi nuclear genome assembly with a BUSCO score of 98.5%, a genome annotation based on RNA-seq data from six organs, and publicly accessible tools including a genome browser and an interactive organ-specific expression atlas. Chromosome number is highly variable across Euphorbia species. Using a comparative analysis of our newly sequenced E. peplus genome with other Euphorbiaceae genomes, we show that variation in Euphorbia chromosome number between E. peplus and E. lathyris is likely due to fragmentation and rearrangement rather than chromosomal duplication followed by diploidization of the duplicated sequence. Moreover, we found that the E. peplus genome is relatively compact compared to related members of the genus in part due to restricted expansion of the Ty3 transposon family. Finally, we identify a large gene cluster that contains many previously identified enzymes in the putative ingenol mebutate biosynthesis pathway, along with additional gene candidates for this biosynthetic pathway. The genomic resources we have created for E. peplus will help advance research on latex production and ingenol mebutate biosynthesis in the commercially important Euphorbiaceae family.
Award ID(s):
1942437
PAR ID:
10399409
Editor(s):
Slotte, Tanja
Journal Name:
Genome Biology and Evolution
ISSN:
1759-6653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Senna tora is a widely used medicinal plant. Its health benefits have been attributed to the large quantity of anthraquinones, but how they are made in plants remains a mystery. To identify the genes responsible for plant anthraquinone biosynthesis, we reveal the genome sequence of S. tora at the chromosome level with 526 Mb (96%) assembled into 13 chromosomes. Comparison among related plant species shows that a chalcone synthase-like (CHS-L) gene family has lineage-specifically and rapidly expanded in S. tora. Combining genomics, transcriptomics, metabolomics, and biochemistry, we identify a CHS-L gene contributing to the biosynthesis of anthraquinones. The S. tora reference genome will accelerate the discovery of biologically active anthraquinone biosynthesis pathways in medicinal plants.

     
  2. Within the arachnids, chromosome-level genome assemblies have greatly accelerated the understanding of gene family evolution and developmental genomics in key groups, such as spiders (Araneae), mites and ticks (Acariformes and Parasitiformes). Among other poorly studied arachnid lineages that lack genome assemblies altogether is the clade Pedipalpi, which comprises three orders that form the sister group of spiders, which diverged over 400 Mya. We close this gap by generating the first chromosome-level assembly from a single specimen of the vinegaroon Mastigoproctus giganteus (Uropygi). We show that this highly complete genome retains plesiomorphic conditions for many gene families that have undergone lineage-specific derivations within the more diverse spiders. Consistent with the phylogenetic position of Uropygi, macrosynteny in the M. giganteus genome substantiates the signature of an ancient whole genome duplication.
  3. Summary

    White oak (Quercus alba) is an abundant forest tree species across eastern North America that is ecologically, culturally, and economically important.

    We report the first haplotype‐resolved chromosome‐scale genome assembly of Q. alba and conduct comparative analyses of genome structure and gene content against other published Fagaceae genomes. We investigate the genetic diversity of this widespread species and the phylogenetic relationships among oaks using whole genome data.

    Despite strongly conserved chromosome synteny and genome size across Quercus, certain gene families have undergone rapid changes in size, including defense genes. Unbiased annotation of resistance (R) genes across oaks revealed that the overall number of R genes is similar across species, as are the chromosomal locations of R gene clusters, but gene number within clusters is more labile. We found that Q. alba has high genetic diversity, much of which predates its divergence from other oaks and likely impacts divergence time estimations. Our phylogenetic results highlight widespread phylogenetic discordance across the genus.

    The white oak genome represents a major new resource for studying genome diversity and evolution in Quercus. Additionally, we show that unbiased gene annotation is key to accurately assessing R gene evolution in Quercus.

     
  4. Abstract

    As a model organism for studies of cell and environmental biology, the free‐living and cosmopolitan ciliate Euplotes vannus shows intriguing features like dual genome architecture (i.e., separate germline and somatic nuclei in each cell/organism), “gene‐sized” chromosomes, stop codon reassignment, programmed ribosomal frameshifting (PRF) and strong resistance to environmental stressors. However, the molecular mechanisms that account for these remarkable traits remain largely unknown. Here we report a combined analysis of de novo assembled high‐quality macronuclear (MAC; i.e., somatic) and partial micronuclear (MIC; i.e., germline) genome sequences for E. vannus, and transcriptome profiling data under varying conditions. The results demonstrate that: (a) the MAC genome contains more than 25,000 complete “gene‐sized” nanochromosomes (~85 Mb haploid genome size) with the N50 ~2.7 kb; (b) although there is a high frequency of frameshifting at stop codons UAA and UAG, we did not observe impaired transcript abundance as a result of PRF in this species as has been reported for other euplotids; (c) the sequence motif 5′‐TA‐3′ is conserved at nearly all internally‐eliminated sequence (IES) boundaries in the MIC genome, and chromosome breakage sites (CBSs) are duplicated and retained in the MAC genome; (d) by profiling the weighted correlation network of genes in the MAC under different environmental stressors, including nutrient scarcity, extreme temperature, salinity and the presence of ammonia, we identified gene clusters that respond to these external physical or chemical stimulations, and (e) we observed a dramatic increase in HSP70 gene transcription under salinity and chemical stresses but surprisingly, not under temperature changes; we link this temperature‐resistance to the evolved loss of temperature stress‐sensitive elements in regulatory regions.
Together with the genome resources generated in this study, which are available online at Euplotes vannus Genome Database (http://evan.ciliate.org), these data provide molecular evidence for understanding the unique biology of highly adaptable microorganisms.

     
  5. Abstract Background

    Teak, a member of the Lamiaceae family, produces one of the most expensive hardwoods in the world. High demand coupled with deforestation has caused a decrease in natural teak forests, and future supplies will be reliant on teak plantations. Hence, selection of teak tree varieties for clonal propagation with superior growth performance is of great importance, and access to high-quality genetic and genomic resources can accelerate the selection process by identifying genes underlying desired traits.

    Findings

    To facilitate teak research and variety improvement, we generated a highly contiguous, chromosomal-scale genome assembly using high-coverage Pacific Biosciences long reads coupled with high-throughput chromatin conformation capture. Of the 18 teak chromosomes, we generated 17 near-complete pseudomolecules with one chromosome present as two chromosome arm scaffolds. Genome annotation yielded 31,168 genes encoding 46,826 gene models, of which, 39,930 and 41,155 had Pfam domain and expression evidence, respectively. We identified 14 clusters of tandem-duplicated terpene synthases (TPSs), genes central to the biosynthesis of terpenes, which are involved in plant defense and pollinator attraction. Transcriptome analysis revealed 10 TPSs highly expressed in woody tissues, of which, 8 were in tandem, revealing the importance of resolving tandemly duplicated genes and the quality of the assembly and annotation. We also validated the enzymatic activity of four TPSs to demonstrate the function of key TPSs.

    Conclusions

    In summary, this high-quality chromosomal-scale assembly and functional annotation of the teak genome will facilitate the discovery of candidate genes related to traits critical for sustainable production of teak and for anti-insecticidal natural products.

     