Title: Chromosome-scale genome assembly of the ‘Munstead’ cultivar of Lavandula angustifolia
Abstract

Objectives

Lavandula angustifolia (English lavender) is commercially important not only as an ornamental species but also as a major source of fragrances. To better understand the genomic basis of chemical diversity in lavender, we sequenced, assembled, and annotated the ‘Munstead’ cultivar of L. angustifolia.

Data description

A total of 80 Gb of Oxford Nanopore Technologies reads was used to assemble the ‘Munstead’ genome using the Canu genome assembler software. Following multiple rounds of error correction and scaffolding using Hi-C data, the final chromosome-scale assembly represents 795,075,733 bp across 25 chromosomes with an N50 scaffold length of 31,371,815 bp. Benchmarking Universal Single Copy Orthologs analysis revealed 98.0% complete orthologs, indicative of a high-quality assembly representative of genic space. Annotation of protein-coding sequences revealed 58,702 high-confidence genes encoding 88,528 gene models. Access to the ‘Munstead’ genome will permit comparative analyses within and among lavender accessions and provides a pivotal species for comparative analyses within Lamiaceae.
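The scaffold N50 reported above is the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of that calculation follows, assuming only a list of scaffold lengths; the function name `n50` and the toy lengths are illustrative and are not the ‘Munstead’ scaffold lengths, which are not listed in this record.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downward, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point halving.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, not taken from the assembly).
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

Because the statistic depends only on the length distribution, a high N50 indicates contiguity but says nothing about correctness, which is why it is reported alongside a completeness measure such as BUSCO.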

 
PAR ID:
10479531
Publisher / Repository:
Springer Science + Business Media
Journal Name:
BMC Genomic Data
Volume:
24
Issue:
1
ISSN:
2730-6844
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Objectives

    Petrea volubilis, a member of the Order Lamiales and the Verbenaceae family, is an important horticultural species that has been used in traditional folk medicine. To provide a genome sequence for comparative studies within the Order Lamiales that includes important families such as Lamiaceae (mints), we generated a long-read, chromosome-scale genome assembly of this species.

    Data description

    Using a total of 45.5 Gb of Pacific Biosciences long-read sequence, we generated a 480.2 Mb assembly of P. volubilis, of which 93% is chromosome-anchored. Representation of genic regions was robust, with 96.6% of the Benchmarking Universal Single-Copy Orthologs (BUSCO) present in the genome assembly. A total of 57.8% of the genome was annotated as repetitive sequence. Using a gene annotation pipeline that included refinement of gene models using transcript evidence, 30,982 high-confidence genes were annotated. Access to the P. volubilis genome will facilitate evolutionary studies in the Lamiales, a key order of Asterids that includes significant crop and medicinal plant species.

     
  2. Introduction

    Nosema is a diverse genus of unicellular microsporidian parasites of insects and other arthropods. Nosema muscidifuracis infects the parasitoid wasps Muscidifurax zaraptor and M. raptor (Hymenoptera: Pteromalidae), causing a ~50% reduction in longevity and a ~90% reduction in fecundity.

    Methods and Results

    Here, we report the first assembly of the N. muscidifuracis genome (14,397,169 bp in 28 contigs), which has high continuity (contig N50 of 544.3 kb) and completeness (BUSCO score of 97.0%). A total of 2,782 protein-coding genes were annotated, with 66.2% of the genes having two copies and 24.0% having three copies. These duplicated genes are highly similar, with a sequence identity of 99.3%. The complex pattern suggests extensive gene duplications and rearrangements across the genome. We annotated 57 rDNA loci, which are GC-rich (37%) relative to the GC-poor genome (25% genome average). Nosema-specific qPCR primer sets were designed based on the 18S rDNA annotation as a diagnostic tool to determine Nosema titer in host samples. We discovered high Nosema titers in M. raptor and M. zaraptor lines that had been treated with heat to cure Nosema in 2017 and 2019, suggesting that the remedy did not completely eliminate the Nosema infection. Cytogenetic analyses revealed heavy infections of N. muscidifuracis within the ovaries of M. raptor and M. zaraptor, consistent with the titers determined by qPCR and suggesting a heritable component of infection and per ovum vertical transmission.

    Discussion

    The parasitoid-Nosema system is laboratory-tractable and can therefore serve as a model to inform future genome manipulations of Nosema-host systems for investigations of nosemosis.

     
  3. Abstract

    Background

    The maize inbred line A188 is an attractive model for elucidation of gene function and improvement due to its high embryogenic capacity and many contrasting traits to the first maize reference genome, B73, and other elite lines. The lack of a genome assembly of A188 limits its use as a model for functional studies.

    Results

    Here, we present a chromosome-level genome assembly of A188 using long reads and optical maps. Comparison of A188 with B73 using both whole-genome alignments and read depths from sequencing reads identifies approximately 1.1 Gb of syntenic sequences as well as extensive structural variation, including a 1.8-Mb duplication containing the Gametophyte factor1 locus for unilateral cross-incompatibility, and six inversions of 0.7 Mb or greater. Increased copy number of carotenoid cleavage dioxygenase 1 (ccd1) in A188 is associated with elevated expression during seed development. High ccd1 expression in seeds, together with low expression of yellow endosperm 1 (y1), reduces carotenoid accumulation, accounting for the white seed phenotype of A188. Furthermore, transcriptome and epigenome analyses reveal enhanced expression of defense pathways and altered DNA methylation patterns of the embryonic callus.

    Conclusions

    The A188 genome assembly provides a high-resolution sequence for a complex genome species and a foundational resource for analyses of genome variation and gene function in maize. The genome, in comparison to B73, contains extensive intra-species structural variations and other genetic differences. Expression and network analyses identify discrete profiles for embryonic callus and other tissues.

     
  4. Abstract

    Vitis riparia, a critically important Native American grapevine species, is used globally in rootstock and scion breeding and contributed to the recovery of the French wine industry during the mid-19th century phylloxera epidemic. This species has abiotic and biotic stress tolerance and the largest natural geographic distribution of the North American grapevine species. Here we report an Illumina short-read, 369X-coverage, draft de novo heterozygous genome sequence of V. riparia Michx. ‘Manitoba 37’ with a size of ~495 Mb across 69,616 scaffolds and an N50 length of 518,740 bp. Using RNAseq data, 40,019 coding sequences were predicted and annotated. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis of predicted gene models found 96% of the complete BUSCOs in this assembly. The assembly continuity and completeness were further validated using V. riparia ESTs, BACs, and three de novo transcriptome assemblies of three different V. riparia genotypes, resulting in >98% of respective sequences/transcripts mapping to this assembly. Alignment of the V. riparia assembly and predicted CDS with the latest V. vinifera ‘PN40024’ CDS and genome assembly showed 99% CDS alignment and a high degree of synteny. An analysis of plant transcription factors indicates a high degree of homology with the V. vinifera transcription factors. QTL mapping to V. riparia ‘Manitoba 37’ and V. vinifera PN40024 has identified genetic relationships to phenotypic variation between species. This assembly provides reference sequences and gene models for marker development and for understanding V. riparia’s genetic contributions in grape breeding and research.

     
  5. Abstract

    Salvia hispanica L. (Chia), a member of the Lamiaceae, is an economically important crop in Mesoamerica, with health benefits associated with its seed fatty acid composition. Chia varieties are distinguished based on seed color, including mixed white and black (Chia pinta) and black (Chia negra). To facilitate research on Chia and expand on comparative analyses within the Lamiaceae, we generated a chromosome-scale assembly of a Chia pinta accession and performed comparative genome analyses with a previously published Chia negra genome assembly. The Chia pinta and Chia negra genome sequences were highly similar, as shown by a limited number of single nucleotide polymorphisms and extensive shared orthologous gene membership. However, there is an enrichment of terpene synthases in the Chia pinta genome relative to the Chia negra genome. We sequenced and analyzed the genomes of 20 Chia accessions with differing seed color and geographic origin, revealing population structure within S. hispanica and interspecific introgressions of Salvia species. As the genus Salvia is polyphyletic, its evolutionary history remains unclear. Using large-scale synteny analysis within the Lamiaceae and orthologous group membership, we resolved the phylogeny of Salvia species. This study and its collective resources further our understanding of genomic diversity in this food crop and the extent of interspecies hybridizations in Salvia.

     