Title: High quality genomes produced from single MinION flow cells clarify polyploid and demographic histories of critically endangered Fraxinus (ash) species
Abstract With populations of threatened and endangered species declining worldwide, efforts are being made to generate high quality genomic records of these species before they are lost forever. Here, we demonstrate that data from single Oxford Nanopore Technologies (ONT) MinION flow cells can, even in the absence of highly accurate short DNA-read polishing, produce high quality de novo plant genome assemblies adequate for downstream analyses, such as synteny and ploidy evaluations, paleodemographic analyses, and phylogenomics. This study focuses on three North American ash tree species in the genus Fraxinus (Oleaceae) that were recently added to the International Union for Conservation of Nature (IUCN) Red List as critically endangered. Our results support a hexaploidy event at the base of the Oleaceae as well as a subsequent whole genome duplication shared by Syringa, Osmanthus, Olea, and Fraxinus. Finally, we demonstrate the use of ONT long-read sequencing data to reveal patterns in demographic history.
Award ID(s):
2139311
PAR ID:
10484561
Publisher / Repository:
Nature Publishing Group
Journal Name:
Communications Biology
Volume:
7
Issue:
1
ISSN:
2399-3642
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The combination of ultra-long (UL) Oxford Nanopore Technologies (ONT) sequencing reads with long, accurate Pacific Biosciences (PacBio) High Fidelity (HiFi) reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, “telomere-to-telomere” genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT “Duplex” sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used “Pore-C” chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the UL reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and provides a multirun single-instrument solution for the reconstruction of complete genomes.
  2. Choosing the optimum assembly approach is essential to achieving a high-quality genome assembly suitable for comparative and evolutionary genomic investigations. Significant recent progress in long-read sequencing technologies such as PacBio and Oxford Nanopore Technologies (ONT) has also brought about a large variety of assemblers. Although these have been extensively tested on model species such as Homo sapiens and Drosophila melanogaster, such benchmarking has not been done in Mollusca, which lacks widely adopted model species. Molluscan genomes are notoriously rich in repeats and are often highly heterozygous, making their assembly challenging. Here, we benchmarked 10 assemblers based on ONT raw reads from two published molluscan genomes of differing properties, the gastropod Chrysomallon squamiferum (356.6 Mb, 1.59% heterozygosity) and the bivalve Mytilus coruscus (1593 Mb, 1.94% heterozygosity). By optimizing the assembly pipeline, we greatly improved both genomes from previously published versions. Our results suggested that 40–50X of ONT reads are sufficient for high-quality genomes, with Flye being the recommended assembler for compact and less heterozygous genomes exemplified by C. squamiferum, while NextDenovo excelled for more repetitive and heterozygous molluscan genomes exemplified by M. coruscus. A phylogenomic analysis using the two updated genomes with 32 other published high-quality lophotrochozoan genomes resulted in maximum support across all nodes, and we show that improved genome quality also leads to more complete matrices for phylogenomic inferences. Our benchmarking will ensure efficiency in future assemblies for molluscs and perhaps also for other marine phyla with few genomes available. This article is part of the Theo Murphy meeting issue ‘Molluscan genomics: broad insights and future directions for a neglected phylum’.
  3. Abstract Although plastid genome (plastome) structure is highly conserved across most seed plants, investigations during the past two decades have revealed several disparately related lineages that experienced substantial rearrangements. Most plastomes contain a large inverted repeat and two single‐copy regions, and a few dispersed repeats; however, the plastomes of some taxa harbour long repeat sequences (>300 bp). These long repeats make it challenging to assemble complete plastomes using short‐read data, leading to misassemblies and consensus sequences with spurious rearrangements. Single‐molecule, long‐read sequencing has the potential to overcome these challenges, yet there is no consensus on the most effective method for accurately assembling plastomes using long‐read data. We generated a pipeline, plastid Genome Assembly Using Long‐read data (ptGAUL), to address the problem of plastome assembly using long‐read data from Oxford Nanopore Technologies (ONT) or Pacific Biosciences platforms. We demonstrated the efficacy of the ptGAUL pipeline using 16 published long‐read data sets. We showed that ptGAUL quickly produces accurate and unbiased assemblies using only ~50× coverage of plastome data. Additionally, we deployed ptGAUL to assemble four new Juncus (Juncaceae) plastomes using ONT long reads. Our results revealed many long repeats and rearrangements in Juncus plastomes compared with basal lineages of Poales. The ptGAUL pipeline is available on GitHub: https://github.com/Bean061/ptgaul.
  4. Abstract Objective: Whole genome sequencing (WGS) can help identify transmission of pathogens causing healthcare-associated infections (HAIs). However, the current gold standard of short-read, Illumina-based WGS is labor and time intensive. Given recent improvements in long-read Oxford Nanopore Technologies (ONT) sequencing, we sought to establish a low resource approach providing accurate WGS-pathogen comparison within a time frame allowing for infection prevention and control (IPC) interventions. Methods: WGS was prospectively performed on pathogens at increased risk of potential healthcare transmission using the ONT MinION sequencer with R10.4.1 flow cells and Dorado basecaller. Potential transmission was assessed via Ridom SeqSphere+ for core genome multilocus sequence typing and MINTyper for reference-based core genome single nucleotide polymorphisms using previously published cutoff values. The accuracy of our ONT pipeline was determined relative to Illumina. Results: Over a six-month period, 242 bacterial isolates from 216 patients were sequenced by a single operator. Compared to the Illumina gold standard, our ONT pipeline achieved a mean identity score of Q60 for assembled genomes, even with a coverage rate as low as 40×. The mean time from initiating DNA extraction to complete analysis was 2 days (IQR 2–3.25 days). We identified five potential transmission clusters comprising 21 isolates (8.7% of sequenced strains). Integrating ONT with epidemiological data, >70% (15/21) of putative transmission cluster isolates originated from patients with potential healthcare transmission links. Conclusions: Via a stand-alone ONT pipeline, we detected potentially transmitted HAI pathogens rapidly and accurately, aligning closely with epidemiological data. Our low-resource method has the potential to assist in IPC efforts.
  5. Abstract The symbiosis between clownfish and giant tropical sea anemones (Order Actiniaria) is one of the most iconic on the planet. Distributed on tropical reefs, 28 species of clownfishes form obligate mutualistic relationships with 10 nominal species of venomous sea anemones. Our understanding of the symbiosis is limited by the fact that most research has been focused on the clownfishes. Chromosome-scale reference genomes are available for all clownfish species, yet there are no published reference genomes for the host sea anemones. Recent studies have shown that the clownfish-hosting sea anemones belong to three distinct clades of sea anemones that have evolved symbiosis with clownfishes independently. Here we present the first high-quality long-read assemblies for three species of clownfish-hosting sea anemones, one from each of these clades: Entacmaea quadricolor, Stichodactyla haddoni, and Radianthus doreensis. PacBio HiFi sequencing yielded 1,597,562, 3,101,773, and 1,918,148 reads for E. quadricolor, S. haddoni, and R. doreensis, respectively. All three assemblies were highly contiguous and complete, with N50 values above 4 Mb and BUSCO completeness above 95% on the Metazoa dataset. Genome structural annotation with BRAKER3 predicted 20,454, 18,948, and 17,056 protein-coding genes in the E. quadricolor, S. haddoni, and R. doreensis genomes, respectively. These new resources will form the basis of comparative genomic analyses that will allow us to deepen our understanding of this mutualism from the host perspective.
    Significance: Chromosome-scale genomes are available for all 28 clownfish species, yet there are no high-quality reference genomes published for the clownfish-hosting sea anemones. The lack of genomic resources impedes our ability to understand the evolution of this iconic symbiosis from the host perspective. The clownfish-hosting sea anemones belong to three clades of sea anemones that have evolved mutualism with clownfish independently. Here we assembled the first high-quality long-read genomes for three species of host sea anemones, each belonging to a different host clade: Entacmaea quadricolor, Stichodactyla haddoni, and Radianthus doreensis. These resources will enable in-depth comparative genomics of clownfish-hosting sea anemones, providing a critical perspective for understanding how the symbiosis has evolved. Finally, these reference genomes represent a significant increase in the number of high-quality long-read genome assemblies for sea anemones (11 currently published) and double the number of high-quality reference genomes for the sea anemone superfamily Actinioidea.