
Title: A Beary Good Genome: Haplotype-Resolved, Chromosome-Level Assembly of the Brown Bear (Ursus arctos)

The brown bear (Ursus arctos) is the second largest and most widespread extant terrestrial carnivore on Earth and has recently emerged as a medical model for human metabolic diseases. Here, we report a fully phased chromosome-level assembly of a male North American brown bear built by combining Pacific Biosciences (PacBio) HiFi data and publicly available Hi-C data. The final genome size is 2.47 Gigabases (Gb), with scaffold and contig N50 lengths of 70.08 and 43.94 Megabases (Mb), respectively. Benchmarking Universal Single-Copy Ortholog (BUSCO) analysis revealed that 94.5% of single-copy orthologs from Mammalia were present in the genome (the highest of any ursid genome to date). Repetitive elements accounted for 44.48% of the genome, and a total of 20,480 protein-coding genes were identified. Whole-genome alignment shows that the brown bear genome is highly syntenic with that of the polar bear, and our phylogenetic analysis of 7,246 single-copy orthologs supports the currently proposed species tree for Ursidae. This highly contiguous genome assembly will support future research on both the evolutionary history of the bear family and the physiological mechanisms behind hibernation, the latter of which has broad medical implications.
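The scaffold and contig N50 values quoted above follow the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The sketch below illustrates that definition only; the function name `n50` and the toy lengths are hypothetical and are not taken from the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and accumulate them; the N50 is
    the length at which the running sum first reaches half of the total.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy contig lengths (illustrative values, not from the brown bear assembly).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

Applied to the real list of contig or scaffold lengths in base pairs, the same function would give the contig or scaffold N50, respectively.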

Journal Name: Genome Biology and Evolution
Publisher: Oxford University Press
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Vitis riparia, a critically important Native American grapevine species, is used globally in rootstock and scion breeding and contributed to the recovery of the French wine industry during the mid-19th century phylloxera epidemic. This species has abiotic and biotic stress tolerance and the largest natural geographic distribution of the North American grapevine species. Here we report an Illumina short-read 369X coverage, draft de novo heterozygous genome sequence of V. riparia Michx. ‘Manitoba 37’ with a size of ~495 Mb for 69,616 scaffolds and an N50 length of 518,740 bp. Using RNAseq data, 40,019 coding sequences were predicted and annotated. Benchmarking with Universal Single-Copy Orthologs (BUSCO) analysis of predicted gene models found 96% of the complete BUSCOs in this assembly. The assembly continuity and completeness were further validated using V. riparia ESTs, BACs, and three de novo transcriptome assemblies of three different V. riparia genotypes, resulting in >98% of respective sequences/transcripts mapping with this assembly. Alignment of the V. riparia assembly and predicted CDS with the latest V. vinifera ‘PN40024’ CDS and genome assembly showed 99% CDS alignment and a high degree of synteny. An analysis of plant transcription factors indicates a high degree of homology with the V. vinifera transcription factors. QTL mapping to V. riparia ‘Manitoba 37’ and V. vinifera PN40024 has identified genetic relationships to phenotypic variation between species. This assembly provides reference sequences and gene models for marker development and for understanding V. riparia’s genetic contributions in grape breeding and research.

  2. Abstract Objectives

    Petrea volubilis, a member of the Order Lamiales and the Verbenaceae family, is an important horticultural species that has been used in traditional folk medicine. To provide a genome sequence for comparative studies within the Order Lamiales that includes important families such as Lamiaceae (mints), we generated a long-read, chromosome-scale genome assembly of this species.

    Data description

    Using a total of 45.5 Gb of Pacific Biosciences long-read sequence, we generated a 480.2 Mb assembly of P. volubilis, of which 93% is chromosome anchored. Representation of genic regions was robust, with 96.6% of the Benchmarking Universal Single-Copy Orthologs present in the genome assembly. A total of 57.8% of the genome was annotated as repetitive sequence. Using a gene annotation pipeline that included refinement of gene models using transcript evidence, 30,982 high-confidence genes were annotated. Access to the P. volubilis genome will facilitate evolutionary studies in the Lamiales, a key order of Asterids that includes significant crop and medicinal plant species.

  3. Abstract

    The plant genus Bidens (Asteraceae or Compositae; Coreopsidae) is a species-rich and circumglobally distributed taxon. The 19 hexaploid species endemic to the Hawaiian Islands are considered an iconic example of adaptive radiation, of which many are imperiled and of high conservation concern. Until now, no genomic resources were available for this genus, which may serve as a model system for understanding the evolutionary genomics of explosive plant diversification. Here, we present a high-quality reference genome for the Hawaiʻi Island endemic species B. hawaiensis A. Gray reconstructed from long-read, high-fidelity sequences generated on a Pacific Biosciences Sequel II System. The haplotype-aware, draft genome assembly consisted of ~6.67 Gigabases (Gb), close to the holoploid genome size estimate of 7.56 Gb (±0.44 SD) determined by flow cytometry. After removal of alternate haplotigs and contaminant filtering, the consensus haploid reference genome was comprised of 15 904 contigs containing ~3.48 Gb, with a contig N50 value of 422 594. The high interspersed repeat content of the genome, approximately 74%, along with hexaploid status, contributed to assembly fragmentation. Both the haplotype-aware and consensus haploid assemblies recovered >96% of Benchmarking Universal Single-Copy Orthologs. Yet, the removal of alternate haplotigs did not substantially reduce the proportion of duplicated benchmarking genes (~79% vs. ~68%). This reference genome will support future work on the speciation process during adaptive radiation, including resolving evolutionary relationships, determining the genomic basis of trait evolution, and supporting ongoing conservation efforts.

  4. Abstract

    Certain cultivars of maize show increased tolerance to water deficit conditions by maintenance of root growth. To better understand the molecular mechanisms related to this adaptation, nodal root growth zone samples were collected from the reference inbred line B73 and inbred line FR697, which exhibits a relatively greater ability to maintain root elongation under water deficits. Plants were grown under various water stress levels in both field and controlled environment settings. FR697-specific RNA-Seq datasets were generated and used for a de novo transcriptome assembly to characterize any genotype-specific genetic features. The assembly was aided by an Iso-Seq library of transcripts generated from various FR697 plant tissue samples. The Necklace pipeline was used to combine a Trinity de novo assembly with a reference-guided assembly and the Viridiplantae proteome to generate an annotated consensus ‘SuperTranscriptome’ assembly of 47,915 transcripts with an N50 of 3152 bp. The results were compared by Blastn to maize reference genes, assessed with a Benchmarking Universal Single-Copy Orthologs (BUSCO) genome completeness report, and compared with three maize reference genomes. The resultant ‘SuperTranscriptome’ was demonstrated to be of high quality and will serve as an important reference for analysis of the maize nodal root transcriptomic response to environmental perturbations.

  5. Lavrov, Dennis (Ed.)
    Abstract The painted lady butterfly, Vanessa cardui, has the longest migration routes, the widest hostplant diversity, and one of the most complex wing patterns of any insect. Due to minimal culturing requirements, easily characterized wing pattern elements, and technical feasibility of CRISPR/Cas9 genome editing, V. cardui is emerging as a functional genomics model for diverse research programs. Here, we report a high-quality, annotated genome assembly of the V. cardui genome, generated using 84× coverage of PacBio long-read data, which we assembled into 205 contigs with a total length of 425.4 Mb (N50 = 10.3 Mb). The genome was very complete (single-copy complete Benchmarking Universal Single-Copy Orthologs [BUSCO] 97%), with contigs assembled into presumptive chromosomes using synteny analyses. Our annotation used embryonic, larval, and pupal transcriptomes, and 20 transcriptomes across five different wing developmental stages. Gene annotations showed a high level of accuracy and completeness, with 14,437 predicted protein-coding genes. This annotated genome assembly constitutes an important resource for diverse functional genomic studies ranging from the developmental genetic basis of butterfly color pattern, to coevolution with diverse hostplants.