skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A Chromosome-Length Reference Genome for the Endangered Pacific Pocket Mouse Reveals Recent Inbreeding in a Historically Large Population
Abstract High-quality reference genomes are fundamental tools for understanding population history, and can provide estimates of genetic and demographic parameters relevant to the conservation of biodiversity. The federally endangered Pacific pocket mouse (PPM), which persists in three small, isolated populations in southern California, is a promising model for studying how demographic history shapes genetic diversity, and how diversity in turn may influence extinction risk. To facilitate these studies in PPM, we combined PacBio HiFi long reads with Omni-C and Hi-C data to generate a de novo genome assembly, and annotated the genome using RNAseq. The assembly comprised 28 chromosome-length scaffolds (N50 = 72.6 MB) and the complete mitochondrial genome, and included a long heterochromatic region on chromosome 18 not represented in the previously available short-read assembly. Heterozygosity was highly variable across the genome of the reference individual, with 18% of windows falling in runs of homozygosity (ROH) >1 MB, and nearly 9% in tracts spanning >5 MB. Yet outside of ROH, heterozygosity was relatively high (0.0027), and historical Ne estimates were large. These patterns of genetic variation suggest recent inbreeding in a formerly large population. Currently the most contiguous assembly for a heteromyid rodent, this reference genome provides insight into the past and recent demographic history of the population, and will be a critical tool for management and future studies of outbreeding depression, inbreeding depression, and genetic load.  more » « less
Award ID(s):
2019745
PAR ID:
10417188
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ; ; ; ; ;
Editor(s):
Enard, David
Date Published:
Journal Name:
Genome Biology and Evolution
Volume:
14
Issue:
8
ISSN:
1759-6653
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Lohmueller, Kirk (Ed.)
    Abstract The levels and distribution of standing genetic variation in a genome can provide a wealth of insights about the adaptive potential, demographic history, and genome structure of a population or species. As structural variants are increasingly associated with traits important for adaptation and speciation, investigating both sequence and structural variation is essential for wholly tapping this potential. Using a combination of shotgun sequencing, 10x Genomics linked reads and proximity-ligation data (Chicago and Hi-C), we produced and annotated a chromosome-level genome assembly for the Atlantic silverside (Menidia menidia)—an established ecological model for studying the phenotypic effects of natural and artificial selection—and examined patterns of genomic variation across two individuals sampled from different populations with divergent local adaptations. Levels of diversity varied substantially across each chromosome, consistently being highly elevated near the ends (presumably near telomeric regions) and dipping to near zero around putative centromeres. Overall, our estimate of the genome-wide average heterozygosity in the Atlantic silverside is among the highest reported for a fish, or any vertebrate (1.32–1.76% depending on inference method and sample). Furthermore, we also found extreme levels of structural variation, affecting ∼23% of the total genome sequence, including multiple large inversions (> 1 Mb and up to 12.6 Mb) associated with previously identified haploblocks showing strong differentiation between locally adapted populations. These extreme levels of standing genetic variation are likely associated with large effective population sizes and may help explain the remarkable adaptive divergence among populations of the Atlantic silverside. 
    more » « less
  2. Abstract Hares (genus Lepus) provide clear examples of repeated and often massive introgressive hybridization and striking local adaptations. Genomic studies on this group have so far relied on comparisons to the European rabbit (Oryctolagus cuniculus) reference genome. Here, we report the first de novo draft reference genome for a hare species, the mountain hare (Lepus timidus), and evaluate the efficacy of whole-genome re-sequencing analyses using the new reference versus using the rabbit reference genome. The genome was assembled using the ALLPATHS-LG protocol with a combination of overlapping pair and mate-pair Illumina sequencing (77x coverage). The assembly contained 32,294 scaffolds with a total length of 2.7 Gb and a scaffold N50 of 3.4 Mb. Re-scaffolding based on the rabbit reference reduced the total number of scaffolds to 4,205 with a scaffold N50 of 194 Mb. A correspondence was found between 22 of these hare scaffolds and the rabbit chromosomes, based on gene content and direct alignment. We annotated 24,578 protein coding genes by combining ab-initio predictions, homology search, and transcriptome data, of which 683 were solely derived from hare-specific transcriptome data. The hare reference genome is therefore a new resource to discover and investigate hare-specific variation. Similar estimates of heterozygosity and inferred demographic history profiles were obtained when mapping hare whole-genome re-sequencing data to the new hare draft genome or to alternative references based on the rabbit genome. Our results validate previous reference-based strategies and suggest that the chromosome-scale hare draft genome should enable chromosome-wide analyses and genome scans on hares. 
    more » « less
  3. Abstract Flow cytometry estimates of genome sizes among species of Drosophila show a 3-fold variation, ranging from ∼127 Mb in Drosophila mercatorum to ∼400 Mb in Drosophila cyrtoloma. However, the assembled portion of the Muller F element (orthologous to the fourth chromosome in Drosophila melanogaster) shows a nearly 14-fold variation in size, ranging from ∼1.3 Mb to >18 Mb. Here, we present chromosome-level long-read genome assemblies for 4 Drosophila species with expanded F elements ranging in size from 2.3 to 20.5 Mb. Each Muller element is present as a single scaffold in each assembly. These assemblies will enable new insights into the evolutionary causes and consequences of chromosome size expansion. 
    more » « less
  4. Genome assembly can be challenging for species that are characterized by high amounts of polymorphism, heterozygosity, and large effective population sizes. High levels of heterozygosity can result in genome mis-assemblies and a larger than expected genome size due to the haplotig versions of a single locus being assembled as separate loci. Here, we describe the first chromosome-level genome for the eastern oyster, Crassostrea virginica. Publicly released and annotated in 2017, the assembly has a scaffold N50 of 54 mb and is over 97.3% complete based on BUSCO analysis. The genome assembly for the eastern oyster is a critical resource for foundational research into molluscan adaptation to a changing environment and for selective breeding for the aquaculture industry. Subsequent resequencing data suggested the presence of haplotigs in the original assembly, and we developed a post hoc method to break up chimeric contigs and mask haplotigs in published heterozygous genomes and evaluated improvements to the accuracy of downstream analysis. Masking haplotigs had a large impact on SNP discovery and estimates of nucleotide diversity and had more subtle and nuanced effects on estimates of heterozygosity, population structure analysis, and outlier detection. We show that haplotig masking can be a powerful tool for improving genomic inference, and we present an open, reproducible resource for the masking of haplotigs in any published genome. 
    more » « less
  5. Abstract Understanding the genetic and fitness consequences of anthropogenic bottlenecks is crucial for biodiversity conservation. However, studies of bottlenecked populations combining genomic approaches with fitness data are rare. Theory predicts that severe bottlenecks deplete genetic diversity, exacerbate inbreeding depression and decrease population viability. However, actual outcomes are complex and depend on how a species’ unique demography affects its genetic load. We used population genetic and veterinary pathology data, demographic modelling, whole-genome resequencing and forward genetic simulations to investigate the genomic and fitness consequences of a near-extinction event in the northern elephant seal. We found no evidence of inbreeding depression within the contemporary population for key fitness components, including body mass, blubber thickness and susceptibility to parasites and disease. However, we detected a genomic signature of a recent extreme bottleneck (effective population size = 6; 95% confidence interval = 5.0–7.5) that will have purged much of the genetic load, potentially leading to the lack of observed inbreeding depression in our study. Our results further suggest that deleterious genetic variation strongly impacted the post-bottleneck population dynamics of the northern elephant seal. Our study provides comprehensive empirical insights into the intricate dynamics underlying species-specific responses to anthropogenic bottlenecks. 
    more » « less