Title: Ultracontinuous Single Haplotype Genome Assemblies for the Domestic Cat (Felis catus) and Asian Leopard Cat (Prionailurus bengalensis)
Abstract In addition to including one of the most popular companion animals, species from the cat family Felidae serve as a powerful system for genetic analysis of inherited and infectious disease, as well as for the study of phenotypic evolution and speciation. Previous diploid-based genome assemblies for the domestic cat have served as the primary reference for genomic studies within the cat family. However, these versions suffered from poor resolution of complex and highly repetitive regions, with substantial amounts of unplaced sequence that is polymorphic or copy number variable. We sequenced the genome of a female F1 Bengal hybrid cat, the offspring of a domestic cat (Felis catus) x Asian leopard cat (Prionailurus bengalensis) cross, with PacBio long sequence reads and used Illumina sequence reads from the parents to phase >99.9% of the reads into the 2 species’ haplotypes. De novo assembly of the phased reads produced highly continuous haploid genome assemblies for the domestic cat and Asian leopard cat, with contig N50 statistics exceeding 83 Mb for both genomes. Whole-genome alignments reveal the Felis and Prionailurus genomes are colinear, and the cytogenetic differences between the homologous F1 and E4 chromosomes represent a case of centromere repositioning in the absence of a chromosomal inversion. Both assemblies offer significant improvements over the previous domestic cat reference genome, with a 100% increase in contiguity and the capture of the vast majority of chromosome arms in 1 or 2 large contigs. We further demonstrated that comparably accurate F1 haplotype phasing can be achieved with members of the same species when one or both parents of the trio are not available. These novel genome resources will empower studies of feline precision medicine, adaptation, and speciation.
Shapiro, Beth
Journal Name:
Journal of Heredity
Page Range or eLocation-ID:
165 to 173
Sponsoring Org:
National Science Foundation
More Like this
  1. Gojobori, Jun (Ed.)
    Abstract The sterility or inviability of hybrid offspring produced from an interspecific mating results from incompatibilities between parental genotypes that are thought to result from divergence of loci involved in epistatic interactions. However, attributes contributing to the rapid evolution of these regions also complicate their assembly, thus discovery of candidate hybrid sterility loci is difficult and has been restricted to a small number of model systems. Here we reported rapid interspecific divergence at the DXZ4 macrosatellite locus in an interspecific cross between two closely related mammalian species: the domestic cat (Felis silvestris catus) and the Jungle cat (Felis chaus). DXZ4 is an interesting candidate due to its structural complexity, copy number variability, and described role in the critical yet complex biological process of X-chromosome inactivation. However, the full structure of DXZ4 was absent or incomplete in nearly every available mammalian genome assembly given its repetitive complexity. We compared highly continuous genomes for three cat species, each containing a complete DXZ4 locus, and discovered that the felid DXZ4 locus differs substantially from the human ortholog, and that it varies in copy number between cat species. Additionally, we reported expression, methylation, and structural conformation profiles of DXZ4 and the X chromosome during stages of spermatogenesis that have been previously associated with hybrid male sterility. Collectively, these findings suggest a new role for DXZ4 in male meiosis and a proposed mechanism of feline interspecific incompatibility through rapid satellite divergence.
  2. Koepfli, Klaus-Peter (Ed.)
    Abstract Bison are an icon of the American West and an ecologically, commercially, and culturally important species. Despite numbering in the hundreds of thousands today, conservation concerns remain for the species, including the impact on genetic diversity of a severe bottleneck around the turn of the 20th century and genetic introgression from domestic cattle. Genetic diversity and admixture are best evaluated at genome-wide scale, for which a high-quality reference is necessary. Here, we use trio binning of long reads from a bison–Simmental cattle (Bos taurus taurus) male F1 hybrid to sequence and assemble the genome of the American plains bison (Bison bison bison). The male haplotype genome is chromosome-scale, with a total length of 2.65 Gb across 775 scaffolds (839 contigs) and a scaffold N50 of 87.8 Mb. Our bison genome is ~13× more contiguous overall and ~3400× more contiguous at the contig level than the current bison reference genome. The bison genome sequence presented here (ARS-UCSC_bison1.0) will enable new research into the evolutionary history of this iconic megafauna species and provide a new tool for the management of bison populations in federal and commercial herds.
  3. Abstract Current phylogenomic approaches implicitly assume that the predominant phylogenetic signal within a genome reflects the true evolutionary history of organisms, without assessing the confounding effects of postspeciation gene flow that can produce a mosaic of phylogenetic signals that interact with recombinational variation. Here, we tested the validity of this assumption with a phylogenomic analysis of 27 species of the cat family, assessing local effects of recombination rate on species tree inference and divergence time estimation across their genomes. We found that the prevailing phylogenetic signal within the autosomes is not always representative of the most probable speciation history, due to ancient hybridization throughout felid evolution. Instead, phylogenetic signal was concentrated within regions of low recombination, and notably enriched within large X chromosome recombination cold spots that exhibited recurrent patterns of strong genetic differentiation and selective sweeps across mammalian orders. By contrast, regions of high recombination were enriched for signatures of ancient gene flow, and these sequences inflated crown-lineage divergence times by ∼40%. We conclude that existing phylogenomic approaches to infer the Tree of Life may be highly misleading without considering the genomic architecture of phylogenetic signal relative to recombination rate and its interplay with historical hybridization.
  4. The blue crab, Callinectes sapidus (Rathbun, 1896) is an economically, culturally, and ecologically important species found across the temperate and tropical North and South American Atlantic coast. A reference genome will enable research for this high-value species. Initial assembly combined 200× coverage Illumina paired-end reads, a 60× 8 kb mate-paired library, and 50× PacBio data using the MaSuRCA assembler resulting in a 985 Mb assembly with a scaffold N50 of 153 kb. Dovetail Chicago and HiC sequencing with the 3d DNA assembler and Juicebox assembly tools were then used for chromosome scaffolding. The 50 largest scaffolds span 810 Mb, are 1.5–37 Mb long, and have a repeat content of 36%. The 190 Mb unplaced sequence is in 3921 sequences over 10 kb with a repeat content of 68%. The final assembly N50 is 18.9 Mb for scaffolds and 9317 bases for contigs. Of arthropod BUSCO, ∼88% (888/1013) were complete and single copies. Using 309 million RNAseq read pairs from 12 different tissues and developmental stages, 25,249 protein-coding genes were predicted. Between C. sapidus and Portunus trituberculatus genomes, 41 of 50 large scaffolds had high nucleotide identity and protein-coding synteny, but 9 scaffolds in both assemblies were not clear matches. The protein-coding genes included 9423 one-to-one putative orthologs, of which 7165 were syntenic between the two crab species. Overall, the two crab genome assemblies show strong similarities at the nucleotide, protein, and chromosome level and verify the blue crab genome as an excellent reference for this important seafood species.
  5. Slotte, Tanja (Ed.)
    Abstract Intracellular transfers of mitochondrial DNA continue to shape nuclear genomes. Chromosome 2 of the model plant Arabidopsis thaliana contains one of the largest known nuclear insertions of mitochondrial DNA (numts). Estimated at over 600 kb in size, this numt is larger than the entire Arabidopsis mitochondrial genome. The primary Arabidopsis nuclear reference genome contains less than half of the numt because of its structural complexity and repetitiveness. Recent data sets generated with improved long-read sequencing technologies (PacBio HiFi) provide an opportunity to finally determine the accurate sequence and structure of this numt. We performed a de novo assembly using sequencing data from recent initiatives to span the Arabidopsis centromeres, producing a gap-free sequence of the Chromosome 2 numt, which is 641 kb in length and has 99.933% nucleotide sequence identity with the actual mitochondrial genome. The numt assembly is consistent with the repetitive structure previously predicted from fiber-based fluorescent in situ hybridization. Nanopore sequencing data indicate that the numt has high levels of cytosine methylation, helping to explain its biased spectrum of nucleotide sequence divergence and supporting previous inferences that it is transcriptionally inactive. The original numt insertion appears to have involved multiple mitochondrial DNA copies with alternative structures that subsequently underwent an additional duplication event within the nuclear genome. This work provides insights into numt evolution, addresses one of the last unresolved regions of the Arabidopsis reference genome, and represents a resource for distinguishing between highly similar numt and mitochondrial sequences in studies of transcription, epigenetic modifications, and de novo mutations.