Title: Toward telomere to telomere cat genomes for precision medicine and conservation biology
Genomic data from species of the cat family Felidae promise to stimulate veterinary and human medical advances, and clarify the coherence of genome organization. We describe how interspecies hybrids have been instrumental in the genetic analysis of cats, from the first genetic maps to propelling cat genomes toward the T2T standard set by the human genome project. Genotype-to-phenotype mapping in cat models has revealed dozens of health-related genetic variants, the molecular basis for mammalian pigmentation and patterning, and species-specific adaptations. Improved genomic surveillance of natural and captive populations across the cat family tree will increase our understanding of the genetic architecture of traits, population dynamics, and guide a future of genome-enabled biodiversity conservation.
Bredemeyer, Kevin R; Harris, Andrew J; Li, Gang; Zhao, Le; Foley, Nicole M; Roelke-Parker, Melody; O’Brien, Stephen J; Lyons, Leslie A; Warren, Wesley C; Murphy, William J
(Journal of Heredity)
Shapiro, Beth (Ed.)
Abstract In addition to including one of the most popular companion animals, species from the cat family Felidae serve as a powerful system for genetic analysis of inherited and infectious disease, as well as for the study of phenotypic evolution and speciation. Previous diploid-based genome assemblies for the domestic cat have served as the primary reference for genomic studies within the cat family. However, these versions suffered from poor resolution of complex and highly repetitive regions, with substantial amounts of unplaced sequence that is polymorphic or copy number variable. We sequenced the genome of a female F1 Bengal hybrid cat, the offspring of a domestic cat (Felis catus) x Asian leopard cat (Prionailurus bengalensis) cross, with PacBio long sequence reads and used Illumina sequence reads from the parents to phase >99.9% of the reads into the 2 species’ haplotypes. De novo assembly of the phased reads produced highly continuous haploid genome assemblies for the domestic cat and Asian leopard cat, with contig N50 statistics exceeding 83 Mb for both genomes. Whole-genome alignments reveal the Felis and Prionailurus genomes are colinear, and the cytogenetic differences between the homologous F1 and E4 chromosomes represent a case of centromere repositioning in the absence of a chromosomal inversion. Both assemblies offer significant improvements over the previous domestic cat reference genome, with a 100% increase in contiguity and the capture of the vast majority of chromosome arms in 1 or 2 large contigs. We further demonstrated that comparably accurate F1 haplotype phasing can be achieved with members of the same species when one or both parents of the trio are not available. These novel genome resources will empower studies of feline precision medicine, adaptation, and speciation.
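The contig N50 quoted in the abstract above is the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly length. The following is a minimal sketch of that calculation only; the function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not real cat genome data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

In the toy example the total length is 300, so the threshold is 150; the two longest contigs (80 + 70) reach it, which makes the N50 equal to 70.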
Hoyt, Savannah J.; Storer, Jessica M.; Hartley, Gabrielle A.; Grady, Patrick G.; Gershman, Ariel; de Lima, Leonardo G.; Limouse, Charles; Halabian, Reza; Wojenski, Luke; Rodriguez, Matias; et al
(Science)
INTRODUCTION Transposable elements (TEs), repeat expansions, and repeat-mediated structural rearrangements play key roles in chromosome structure and species evolution, contribute to human genetic variation, and substantially influence human health through copy number variants, structural variants, insertions, deletions, and alterations to gene transcription and splicing. Despite their formative role in genome stability, repetitive regions have been relegated to gaps and collapsed regions in human genome reference GRCh38 owing to the technological limitations during its development. The lack of linear sequence in these regions, particularly in centromeres, resulted in the inability to fully explore the repeat content of the human genome in the context of both local and regional chromosomal environments. RATIONALE Long-read sequencing supported the complete, telomere-to-telomere (T2T) assembly of the pseudo-haploid human cell line CHM13. This resource affords a genome-scale assessment of all human repetitive sequences, including TEs and previously unknown repeats and satellites, both within and outside of gaps and collapsed regions. Additionally, a complete genome enables the opportunity to explore the epigenetic and transcriptional profiles of these elements that are fundamental to our understanding of chromosome structure, function, and evolution. Comparative analyses reveal modes of repeat divergence, evolution, and expansion or contraction with locus-level resolution. RESULTS We implemented a comprehensive repeat annotation workflow using previously known human repeats and de novo repeat modeling followed by manual curation, including assessing overlaps with gene annotations, segmental duplications, tandem repeats, and annotated repeats. Using this method, we developed an updated catalog of human repetitive sequences and refined previous repeat annotations. 
We discovered 43 previously unknown repeats and repeat variants and characterized 19 complex, composite repetitive structures, which often carry genes, across T2T-CHM13. Using precision nuclear run-on sequencing (PRO-seq) and CpG methylated sites generated from Oxford Nanopore Technologies long-read sequencing data, we assessed RNA polymerase engagement across retroelements genome-wide, revealing correlations between nascent transcription, sequence divergence, CpG density, and methylation. These analyses were extended to evaluate RNA polymerase occupancy for all repeats, including high-density satellite repeats that reside in previously inaccessible centromeric regions of all human chromosomes. Moreover, using both mapping-dependent and mapping-independent approaches across early developmental stages and a complete cell cycle time series, we found that engaged RNA polymerase across satellites is low; in contrast, TE transcription is abundant and serves as a boundary for changes in CpG methylation and centromere substructure. Together, these data reveal the dynamic relationship between transcriptionally active retroelement subclasses and DNA methylation, as well as potential mechanisms for the derivation and evolution of new repeat families and composite elements. Focusing on the emerging T2T-level assembly of the HG002 X chromosome, we reveal that a high level of repeat variation likely exists across the human population, including composite element copy numbers that affect gene copy number. Additionally, we highlight the impact of repeats on the structural diversity of the genome, revealing repeat expansions with extreme copy number differences between humans and primates while also providing high-confidence annotations of retroelement transduction events. 
CONCLUSION The comprehensive repeat annotations and updated repeat models described herein serve as a resource for expanding the compendium of human genome sequences and reveal the impact of specific repeats on the human genome. In developing this resource, we provide a methodological framework for assessing repeat variation within and between human genomes. The exhaustive assessment of the transcriptional landscape of repeats, at both the genome scale and locally, such as within centromeres, sets the stage for functional studies to disentangle the role transcription plays in the mechanisms essential for genome stability and chromosome segregation. Finally, our work demonstrates the need to increase efforts toward achieving T2T-level assemblies for nonhuman primates and other species to fully understand the complexity and impact of repeat-derived genomic innovations that define primate lineages, including humans.
Telomere-to-telomere assembly of CHM13 supports repeat annotations and discoveries. The human reference T2T-CHM13 filled gaps and corrected collapsed regions (triangles) in GRCh38. Combining long read–based methylation calls, PRO-seq, and multilevel computational methods, we provide a compendium of human repeats, define retroelement expression and methylation profiles, and delineate locus-specific sites of nascent transcription genome-wide, including previously inaccessible centromeres. SINE, short interspersed element; SVA, SINE–variable number tandem repeat–Alu; LINE, long interspersed element; LTR, long terminal repeat; TSS, transcription start site; pA, xxxxxxxxxxxxxxxx.
Abstract The spatiotemporal organization of DNA replication produces a highly robust and reproducible replication timing profile. Sequencing-based methods for assaying replication timing genome-wide have become commonplace, but regions of high repeat content in the human genome have remained refractory to analysis. Here, we report the first nearly-gapless telomere-to-telomere replication timing profiles in human, using the T2T-CHM13 genome assembly and sequencing data for five cell lines. We find that replication timing can be successfully assayed in centromeres and large blocks of heterochromatin. Centromeric regions replicate in mid-to-late S-phase and contain replication-timing peaks at a similar density to other genomic regions, while distinct families of heterochromatic satellite DNA differ in their bias for replicating in late S-phase. The high degree of consistency in centromeric replication timing across chromosomes within each cell line prompts further investigation into the mechanisms dictating that some cell lines replicate their centromeres earlier than others, and what the consequences of this variation are.
Yuan, Jiaqing; Kitchener, Andrew C.; Lackey, Laurie Bingaman; Sun, Ting; Jiangzuo, Qigao; Tuohetahong, Yilamujiang; Zhao, Le; Yang, Peng; Wang, Guiqiang; Huang, Chen; et al
(Proceedings of the National Academy of Sciences)
Habitat degradation and loss of genetic diversity are common threats faced by almost all of today’s wild cats. Big cats, such as tigers and lions, are of great concern and have received considerable conservation attention through policies and international actions. However, knowledge of and conservation actions for small wild cats are lagging considerably behind. The black-footed cat, Felis nigripes, one of the smallest felid species, is experiencing increasing threats with a rapid reduction in population size. However, there is a lack of genetic information to assist in developing effective conservation actions. A de novo assembly of a high-quality chromosome-level reference genome of the black-footed cat was made, and comparative genomics and population genomics analyses were carried out. These analyses revealed that the most significant genetic changes in the evolution of the black-footed cat are the rapid evolution of sensory and metabolic-related genes, reflecting genetic adaptations to its characteristic nocturnal hunting and a high metabolic rate. Genomes of the black-footed cat exhibit a high level of inbreeding, especially for signals of recent inbreeding events, which suggest that they may have experienced severe genetic isolation caused by habitat fragmentation. More importantly, inbreeding associated with two deleterious mutated genes may exacerbate the risk of amyloidosis, the dominant disease that causes mortality of about 70% of captive individuals. Our research provides comprehensive documentation of the evolutionary history of the black-footed cat and suggests that there is an urgent need to investigate genomic variations of small felids worldwide to support effective conservation actions.
Lescroart, Jonas; Bonilla-Sánchez, Alejandra; Napolitano, Constanza; Buitrago-Torres, Diana L.; Ramírez-Chaves, Héctor E.; Pulido-Santacruz, Paola; Murphy, William J.; Svardal, Hannes; Eizirik, Eduardo
(Molecular Biology and Evolution)
O'Connell, Mary (Ed.)
Abstract Even in the genomics era, the phylogeny of Neotropical small felids comprised in the genus Leopardus remains contentious. We used whole-genome resequencing data to construct a time-calibrated consensus phylogeny of this group, quantify phylogenomic discordance, test for interspecies introgression, and assess patterns of genetic diversity and demographic history. We infer that the Leopardus radiation started in the Early Pliocene as an initial speciation burst, followed by another in its subgenus Oncifelis during the Early Pleistocene. Our findings challenge the long-held notion that ocelot (Leopardus pardalis) and margay (L. wiedii) are sister species and instead indicate that margay is most closely related to the enigmatic Andean cat (L. jacobita), whose whole-genome data are reported here for the first time. In addition, we found that the newly sampled Andean tiger cat (L. tigrinus pardinoides) population from Colombia associates closely with Central American tiger cats (L. tigrinus oncilla). Genealogical discordance was largely attributable to incomplete lineage sorting, yet was augmented by strong gene flow between ocelot and the ancestral branch of Oncifelis, as well as between Geoffroy's cat (L. geoffroyi) and southern tiger cat (L. guttulus). Contrasting demographic trajectories have led to disparate levels of current genomic diversity, with a nearly tenfold difference in heterozygosity between Andean cat and ocelot, spanning the entire range of variability found in extant felids. Our analyses improved our understanding of the speciation history and diversity patterns in this felid radiation, and highlight the benefits to phylogenomic inference of embracing the many heterogeneous signals scattered across the genome.
Murphy, William J, and Harris, Andrew J. Toward telomere to telomere cat genomes for precision medicine and conservation biology. Retrieved from https://par.nsf.gov/biblio/10518424. Genome Research 34.5 Web. doi:10.1101/gr.278546.123.
Murphy, William J, & Harris, Andrew J. Toward telomere to telomere cat genomes for precision medicine and conservation biology. Genome Research, 34 (5). Retrieved from https://par.nsf.gov/biblio/10518424. https://doi.org/10.1101/gr.278546.123
Murphy, William J, and Harris, Andrew J. "Toward telomere to telomere cat genomes for precision medicine and conservation biology". Genome Research 34 (5). Country unknown/Code not available: Cold Spring Harbor Laboratory Press. https://doi.org/10.1101/gr.278546.123. https://par.nsf.gov/biblio/10518424.
@article{osti_10518424,
place = {Country unknown/Code not available},
title = {Toward telomere to telomere cat genomes for precision medicine and conservation biology},
url = {https://par.nsf.gov/biblio/10518424},
DOI = {10.1101/gr.278546.123},
abstractNote = {Genomic data from species of the cat family Felidae promise to stimulate veterinary and human medical advances, and clarify the coherence of genome organization. We describe how interspecies hybrids have been instrumental in the genetic analysis of cats, from the first genetic maps to propelling cat genomes toward the T2T standard set by the human genome project. Genotype-to-phenotype mapping in cat models has revealed dozens of health-related genetic variants, the molecular basis for mammalian pigmentation and patterning, and species-specific adaptations. Improved genomic surveillance of natural and captive populations across the cat family tree will increase our understanding of the genetic architecture of traits, population dynamics, and guide a future of genome-enabled biodiversity conservation.},
journal = {Genome Research},
volume = {34},
number = {5},
publisher = {Cold Spring Harbor Laboratory Press},
author = {Murphy, William J and Harris, Andrew J},
}