skip to main content

Title: A chromosome-level genome assembly and annotation of the desert horned lizard, Phrynosoma platyrhinos , provides insight into chromosomal rearrangements among reptiles
Abstract Background

The increasing number of chromosome-level genome assemblies has advanced our knowledge and understanding of macroevolutionary processes. Here, we introduce the genome of the desert horned lizard, Phrynosoma platyrhinos, an iguanid lizard occupying extreme desert conditions of the American southwest. We conduct analysis of the chromosomal structure and composition of this species and compare these features across genomes of 12 other reptiles (5 species of lizards, 3 snakes, 3 turtles, and 1 bird).


The desert horned lizard genome was sequenced using Illumina paired-end reads and assembled and scaffolded using Dovetail Genomics Hi-C and Chicago long-range contact data. The resulting genome assembly has a total length of 1,901.85 Mb, scaffold N50 length of 273.213 Mb, and includes 5,294 scaffolds. The chromosome-level assembly is composed of 6 macrochromosomes and 11 microchromosomes. A total of 20,764 genes were annotated in the assembly. GC content and gene density are higher for microchromosomes than macrochromosomes, while repeat element distributions show the opposite trend. Pathway analyses provide preliminary evidence that microchromosome and macrochromosome gene content are functionally distinct. Synteny analysis indicates that large microchromosome blocks are conserved among closely related species, whereas macrochromosomes show evidence of frequent fusion and fission events among reptiles, even between closely more » related species.


Our results demonstrate dynamic karyotypic evolution across Reptilia, with frequent inferred splits, fusions, and rearrangements that have resulted in shuffling of chromosomal blocks between macrochromosomes and microchromosomes. Our analyses also provide new evidence for distinct gene content and chromosomal structure between microchromosomes and macrochromosomes within reptiles.

« less
 ;  ;  ;  ;  ;  ;  
Award ID(s):
Publication Date:
Journal Name:
Oxford University Press
Sponsoring Org:
National Science Foundation
More Like this
  1. Sethuraman, A (Ed.)
    Abstract Spiny lizards in the genus Sceloporus are a model system among squamate reptiles for studies of chromosomal evolution. While most pleurodont iguanians retain an ancestral karyotype formula of 2n = 36 chromosomes, Sceloporus exhibits substantial karyotype variation ranging from 2n =  22 to 46 chromosomes. We present two annotated chromosome-scale genome assemblies for the Plateau Fence Lizard (Sceloporus tristichus) to facilitate research on the role of pericentric inversion polymorphisms on adaptation and speciation. Based on previous karyotype work using conventional staining, the S. tristichus genome is characterized as 2n =  22 with six pairs of macrochromosomes and five pairs of microchromosomes and a pericentric inversion polymorphism on chromosome 7 that is geographically variable. We provide annotated, chromosome-scale genomes for two lizards located at opposite ends of a dynamic hybrid zone that are each fixed for different inversion polymorphisms. The assembled genomes are 1.84–1.87 Gb (1.72 Gb for scaffolds mapping to chromosomes) with a scaffold N50 of 267.5 Mb. Functional annotation of the genomes resulted in ∼15K predicted gene models. Our assemblies confirmed the presence of a 4.62-Mb pericentric inversion on chromosome 7, which contains 62 annotated coding genes with known functions. In addition, we collected population genomics data using double digest RAD-sequencing for 44 S. tristichus to estimatemore »population structure and phylogeny across the Colorado Plateau. These new genomic resources provide opportunities to perform genomic scans and investigate the formation and spread of pericentric inversions in a naturally occurring hybrid zone.« less
  2. Abstract Lytechinus variegatus is a camarodont sea urchin found widely throughout the western Atlantic Ocean in a variety of shallow-water marine habitats. Its distribution, abundance, and amenability to developmental perturbation make it a popular model for ecologists and developmental biologists. Here, we present a chromosomal-level genome assembly of L. variegatus generated from a combination of PacBio long reads, 10× Genomics sequencing, and HiC chromatin interaction sequencing. We show L. variegatus has 19 chromosomes with an assembly size of 870.4 Mb. The contiguity and completeness of this assembly are reflected by a scaffold length N50 of 45.5 Mb and BUSCO completeness score of 95.5%. Ab initio and transcript-informed gene modeling and annotation identified 27,232 genes with an average gene length of 12.6 kb, comprising an estimated 39.5% of the genome. Repetitive regions, on the other hand, make up 45.4% of the genome. Physical mapping of well-studied developmental genes onto each chromosome reveals nonrandom spatial distribution of distinct genes and gene families, which provides insight into how certain gene families may have evolved and are transcriptionally regulated in this species. Lastly, aligning RNA-seq and ATAC-seq data onto this assembly demonstrates the value of highly contiguous, complete genome assemblies for functional genomics analyses that is unattainable with fragmented,more »incomplete assemblies. This genome will be of great value to the scientific community as a resource for genome evolution, developmental, and ecological studies of this species and the Echinodermata.« less
  3. Abstract Background

    The blue catfish is of great value in aquaculture and recreational fisheries. The F1 hybrids of female channel catfish (Ictalurus punctatus) × male blue catfish (Ictalurusfurcatus) have been the primary driver of US catfish production in recent years because of superior growth, survival, and carcass yield. The channel–blue hybrid also provides an excellent model to investigate molecular mechanisms of environment-dependent heterosis. However, transcriptome and methylome studies suffered from low alignment rates to the channel catfish genome due to divergence, and the genome resources for blue catfish are not publicly available.


    The blue catfish genome assembly is 841.86 Mbp in length with excellent continuity (8.6 Mbp contig N50, 28.2 Mbp scaffold N50) and completeness (98.6% Eukaryota and 97.0% Actinopterygii BUSCO). A total of 30,971 protein-coding genes were predicted, of which 21,781 were supported by RNA sequencing evidence. Phylogenomic analyses revealed that it diverged from channel catfish approximately 9 million years ago with 15.7 million fixed nucleotide differences. The within-species single-nucleotide polymorphism (SNP) density is 0.32% between the most aquaculturally important blue catfish strains (D&B and Rio Grande). Gene family analysis discovered significant expansion of immune-related families in the blue catfish lineage, which may contribute to disease resistance in blue catfish.

    more »Conclusions

    We reported the first high-quality, chromosome-level assembly of the blue catfish genome, which provides the necessary genomic tool kit for transcriptome and methylome analysis, SNP discovery and marker-assisted selection, gene editing and genome engineering, and reproductive enhancement of the blue catfish and hybrid catfish.

    « less
  4. Abstract Background

    The Aldabra giant tortoise (Aldabrachelys gigantea) is one of only two giant tortoise species left in the world. The species is endemic to Aldabra Atoll in Seychelles and is listed as Vulnerable on the International Union for Conservation of Nature Red List (v2.3) due to its limited distribution and threats posed by climate change. Genomic resources for A. gigantea are lacking, hampering conservation efforts for both wild and ex situpopulations. A high-quality genome would also open avenues to investigate the genetic basis of the species’ exceptionally long life span.


    We produced the first chromosome-level de novo genome assembly of A. gigantea using PacBio High-Fidelity sequencing and high-throughput chromosome conformation capture. We produced a 2.37-Gbp assembly with a scaffold N50 of 148.6 Mbp and a resolution into 26 chromosomes. RNA sequencing–assisted gene model prediction identified 23,953 protein-coding genes and 1.1 Gbp of repetitive sequences. Synteny analyses among turtle genomes revealed high levels of chromosomal collinearity even among distantly related taxa. To assess the utility of the high-quality assembly for species conservation, we performed a low-coverage resequencing of 30 individuals from wild populations and two zoo individuals. Our genome-wide population structure analyses detected genetic population structure in the wild and identifiedmore »the most likely origin of the zoo-housed individuals. We further identified putatively deleterious mutations to be monitored.


    We establish a high-quality chromosome-level reference genome for A. gigantea and one of the most complete turtle genomes available. We show that low-coverage whole-genome resequencing, for which alignment to the reference genome is a necessity, is a powerful tool to assess the population structure of the wild population and reveal the geographic origins of ex situ individuals relevant for genetic diversity management and rewilding efforts.

    « less
  5. Abstract

    Sequencing, assembly, and annotation of the 26.5 Gbp hexaploid genome of coast redwood (Sequoia sempervirens) was completed leading toward discovery of genes related to climate adaptation and investigation of the origin of the hexaploid genome. Deep-coverage short-read Illumina sequencing data from haploid tissue from a single seed were combined with long-read Oxford Nanopore Technologies sequencing data from diploid needle tissue to create an initial assembly, which was then scaffolded using proximity ligation data to produce a highly contiguous final assembly, SESE 2.1, with a scaffold N50 size of 44.9 Mbp. The assembly included several scaffolds that span entire chromosome arms, confirmed by the presence of telomere and centromere sequences on the ends of the scaffolds. The structural annotation produced 118,906 genes with 113 containing introns that exceed 500 Kbp in length and one reaching 2 Mb. Nearly 19 Gbp of the genome represented repetitive content with the vast majority characterized as long terminal repeats, with a 2.9:1 ratio of Copia to Gypsy elements that may aid in gene expression control. Comparison of coast redwood to other conifers revealed species-specific expansions for a plethora of abiotic and biotic stress response genes, including those involved in fungal disease resistance, detoxification, and physical injury/structural remodeling and othersmore »supporting flavonoid biosynthesis. Analysis of multiple genes that exist in triplicate in coast redwood but only once in its diploid relative, giant sequoia, supports a previous hypothesis that the hexaploidy is the result of autopolyploidy rather than any hybridizations with separate but closely related conifer species.

    « less