skip to main content

Title: Chromosome-level genome of the wood stork ( Mycteria americana ) provides insight into avian chromosome evolution

Despite being quite specious (~10,000 extant species), birds have a fairly uniform genome size and karyotype (including the common occurrence of microchromosomes) relative to other vertebrate lineages. Storks (Family Ciconiidae) are a charismatic and distinct group of large wading birds with nearly worldwide distribution but few genomic resources. Here we present an annotated chromosome-level reference genome and chromosome orthology analysis for the wood stork (Mycteria americana), a species that has been federally protected under the Endangered Species Act since 1984. The annotated chromosome-level reference assembly was produced using the blood of a wild female wood stork chick, has a length of 1.35 Gb, a contig N50 of 37 Mb, a scaffold N50 of 80 Mb, and a BUSCO score of 98.8%. We identified 31 autosomal pairs and two sex chromosomes in the wood stork genome, but failed to identify four additional autosomal microchromosomes previously found via karyotyping. Orthology analyses confirmed reported synapomorphies unique to storks and identified the chromosomes participating in these fusions. This study highlights the difficulty and potential problems associated with delineating microchromosomes in reference genome assemblies. It also provides a foundation for studying karyotype evolution in the core water bird clade that includes penguins, albatrosses, storks, cormorants, herons, and ibises. Finally, our reference genome will allow for numerous genomic studies, such as genome-wide association studies of local adaptation, that will aid in wood stork conservation.

more » « less
Award ID(s):
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of Heredity
Medium: X Size: p. 230-239
["p. 230-239"]
Sponsoring Org:
National Science Foundation
More Like this
  1. Sethuraman, A (Ed.)
    Abstract Spiny lizards in the genus Sceloporus are a model system among squamate reptiles for studies of chromosomal evolution. While most pleurodont iguanians retain an ancestral karyotype formula of 2n = 36 chromosomes, Sceloporus exhibits substantial karyotype variation ranging from 2n =  22 to 46 chromosomes. We present two annotated chromosome-scale genome assemblies for the Plateau Fence Lizard (Sceloporus tristichus) to facilitate research on the role of pericentric inversion polymorphisms on adaptation and speciation. Based on previous karyotype work using conventional staining, the S. tristichus genome is characterized as 2n =  22 with six pairs of macrochromosomes and five pairs of microchromosomes and a pericentric inversion polymorphism on chromosome 7 that is geographically variable. We provide annotated, chromosome-scale genomes for two lizards located at opposite ends of a dynamic hybrid zone that are each fixed for different inversion polymorphisms. The assembled genomes are 1.84–1.87 Gb (1.72 Gb for scaffolds mapping to chromosomes) with a scaffold N50 of 267.5 Mb. Functional annotation of the genomes resulted in ∼15K predicted gene models. Our assemblies confirmed the presence of a 4.62-Mb pericentric inversion on chromosome 7, which contains 62 annotated coding genes with known functions. In addition, we collected population genomics data using double digest RAD-sequencing for 44 S. tristichus to estimate population structure and phylogeny across the Colorado Plateau. These new genomic resources provide opportunities to perform genomic scans and investigate the formation and spread of pericentric inversions in a naturally occurring hybrid zone. 
    more » « less
  2. Abstract Background

    The increasing number of chromosome-level genome assemblies has advanced our knowledge and understanding of macroevolutionary processes. Here, we introduce the genome of the desert horned lizard, Phrynosoma platyrhinos, an iguanid lizard occupying extreme desert conditions of the American southwest. We conduct analysis of the chromosomal structure and composition of this species and compare these features across genomes of 12 other reptiles (5 species of lizards, 3 snakes, 3 turtles, and 1 bird).


    The desert horned lizard genome was sequenced using Illumina paired-end reads and assembled and scaffolded using Dovetail Genomics Hi-C and Chicago long-range contact data. The resulting genome assembly has a total length of 1,901.85 Mb, scaffold N50 length of 273.213 Mb, and includes 5,294 scaffolds. The chromosome-level assembly is composed of 6 macrochromosomes and 11 microchromosomes. A total of 20,764 genes were annotated in the assembly. GC content and gene density are higher for microchromosomes than macrochromosomes, while repeat element distributions show the opposite trend. Pathway analyses provide preliminary evidence that microchromosome and macrochromosome gene content are functionally distinct. Synteny analysis indicates that large microchromosome blocks are conserved among closely related species, whereas macrochromosomes show evidence of frequent fusion and fission events among reptiles, even between closely related species.


    Our results demonstrate dynamic karyotypic evolution across Reptilia, with frequent inferred splits, fusions, and rearrangements that have resulted in shuffling of chromosomal blocks between macrochromosomes and microchromosomes. Our analyses also provide new evidence for distinct gene content and chromosomal structure between microchromosomes and macrochromosomes within reptiles.

    more » « less
  3. O’Neill, Rachel (Ed.)
    Abstract Echinometra is the most widespread genus of sea urchin and has been the focus of a wide range of studies in ecology, speciation, and reproduction. However, available genetic data for this genus are generally limited to a few select loci. Here, we present a chromosome-level genome assembly based on 10x Genomics, PacBio, and Hi-C sequencing for Echinometra sp. EZ from the Persian/Arabian Gulf. The genome is assembled into 210 scaffolds totaling 817.8 Mb with an N50 of 39.5 Mb. From this assembly, we determined that the E. sp. EZ genome consists of 2n = 42 chromosomes. BUSCO analysis showed that 95.3% of BUSCO genes were complete. Ab initio and transcript-informed gene modeling and annotation identified 29,405 genes, including a conserved Hox cluster. E. sp. EZ can be found in high-temperature and high-salinity environments, and we therefore compared E. sp. EZ gene families and transcription factors associated with environmental stress response (“defensome”) with other echinoid species with similar high-quality genomic resources. While the number of defensome genes was broadly similar for all species, we identified strong signatures of positive selection in E. sp. EZ noncoding elements near genes involved in environmental response pathways as well as losses of transcription factors important for environmental response. These data provide key insights into the biology of E. sp. EZ as well as the diversification of Echinometra more widely and will serve as a useful tool for the community to explore questions in this taxonomic group and beyond. 
    more » « less
  4. Abstract

    White clover (Trifolium repens L.; Fabaceae) is an important forage and cover crop in agricultural pastures around the world and is increasingly used in evolutionary ecology and genetics to understand the genetic basis of adaptation. Historically, improvements in white clover breeding practices and assessments of genetic variation in nature have been hampered by a lack of high-quality genomic resources for this species, owing in part to its high heterozygosity and allotetraploid hybrid origin. Here, we use PacBio HiFi and chromosome conformation capture (Omni-C) technologies to generate a chromosome-level, haplotype-resolved genome assembly for white clover totaling 998 Mbp (scaffold N50 = 59.3 Mbp) and 1 Gbp (scaffold N50 = 58.6 Mbp) for haplotypes 1 and 2, respectively, with each haplotype arranged into 16 chromosomes (8 per subgenome). We additionally provide a functionally annotated haploid mapping assembly (968 Mbp, scaffold N50 = 59.9 Mbp), which drastically improves on the existing reference assembly in both contiguity and assembly accuracy. We annotated 78,174 protein-coding genes, resulting in protein BUSCO completeness scores of 99.6% and 99.3% against the embryophyta_odb10 and fabales_odb10 lineage datasets, respectively.

    more » « less
  5. Abstract

    The angiosperm genus Silene has been the subject of extensive study in the field of ecology and evolution, but the availability of high-quality reference genome sequences has been limited for this group. Here, we report a chromosome-level assembly for the genome of Silene conica based on Pacific Bioscience HiFi, Hi-C, and Bionano technologies. The assembly produced 10 scaffolds (1 per chromosome) with a total length of 862 Mb and only ∼1% gap content. These results confirm previous observations that S. conica and its relatives have a reduced base chromosome number relative to the genus's ancestral state of 12. Silene conica has an exceptionally large mitochondrial genome (>11 Mb), predominantly consisting of sequence of unknown origins. Analysis of shared sequence content suggests that it is unlikely that transfer of nuclear DNA is the primary driver of this mitochondrial genome expansion. More generally, this assembly should provide a valuable resource for future genomic studies in Silene, including comparative analyses with related species that recently evolved sex chromosomes.

    more » « less