skip to main content


Title: Leafy and weedy seadragon genomes connect genic and repetitive DNA features to the extravagant biology of syngnathid fishes
Seadragons are a remarkable lineage of teleost fishes in the family Syngnathidae, renowned for having evolved male pregnancy. Comprising three known species, seadragons are widely recognized and admired for their fantastical body forms and coloration, and their specific habitat requirements have made them flagship representatives for marine conservation and natural history interests. Until recently, a gap has been the lack of significant genomic resources for seadragons. We have produced gene-annotated, chromosome-scale genome models for the leafy and weedy seadragon to advance investigations of evolutionary innovation and elaboration of morphological traits in seadragons as well as their pipefish and seahorse relatives. We identified several interesting features specific to seadragon genomes, including divergent noncoding regions near a developmental gene important for integumentary outgrowth, a high genome-wide density of repetitive DNA, and recent expansions of transposable elements and a vesicular trafficking gene family. Surprisingly, comparative analyses leveraging the seadragon genomes and additional syngnathid and outgroup genomes revealed striking, syngnathid-specific losses in the family of fibroblast growth factors (FGFs), which likely involve reorganization of highly conserved gene regulatory networks in ways that have not previously been documented in natural populations. The resources presented here serve as important tools for future evolutionary studies of developmental processes in syngnathids and hold value for conservation of the extravagant seadragons and their relatives.  more » « less
Award ID(s):
2015301
NSF-PAR ID:
10350519
Author(s) / Creator(s):
; ; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
119
Issue:
26
ISSN:
0027-8424
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Comparisons of high-quality, reference butterfly, and moth genomes have been instrumental to advancing our understanding of how hybridization, and natural selection drive genomic change during the origin of new species and novel traits. Here, we present a genome assembly of the Southern Dogface butterfly, Zerene cesonia (Pieridae) whose brilliant wing colorations have been implicated in developmental plasticity, hybridization, sexual selection, and speciation. We assembled 266,407,278 bp of the Z. cesonia genome, which accounts for 98.3% of the estimated 271 Mb genome size. Using a hybrid approach involving Chicago libraries with Hi-Rise assembly and a diploid Meraculous assembly, the final haploid genome was assembled. In the final assembly, nearly all autosomes and the Z chromosome were assembled into single scaffolds. The largest 29 scaffolds accounted for 91.4% of the genome assembly, with the remaining ∼8% distributed among another 247 scaffolds and overall N50 of 9.2 Mb. Tissue-specific RNA-seq informed annotations identified 16,442 protein-coding genes, which included 93.2% of the arthropod Benchmarking Universal Single-Copy Orthologs (BUSCO). The Z. cesonia genome assembly had ∼9% identified as repetitive elements, with a transposable element landscape rich in helitrons. Similar to other Lepidoptera genomes, Z. cesonia showed a high conservation of chromosomal synteny. The Z. cesonia assembly provides a high-quality reference for studies of chromosomal arrangements in the Pierid family, as well as for population, phylo, and functional genomic studies of adaptation and speciation. 
    more » « less
  2. Synopsis

    The proliferation of genomic resources for Chelicerata in the past 10 years has revealed that the evolution of chelicerate genomes is more dynamic than previously thought, with multiple waves of ancient whole genome duplications affecting separate lineages. Such duplication events are fascinating from the perspective of evolutionary history because the burst of new gene copies associated with genome duplications facilitates the acquisition of new gene functions (neofunctionalization), which may in turn lead to morphological novelties and spur net diversification. While neofunctionalization has been invoked in several contexts with respect to the success and diversity of spiders, the overall impact of whole genome duplications on chelicerate evolution and development remains imperfectly understood. The purpose of this review is to examine critically the role of whole genome duplication on the diversification of the extant arachnid orders, as well as assess functional datasets for evidence of subfunctionalization or neofunctionalization in chelicerates. This examination focuses on functional data from two focal model taxa: the spider Parasteatoda tepidariorum, which exhibits evidence for an ancient duplication, and the harvestman Phalangium opilio, which exhibits an unduplicated genome. I show that there is no evidence that taxa with genome duplications are more successful than taxa with unduplicated genomes. I contend that evidence for sub- or neofunctionalization of duplicated developmental patterning genes in spiders is indirect or fragmentary at present, despite the appeal of this postulate for explaining the success of groups like spiders. Available expression data suggest that the condition of duplicated Hox modules may have played a role in promoting body plan disparity in the posterior tagma of some orders, such as spiders and scorpions, but functional data substantiating this postulate are critically missing. Spatiotemporal dynamics of duplicated transcription factors in spiders may represent cases of developmental system drift, rather than neofunctionalization. Developmental system drift may represent an important, but overlooked, null hypothesis for studies of paralogs in chelicerate developmental biology. To distinguish between subfunctionalization, neofunctionalization, and developmental system drift, concomitant establishment of comparative functional datasets from taxa exhibiting the genome duplication, as well as those that lack the paralogy, is sorely needed.

     
    more » « less
  3. INTRODUCTION A major challenge in genomics is discerning which bases among billions alter organismal phenotypes and affect health and disease risk. Evidence of past selective pressure on a base, whether highly conserved or fast evolving, is a marker of functional importance. Bases that are unchanged in all mammals may shape phenotypes that are essential for organismal health. Bases that are evolving quickly in some species, or changed only in species that share an adaptive trait, may shape phenotypes that support survival in specific niches. Identifying bases associated with exceptional capacity for cellular recovery, such as in species that hibernate, could inform therapeutic discovery. RATIONALE The power and resolution of evolutionary analyses scale with the number and diversity of species compared. By analyzing genomes for hundreds of placental mammals, we can detect which individual bases in the genome are exceptionally conserved (constrained) and likely to be functionally important in both coding and noncoding regions. By including species that represent all orders of placental mammals and aligning genomes using a method that does not require designating humans as the reference species, we explore unusual traits in other species. RESULTS Zoonomia’s mammalian comparative genomics resources are the most comprehensive and statistically well-powered produced to date, with a protein-coding alignment of 427 mammals and a whole-genome alignment of 240 placental mammals representing all orders. We estimate that at least 10.7% of the human genome is evolutionarily conserved relative to neutrally evolving repeats and identify about 101 million significantly constrained single bases (false discovery rate < 0.05). We cataloged 4552 ultraconserved elements at least 20 bases long that are identical in more than 98% of the 240 placental mammals. Many constrained bases have no known function, illustrating the potential for discovery using evolutionary measures. Eighty percent are outside protein-coding exons, and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Constrained bases tend to vary less within human populations, which is consistent with purifying selection. Species threatened with extinction have few substitutions at constrained sites, possibly because severely deleterious alleles have been purged from their small populations. By pairing Zoonomia’s genomic resources with phenotype annotations, we find genomic elements associated with phenotypes that differ between species, including olfaction, hibernation, brain size, and vocal learning. We associate genomic traits, such as the number of olfactory receptor genes, with physical phenotypes, such as the number of olfactory turbinals. By comparing hibernators and nonhibernators, we implicate genes involved in mitochondrial disorders, protection against heat stress, and longevity in this physiologically intriguing phenotype. Using a machine learning–based approach that predicts tissue-specific cis - regulatory activity in hundreds of species using data from just a few, we associate changes in noncoding sequence with traits for which humans are exceptional: brain size and vocal learning. CONCLUSION Large-scale comparative genomics opens new opportunities to explore how genomes evolved as mammals adapted to a wide range of ecological niches and to discover what is shared across species and what is distinctively human. High-quality data for consistently defined phenotypes are necessary to realize this potential. Through partnerships with researchers in other fields, comparative genomics can address questions in human health and basic biology while guiding efforts to protect the biodiversity that is essential to these discoveries. Comparing genomes from 240 species to explore the evolution of placental mammals. Our new phylogeny (black lines) has alternating gray and white shading, which distinguishes mammalian orders (labeled around the perimeter). Rings around the phylogeny annotate species phenotypes. Seven species with diverse traits are illustrated, with black lines marking their branch in the phylogeny. Sequence conservation across species is described at the top left. IMAGE CREDIT: K. MORRILL 
    more » « less
  4. Gaut, Brandon (Ed.)
    Abstract As the closest extant sister group to seed plants, ferns are an important reference point to study the origin and evolution of plant genes and traits. One bottleneck to the use of ferns in phylogenetic and genetic studies is the fact that genome-level sequence information of this group is limited, due to the extreme genome sizes of most ferns. Ceratopteris richardii (hereafter Ceratopteris) has been widely used as a model system for ferns. In this study, we generated a transcriptome of Ceratopteris, through the de novo assembly of the RNA-seq data from 17 sequencing libraries that are derived from two sexual types of gametophytes and five different sporophyte tissues. The Ceratopteris transcriptome, together with 38 genomes and transcriptomes from other species across the Viridiplantae, were used to uncover the evolutionary dynamics of orthogroups (predicted gene families using OrthoFinder) within the euphyllophytes and identify proteins associated with the major shifts in plant morphology and physiology that occurred in the last common ancestors of euphyllophytes, ferns, and seed plants. Furthermore, this resource was used to identify and classify the GRAS domain transcriptional regulators of many developmental processes in plants. Through the phylogenetic analysis within each of the 15 GRAS orthogroups, we uncovered which GRAS family members are conserved or have diversified in ferns and seed plants. Taken together, the transcriptome database and analyses reported here provide an important platform for exploring the evolution of gene families in land plants and for studying gene function in seed-free vascular plants. 
    more » « less
  5. Wong, A (Ed.)
    Abstract Bacteriophages infecting pathogenic hosts play an important role in medical research, not only as potential treatments for antibiotic-resistant infections but also offering novel insights into pathogen genetics and evolution. A prominent example is cluster K mycobacteriophages infecting Mycobacterium tuberculosis, a causative agent of tuberculosis in humans. However, as handling M. tuberculosis as well as other pathogens in a laboratory remains challenging, alternative nonpathogenic relatives, such as Mycobacterium smegmatis, are frequently used as surrogates to discover therapeutically relevant bacteriophages in a safer environment. Consequently, the individual host ranges of the majority of cluster K mycobacteriophages identified to date remain poorly understood. Here, we characterized the complete genome of Stinson, a temperate subcluster K1 mycobacteriophage with a siphoviral morphology. A series of comparative genomic analyses revealed strong similarities with other cluster K mycobacteriophages, including the conservation of an immunity repressor gene and a toxin/antitoxin gene pair. Patterns of codon usage bias across the cluster offered important insights into putative host ranges in nature, highlighting that although all cluster K mycobacteriophages are able to infect M. tuberculosis, they are less likely to have shared an evolutionary infection history with Mycobacterium leprae (underlying leprosy) compared to the rest of the genus’ host species. Moreover, subcluster K1 mycobacteriophages are able to integrate into the genomes of Mycobacterium abscessus and Mycobacterium marinum—two bacteria causing pulmonary and cutaneous infections which are often difficult to treat due to their drug resistance. 
    more » « less