Title: Functional annotations of three domestic animal genomes provide vital resources for comparative and agricultural research
Abstract Gene regulatory elements are central drivers of phenotypic variation and thus of critical importance towards understanding the genetics of complex traits. The Functional Annotation of Animal Genomes consortium was formed to collaboratively annotate the functional elements in animal genomes, starting with domesticated animals. Here we present an expansive collection of datasets from eight diverse tissues in three important agricultural species: chicken (Gallus gallus), pig (Sus scrofa), and cattle (Bos taurus). Comparative analysis of these datasets and those from the human and mouse Encyclopedia of DNA Elements projects reveals that a core set of regulatory elements is functionally conserved independent of divergence between species, and that tissue-specific transcription factor occupancy at regulatory elements and their predicted target genes are also conserved. These datasets represent a unique opportunity for the emerging field of comparative epigenomics, as well as the agricultural research community, including species that are globally important food resources.
Award ID(s):
1846559
NSF-PAR ID:
10232202
Author(s) / Creator(s):
Date Published:
Journal Name:
Nature Communications
Volume:
12
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. INTRODUCTION A major challenge in genomics is discerning which bases among billions alter organismal phenotypes and affect health and disease risk. Evidence of past selective pressure on a base, whether highly conserved or fast evolving, is a marker of functional importance. Bases that are unchanged in all mammals may shape phenotypes that are essential for organismal health. Bases that are evolving quickly in some species, or changed only in species that share an adaptive trait, may shape phenotypes that support survival in specific niches. Identifying bases associated with exceptional capacity for cellular recovery, such as in species that hibernate, could inform therapeutic discovery.

    RATIONALE The power and resolution of evolutionary analyses scale with the number and diversity of species compared. By analyzing genomes for hundreds of placental mammals, we can detect which individual bases in the genome are exceptionally conserved (constrained) and likely to be functionally important in both coding and noncoding regions. By including species that represent all orders of placental mammals and aligning genomes using a method that does not require designating humans as the reference species, we explore unusual traits in other species.

    RESULTS Zoonomia’s mammalian comparative genomics resources are the most comprehensive and statistically well-powered produced to date, with a protein-coding alignment of 427 mammals and a whole-genome alignment of 240 placental mammals representing all orders. We estimate that at least 10.7% of the human genome is evolutionarily conserved relative to neutrally evolving repeats and identify about 101 million significantly constrained single bases (false discovery rate < 0.05). We cataloged 4552 ultraconserved elements at least 20 bases long that are identical in more than 98% of the 240 placental mammals. Many constrained bases have no known function, illustrating the potential for discovery using evolutionary measures. Eighty percent are outside protein-coding exons, and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Constrained bases tend to vary less within human populations, which is consistent with purifying selection. Species threatened with extinction have few substitutions at constrained sites, possibly because severely deleterious alleles have been purged from their small populations. By pairing Zoonomia’s genomic resources with phenotype annotations, we find genomic elements associated with phenotypes that differ between species, including olfaction, hibernation, brain size, and vocal learning. We associate genomic traits, such as the number of olfactory receptor genes, with physical phenotypes, such as the number of olfactory turbinals. By comparing hibernators and nonhibernators, we implicate genes involved in mitochondrial disorders, protection against heat stress, and longevity in this physiologically intriguing phenotype. Using a machine learning–based approach that predicts tissue-specific cis-regulatory activity in hundreds of species using data from just a few, we associate changes in noncoding sequence with traits for which humans are exceptional: brain size and vocal learning.

    CONCLUSION Large-scale comparative genomics opens new opportunities to explore how genomes evolved as mammals adapted to a wide range of ecological niches and to discover what is shared across species and what is distinctively human. High-quality data for consistently defined phenotypes are necessary to realize this potential. Through partnerships with researchers in other fields, comparative genomics can address questions in human health and basic biology while guiding efforts to protect the biodiversity that is essential to these discoveries.

    Comparing genomes from 240 species to explore the evolution of placental mammals. Our new phylogeny (black lines) has alternating gray and white shading, which distinguishes mammalian orders (labeled around the perimeter). Rings around the phylogeny annotate species phenotypes. Seven species with diverse traits are illustrated, with black lines marking their branch in the phylogeny. Sequence conservation across species is described at the top left. IMAGE CREDIT: K. MORRILL
  2. Seadragons are a remarkable lineage of teleost fishes in the family Syngnathidae, renowned for having evolved male pregnancy. Comprising three known species, seadragons are widely recognized and admired for their fantastical body forms and coloration, and their specific habitat requirements have made them flagship representatives for marine conservation and natural history interests. Until recently, seadragons lacked significant genomic resources. We have produced gene-annotated, chromosome-scale genome models for the leafy and weedy seadragon to advance investigations of evolutionary innovation and elaboration of morphological traits in seadragons as well as their pipefish and seahorse relatives. We identified several interesting features specific to seadragon genomes, including divergent noncoding regions near a developmental gene important for integumentary outgrowth, a high genome-wide density of repetitive DNA, and recent expansions of transposable elements and a vesicular trafficking gene family. Surprisingly, comparative analyses leveraging the seadragon genomes and additional syngnathid and outgroup genomes revealed striking, syngnathid-specific losses in the family of fibroblast growth factors (FGFs), which likely involve reorganization of highly conserved gene regulatory networks in ways that have not previously been documented in natural populations. The resources presented here serve as important tools for future evolutionary studies of developmental processes in syngnathids and hold value for conservation of the extravagant seadragons and their relatives.
  3. Abstract Environmental stress from ultraviolet radiation, elevated temperatures or metal toxicity can lead to reactive oxygen species in cells, leading to oxidative DNA damage, premature aging, neurodegenerative diseases, and cancer. The transcription factor nuclear factor (erythroid-derived 2)-like 2 (Nrf2) activates many cytoprotective proteins within the nucleus to maintain homeostasis during oxidative stress. In vertebrates, Nrf2 levels are regulated by the Kelch-family protein Keap1 (Kelch-like ECH-associated protein 1) in the absence of stress according to a canonical redox control pathway. Little, however, is known about the redox control pathway used in early diverging metazoans. Our study examines the presence of known oxidative stress regulatory elements within non-bilaterian metazoans including free living and parasitic cnidarians, ctenophores, placozoans, and sponges. Cnidarians, with their pivotal position as the sister phylum to bilaterians, play an important role in understanding the evolutionary history of response to oxidative stress. Through comparative genomic and transcriptomic analysis our results show that Nrf homologs evolved early in metazoans, whereas Keap1 appeared later in the last common ancestor of cnidarians and bilaterians. However, key Nrf–Keap1 interacting domains are not conserved within the cnidarian lineage, suggesting this important pathway evolved with the radiation of bilaterians. Several known downstream Nrf targets are present in cnidarians suggesting that cnidarian Nrf plays an important role in oxidative stress response even in the absence of Keap1. Comparative analyses of key oxidative stress sensing and response proteins in early diverging metazoans thus provide important insights into the molecular basis of how these lineages interact with their environment and suggest a shared evolutionary history of regulatory pathways. 
Exploration of these pathways may prove important for the study of cancer therapeutics and broader research in oxidative stress, senescence, and the functional responses of early diverging metazoans to environmental change. 
  4. Introduction

    Eukaryotic life depends on the functional elements encoded by both the nuclear genome and organellar genomes, such as those contained within the mitochondria. The content, size, and structure of the mitochondrial genome vary across organisms with potentially large implications for phenotypic variance and resulting evolutionary trajectories. Among yeasts in the subphylum Saccharomycotina, extensive differences have been observed in various species relative to the model yeast Saccharomyces cerevisiae, but mitochondrial genome sampling across many groups has been scarce, even as hundreds of nuclear genomes have become available.

    Methods

    By extracting mitochondrial assemblies from existing short-read genome sequence datasets, we have greatly expanded both the number of available genomes and the coverage across sparsely sampled clades.

    Results

    Comparison of 353 yeast mitochondrial genomes revealed that, while size and GC content were fairly consistent across species, those in the genera Metschnikowia and Saccharomyces trended larger, while several species in the order Saccharomycetales, which includes S. cerevisiae, exhibited lower GC content. Extreme examples for both size and GC content were scattered throughout the subphylum. All mitochondrial genomes shared a core set of protein-coding genes for Complexes III, IV, and V, but they varied in the presence or absence of mitochondrially-encoded canonical Complex I genes. We traced the loss of Complex I genes to a major event in the ancestor of the orders Saccharomycetales and Saccharomycodales, but we also observed several independent losses in the orders Phaffomycetales, Pichiales, and Dipodascales. In contrast to prior hypotheses based on smaller-scale datasets, comparison of evolutionary rates in protein-coding genes showed no bias towards elevated rates among aerobically fermenting (Crabtree/Warburg-positive) yeasts. Mitochondrial introns were widely distributed, but they were highly enriched in some groups. The majority of mitochondrial introns were poorly conserved within groups, but several were shared within groups, between groups, and even across taxonomic orders, which is consistent with horizontal gene transfer, likely involving homing endonucleases acting as selfish elements.

    Discussion

    As the number of available fungal nuclear genomes continues to expand, the methods described here to retrieve mitochondrial genome sequences from these datasets will prove invaluable to ensuring that studies of fungal mitochondrial genomes keep pace with their nuclear counterparts.

     
  5. Abstract The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth’s continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.