The Weddell seal (
- Award ID(s):
- NSF-PAR ID:
- Publisher / Repository: Nature Publishing Group
- Date Published:
- Journal Name: Communications Biology
- Medium: X
- Sponsoring Org: National Science Foundation
More Like this
INTRODUCTION A major challenge in genomics is discerning which bases among billions alter organismal phenotypes and affect health and disease risk. Evidence of past selective pressure on a base, whether highly conserved or fast evolving, is a marker of functional importance. Bases that are unchanged in all mammals may shape phenotypes that are essential for organismal health. Bases that are evolving quickly in some species, or changed only in species that share an adaptive trait, may shape phenotypes that support survival in specific niches. Identifying bases associated with exceptional capacity for cellular recovery, such as in species that hibernate, could inform therapeutic discovery. RATIONALE The power and resolution of evolutionary analyses scale with the number and diversity of species compared. By analyzing genomes for hundreds of placental mammals, we can detect which individual bases in the genome are exceptionally conserved (constrained) and likely to be functionally important in both coding and noncoding regions. By including species that represent all orders of placental mammals and aligning genomes using a method that does not require designating humans as the reference species, we explore unusual traits in other species. RESULTS Zoonomia’s mammalian comparative genomics resources are the most comprehensive and statistically well-powered produced to date, with a protein-coding alignment of 427 mammals and a whole-genome alignment of 240 placental mammals representing all orders. We estimate that at least 10.7% of the human genome is evolutionarily conserved relative to neutrally evolving repeats and identify about 101 million significantly constrained single bases (false discovery rate < 0.05). We cataloged 4552 ultraconserved elements at least 20 bases long that are identical in more than 98% of the 240 placental mammals. Many constrained bases have no known function, illustrating the potential for discovery using evolutionary measures. 
Eighty percent are outside protein-coding exons, and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Constrained bases tend to vary less within human populations, which is consistent with purifying selection. Species threatened with extinction have few substitutions at constrained sites, possibly because severely deleterious alleles have been purged from their small populations. By pairing Zoonomia’s genomic resources with phenotype annotations, we find genomic elements associated with phenotypes that differ between species, including olfaction, hibernation, brain size, and vocal learning. We associate genomic traits, such as the number of olfactory receptor genes, with physical phenotypes, such as the number of olfactory turbinals. By comparing hibernators and nonhibernators, we implicate genes involved in mitochondrial disorders, protection against heat stress, and longevity in this physiologically intriguing phenotype. Using a machine learning–based approach that predicts tissue-specific cis-regulatory activity in hundreds of species using data from just a few, we associate changes in noncoding sequence with traits for which humans are exceptional: brain size and vocal learning. CONCLUSION Large-scale comparative genomics opens new opportunities to explore how genomes evolved as mammals adapted to a wide range of ecological niches and to discover what is shared across species and what is distinctively human. High-quality data for consistently defined phenotypes are necessary to realize this potential. Through partnerships with researchers in other fields, comparative genomics can address questions in human health and basic biology while guiding efforts to protect the biodiversity that is essential to these discoveries. Comparing genomes from 240 species to explore the evolution of placental mammals.
Our new phylogeny (black lines) has alternating gray and white shading, which distinguishes mammalian orders (labeled around the perimeter). Rings around the phylogeny annotate species phenotypes. Seven species with diverse traits are illustrated, with black lines marking their branch in the phylogeny. Sequence conservation across species is described at the top left. IMAGE CREDIT: K. MORRILL
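The ultraconserved-element definition in the abstract above (at least 20 bases long, identical in more than 98% of the aligned species) can be illustrated with a minimal sketch. This is not the Zoonomia pipeline: the function name, the toy alignment, the choice of the first row as the reference, and the greedy non-overlapping scan are all assumptions made for illustration.

```python
def ultraconserved_runs(alignment, min_len=20, min_frac=0.98):
    """Greedy scan for runs where more than `min_frac` of the rows are
    identical to the reference row (row 0) across the whole run.

    `alignment` is a list of equal-length strings, one per species.
    Returns non-overlapping (start, end) half-open intervals.
    """
    ref = alignment[0]
    n_species = len(alignment)
    length = len(ref)
    runs = []
    start = 0
    while start < length:
        # Species that are still identical to the reference on [start, end).
        alive = set(range(n_species))
        end = start
        while end < length:
            still = {i for i in alive if alignment[i][end] == ref[end]}
            if len(still) / n_species <= min_frac:
                break  # extending further would drop below the threshold
            alive = still
            end += 1
        if end - start >= min_len:
            runs.append((start, end))
            start = end
        else:
            start += 1
    return runs


# Toy alignment of four hypothetical species (row 0 is the reference).
toy = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGAAC",  # differs from the reference at position 7
    "TCGTACGTAC",  # differs from the reference at position 0
]
runs = ultraconserved_runs(toy, min_len=4, min_frac=0.7)
print(runs)
```

With the thresholds from the abstract (`min_len=20`, `min_frac=0.98`) and 240 rows, a run would need at least 236 identical species. A real analysis would also have to handle alignment gaps and missing sequence, which this sketch treats as ordinary characters.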
Nitric oxide (NO) is a potent vasodilator, which improves perfusion and oxygen delivery during tissue hypoxia in terrestrial animals. The vertebrate dive response involves vasoconstriction in select tissues, which persists despite profound hypoxia. Using tissues collected from Weddell seals at necropsy, we investigated whether vasoconstriction is aided by downregulation of local hypoxia signaling mechanisms. We focused on NO–soluble guanylyl cyclase (GC)-cGMP signaling, a well-known vasodilatory transduction pathway. Seals have a lower GC protein abundance, activity, and capacity to respond to NO stimulation than do terrestrial mammals. In seal lung homogenates, GC produced less cGMP (20.1 ± 3.7 pmol·mg protein⁻¹·min⁻¹) than the lungs of dogs (−80 ± 144 pmol·mg protein⁻¹·min⁻¹ less than seals), sheep (−472 ± 96), rats (−664 ± 104) or mice (−1,160 ± 104, P < 0.0001). Amino acid sequences of the GC enzyme α-subunits differed between seals and terrestrial mammals, potentially affecting their structure and function. Vasoconstriction in diving Weddell seals is not consistent across tissues; perfusion is maintained in the brain and heart but decreased in other organs such as the kidney. A NO donor increased median GC activity 49.5-fold in the seal brain but only 27.4-fold in the kidney, consistent with the priority of cerebral perfusion during diving. Nos3 expression was high in the seal brain, which could improve NO production and vasodilatory potential. Conversely, Pde5a expression was high in the seal renal artery, which may increase cGMP breakdown and vasoconstriction in the kidney. Taken together, the results of this study suggest that alterations in the NO-cGMP pathway facilitate the diving response.
INTRODUCTION Diverse phenotypes, including large brains relative to body size, group living, and vocal learning ability, have evolved multiple times throughout mammalian history. These shared phenotypes may have arisen repeatedly by means of common mechanisms discernible through genome comparisons. RATIONALE Protein-coding sequence differences have failed to fully explain the evolution of multiple mammalian phenotypes. This suggests that these phenotypes have evolved at least in part through changes in gene expression, meaning that their differences across species may be caused by differences in genome sequence at enhancer regions that control gene expression in specific tissues and cell types. Yet the enhancers involved in phenotype evolution are largely unknown. Sequence conservation–based approaches for identifying such enhancers are limited because enhancer activity can be conserved even when the individual nucleotides within the sequence are poorly conserved. This is due to an overwhelming number of cases where nucleotides turn over at a high rate, but a similar combination of transcription factor binding sites and other sequence features can be maintained across millions of years of evolution, allowing the function of the enhancer to be conserved in a particular cell type or tissue. Experimentally measuring the function of orthologous enhancers across dozens of species is currently infeasible, but new machine learning methods make it possible to make reliable sequence-based predictions of enhancer function across species in specific tissues and cell types. RESULTS To overcome the limits of studying individual nucleotides, we developed the Tissue-Aware Conservation Inference Toolkit (TACIT). Rather than measuring the extent to which individual nucleotides are conserved across a region, TACIT uses machine learning to test whether the function of a given part of the genome is likely to be conserved. 
More specifically, convolutional neural networks learn the tissue- or cell type–specific regulatory code connecting genome sequence to enhancer activity using candidate enhancers identified from only a few species. This approach allows us to accurately associate differences between species in tissue or cell type–specific enhancer activity with genome sequence differences at enhancer orthologs. We then connect these predictions of enhancer function to phenotypes across hundreds of mammals in a way that accounts for species’ phylogenetic relatedness. We applied TACIT to identify candidate enhancers from motor cortex and parvalbumin neuron open chromatin data that are associated with brain size relative to body size, solitary living, and vocal learning across 222 mammals. Our results include the identification of multiple candidate enhancers associated with brain size relative to body size, several of which are located in linear or three-dimensional proximity to genes whose protein-coding mutations have been implicated in microcephaly or macrocephaly in humans. We also identified candidate enhancers associated with the evolution of solitary living near a gene implicated in separation anxiety and other enhancers associated with the evolution of vocal learning ability. We obtained distinct results for bulk motor cortex and parvalbumin neurons, demonstrating the value in applying TACIT to both bulk tissue and specific minority cell type populations. To facilitate future analyses of our results and applications of TACIT, we released predicted enhancer activity of >400,000 candidate enhancers in each of 222 mammals and their associations with the phenotypes we investigated. CONCLUSION TACIT leverages predicted enhancer activity conservation rather than nucleotide-level conservation to connect genetic sequence differences between species to phenotypes across large numbers of mammals. 
TACIT can be applied to any phenotype with enhancer activity data available from at least a few species in a relevant tissue or cell type and a whole-genome alignment available across dozens of species with substantial phenotypic variation. Although we developed TACIT for transcriptional enhancers, it could also be applied to genomic regions involved in other components of gene regulation, such as promoters and splicing enhancers and silencers. As the number of sequenced genomes grows, machine learning approaches such as TACIT have the potential to help make sense of how conservation of, or changes in, subtle genome patterns can help explain phenotype evolution. Tissue-Aware Conservation Inference Toolkit (TACIT) associates genetic differences between species with phenotypes. TACIT works by generating open chromatin data from a few species in a tissue related to a phenotype, using the sequences underlying open and closed chromatin regions to train a machine learning model for predicting tissue-specific open chromatin and associating open chromatin predictions across dozens of mammals with the phenotype. [Species silhouettes are from PhyloPic]
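The abstract above says convolutional neural networks learn a regulatory code from genome sequence. As a rough illustration of the basic operation such a network applies, the sketch below one-hot encodes a DNA string and slides a single filter along it. It is not TACIT code: the function names, the hand-built filter, and the example sequence are hypothetical, and a trained model would learn many filters and stack further layers on top.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).
    Any other character, such as N, becomes an all-zero row."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d_scan(encoded, kernel):
    """Slide one filter along the encoded sequence (valid positions only)
    and return the dot product at each position."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + k][c] * kernel[k][c]
            for k in range(width)
            for c in range(len(BASES))
        )
        scores.append(score)
    return scores


# Hypothetical filter that responds most strongly to the 3-mer "TGA".
kernel = one_hot("TGA")
scores = conv1d_scan(one_hot("ACTGAC"), kernel)
best_position = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_position)
```

In a trained network the filter weights are real-valued and learned from open and closed chromatin examples rather than set by hand, so each filter acts like a soft motif detector.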
Genome-wide functional genetic screens have been successful in discovering genotype-phenotype relationships and in engineering new phenotypes. While broadly applied in mammalian cell lines and in E. coli, their use in non-conventional microorganisms has been limited, in part because of the inability to accurately design high-activity CRISPR guides in such species. Here, we develop an experimental-computational approach to sgRNA design that is specific to an organism of choice, in this case the oleaginous yeast Yarrowia lipolytica. A negative selection screen in the absence of non-homologous end-joining, the dominant DNA repair mechanism, was used to generate single guide RNA (sgRNA) activity profiles for both SpCas9 and LbCas12a. These genome-wide data served as input to a deep learning algorithm, DeepGuide, that is able to accurately predict guide activity. DeepGuide uses unsupervised learning to obtain a compressed representation of the genome, followed by supervised learning to map sgRNA sequence, genomic context, and epigenetic features to guide activity. Experimental validation, both genome-wide and with a subset of selected genes, confirms DeepGuide’s ability to accurately predict high-activity sgRNAs. DeepGuide provides an organism-specific predictor of CRISPR guide activity that, with retraining, could be applied to other fungal species, prokaryotes, and other non-conventional organisms.
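The abstract does not say how the negative selection screen is converted into per-guide activity values. One common approach, shown here purely as an assumption, is to score each guide by the negative log2 fold change of its normalized read frequency between a control and a selected population, so that strongly depleted guides get high scores. The function name, the pseudocount, and the read counts are hypothetical.

```python
import math


def depletion_scores(control_counts, selected_counts, pseudocount=1.0):
    """Return -log2(selected frequency / control frequency) per guide.

    Both arguments map guide IDs to raw read counts. A pseudocount avoids
    division by zero for guides that drop out completely.
    """
    total_control = sum(control_counts.values())
    total_selected = sum(selected_counts.values())
    scores = {}
    for guide, control in control_counts.items():
        selected = selected_counts.get(guide, 0)
        freq_control = (control + pseudocount) / total_control
        freq_selected = (selected + pseudocount) / total_selected
        scores[guide] = -math.log2(freq_selected / freq_control)
    return scores


# Hypothetical read counts: guide g2 is depleted fourfold relative to g1.
control = {"g1": 99, "g2": 99}
selected = {"g1": 99, "g2": 24}
scores = depletion_scores(control, selected)
print(scores)
```

Scores of this kind could then serve as training labels for a sequence-based predictor. Whether DeepGuide uses this exact transformation is not stated in the abstract.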
Environmental fluctuation during embryonic and fetal development can permanently alter an organism’s morphology, physiology, and behaviour. This phenomenon, known as developmental plasticity, is particularly relevant to reptiles that develop in subterranean nests with variable oxygen tensions. Previous work has shown hypoxia permanently alters the cardiovascular system of snapping turtles and may improve cardiac anoxia tolerance later in life. The mechanisms driving this process are unknown but may involve epigenetic regulation of gene expression via DNA methylation. To test this hypothesis, we assessed in situ cardiac performance during 2 h of acute anoxia in juvenile turtles previously exposed to normoxia (21% oxygen) or hypoxia (10% oxygen) during embryogenesis. Next, we analysed DNA methylation and gene expression patterns in turtles from the same cohorts using whole genome bisulfite sequencing, which represents the first high-resolution investigation of DNA methylation patterns in any reptilian species.
Genome-wide correlations between CpG and CpG island methylation and gene expression patterns in the snapping turtle were consistent with patterns observed in mammals. As hypothesized, developmental hypoxia increased juvenile turtle cardiac anoxia tolerance and programmed DNA methylation and gene expression patterns. Programmed differences in expression of genes such as SCN5A may account for differences in heart rate, while genes such as TNNT2 and TPM3 may underlie differences in calcium sensitivity and contractility of cardiomyocytes and cardiac inotropy. Finally, we identified putative transcription factor-binding sites in promoters and in differentially methylated CpG islands that suggest a model linking programming of DNA methylation during embryogenesis to differential gene expression and cardiovascular physiology later in life. Binding sites for hypoxia inducible factors (HIF1A, ARNT, and EPAS1) and key transcription factors activated by MAPK and BMP signaling (RREB1 and SMAD4) are implicated.
Conclusions
Our data strongly suggest that DNA methylation plays a conserved role in the regulation of gene expression in reptiles. We also show that embryonic hypoxia programs DNA methylation and gene expression patterns and that these changes are associated with enhanced cardiac anoxia tolerance later in life. Programming of cardiac anoxia tolerance has major ecological implications for snapping turtles, because these animals regularly exploit anoxic environments throughout their lifespan.
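The abstract refers to CpG islands without stating how they were defined. The sketch below uses the widely cited Gardiner-Garden and Frommer criteria (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6) as an assumption, not as the authors' method. The function names and the test sequence are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA string.

    Observed/expected = (number of CpG dinucleotides * length) / (C count * G count).
    """
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c_count + g_count) / n if n else 0.0
    obs_exp = (cpg_count * n) / (c_count * g_count) if c_count and g_count else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three thresholds to a single candidate window."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and obs_exp > min_obs_exp


gc_content, obs_exp = cpg_stats("CGCGCGAT")
print(gc_content, obs_exp)
```

A genome-wide scan would slide such a window along each chromosome and merge adjacent passing windows. Per-site methylation from bisulfite sequencing would then be summarized within each island before correlating it with expression.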