Title: A comparative genomics multitool for scientific discovery and conservation
The Zoonomia Project is investigating the genomics of shared and specialized traits in eutherian mammals. Here we provide genome assemblies for 131 species, of which all but 9 are previously uncharacterized, and describe a whole-genome alignment of 240 species of considerable phylogenetic diversity, comprising representatives from more than 80% of mammalian families. We find that regions of reduced genetic diversity are more abundant in species at a high risk of extinction, discern signals of evolutionary selection at high resolution and provide insights from individual reference genomes. By prioritizing phylogenetic diversity and making data available quickly and without restriction, the Zoonomia Project aims to support biological discovery, medical research and the conservation of biodiversity.
Award ID(s):
2029774 1753760 1838283
NSF-PAR ID:
10248922
Journal Name:
Nature
Volume:
587
Issue:
7833
Page Range or eLocation-ID:
240 to 245
ISSN:
0028-0836
Sponsoring Org:
National Science Foundation
More Like this
  1. Polyploidy is widely acknowledged to have played an important role in the evolution and diversification of vascular plants. However, the influence of genome duplication on population-level dynamics and its cascading effects at the community level remain unclear. In part, this is due to persistent uncertainties over the extent of polyploid phenotypic variation and the interactions between polyploids and co-occurring species, which highlights the need to integrate polyploid research at the population and community level. Here, we investigate how community-level patterns of phylogenetic relatedness might influence escape from minority cytotype exclusion, a classic population genetics hypothesis about polyploid establishment, and population-level species interactions. Focusing on two plant families in which polyploidy has evolved multiple times, Brassicaceae and Rosaceae, we build upon the hypothesis that the greater allelic and phenotypic diversity of polyploids allows them to successfully inhabit a different geographic range compared to their diploid progenitor and close relatives. Using a phylogenetic framework, we specifically test (1) whether polyploid species are more distantly related to diploids within the same community than co-occurring diploids are to one another, and (2) whether polyploid species tend to exhibit greater ecological success than diploids, using species abundance in communities as an indicator of successful establishment. Overall, our results suggest that the effects of genome duplication on community structure are not clear-cut. We find that polyploid species tend to be more distantly related to co-occurring diploids than diploids are to each other. However, we do not find a consistent pattern of polyploid species being more abundant than diploid species, suggesting polyploids are not uniformly more ecologically successful than diploids. While polyploidy appears to have some important influences on species co-occurrence in Brassicaceae and Rosaceae communities, our study highlights the paucity of available geographically explicit data on intraspecific ploidal variation. The increased use of high-throughput methods to identify ploidal variation, such as flow cytometry and whole genome sequencing, will greatly aid our understanding of how such a widespread, radical genomic mutation influences the evolution of species and those around them.
  2. Dinoflagellates of the family Symbiodiniaceae are predominantly essential symbionts of corals and other marine organisms. Recent research reveals extensive genome sequence divergence among Symbiodiniaceae taxa and high phylogenetic diversity hidden behind subtly different cell morphologies. Using an alignment-free phylogenetic approach based on sub-sequences of fixed length k (i.e. k-mers), we assessed the phylogenetic signal among whole-genome sequences from 16 Symbiodiniaceae taxa (including the genera of Symbiodinium, Breviolum, Cladocopium, Durusdinium and Fugacium) and two strains of Polarella glacialis as outgroup. Based on phylogenetic trees inferred from k-mers in distinct genomic regions (i.e. repeat-masked genome sequences, protein-coding sequences, introns and repeats) and in protein sequences, the phylogenetic signal associated with protein-coding DNA and the encoded amino acids is largely consistent with the Symbiodiniaceae phylogeny based on established markers, such as large subunit rRNA. The other genome sequences (introns and repeats) exhibit distinct phylogenetic signals, supporting the expected differential evolutionary pressure acting on these regions. Our analysis of conserved core k-mers revealed the prevalence of conserved k-mers (>95% core 23-mers among all 18 genomes) in annotated repeats and non-genic regions of the genomes. We observed 180 distinct repeat types that are significantly enriched in genomes of the symbiotic versus free-living Symbiodinium taxa, suggesting an enhanced activity of transposable elements linked to the symbiotic lifestyle. We provide evidence that representation of alignment-free phylogenies as dynamic networks enhances the ability to generate new hypotheses about genome evolution in Symbiodiniaceae. These results demonstrate the potential of alignment-free phylogenetic methods as a scalable approach for inferring comprehensive, unbiased whole-genome phylogenies of dinoflagellates and more broadly of microbial eukaryotes.
  3. In cryptic amphibian complexes, there is a growing trend to equate high levels of genetic structure with hidden cryptic species diversity. Typically, phylogenetic structure and distance-based approaches are used to demonstrate the distinctness of clades and justify the recognition of new cryptic species. However, this approach does not account for gene flow, spatial, and environmental processes that can obfuscate phylogenetic inference and bias species delimitation. As a case study, we sequenced genome-wide exons and introns to evince the processes that underlie the diversification of Philippine Puddle Frogs, a group that is widespread, phenotypically conserved, and exhibits high levels of geographically based genetic structure. We showed that widely adopted tree- and distance-based approaches inferred up to 20 species, compared to genomic analyses that inferred an optimal number of five distinct genetic groups. Using a suite of clustering, admixture, and phylogenetic network analyses, we demonstrate extensive admixture among the five groups and elucidate two specific ways in which gene flow can cause overestimations of species diversity: 1) admixed populations can be inferred as distinct lineages characterized by long branches in phylograms; and 2) admixed lineages can appear to be genetically divergent, even from their parental populations, when simple measures of genetic distance are used. We demonstrate that the relationship between mitochondrial and genome-wide nuclear p-distances is decoupled in admixed clades, leading to erroneous estimates of genetic distances and, consequently, species diversity. Additionally, genetic distance was also biased by spatial and environmental processes. Overall, we showed that high levels of genetic diversity in Philippine Puddle Frogs predominantly comprise metapopulation lineages that arose through complex patterns of admixture, isolation-by-distance, and isolation-by-environment as opposed to species divergence. Our findings suggest that speciation may not be the major process underlying the high levels of hidden diversity observed in many taxonomic groups and that widely adopted tree- and distance-based methods overestimate species diversity in the presence of gene flow.
  4. Butterfly eyes are complex organs that are composed of a diversity of proteins, and they play a central role in visual signaling and, ultimately, speciation and adaptation. Here, we utilized the whole eye transcriptome to obtain a more holistic view of the evolution of the butterfly eye while accounting for speciation events that co-occur with ancient hybridization. We sequenced and assembled transcriptomes from adult female eyes of eight species representing all major clades of the Heliconius genus and an additional outgroup species, Dryas iulia. We identified 4,042 orthologous genes shared across all transcriptome data sets and constructed a transcriptome-wide phylogeny, which revealed topological discordance with the mitochondrial phylogenetic tree in the Heliconius pupal mating clade. We then estimated introgression among lineages using additional genome data and found evidence for ancient hybridization leading to the common ancestor of Heliconius hortense and Heliconius clysonymus. We estimated the Ka/Ks ratio for each orthologous cluster and performed further tests to demonstrate genes showing evidence of adaptive protein evolution. Furthermore, we characterized patterns of expression for a subset of these positively selected orthologs using qRT-PCR. Taken together, we identified candidate eye genes that show signatures of adaptive molecular evolution and provide evidence of their expression divergence between species, tissues, and sexes. Our results 1) demonstrate greater evolutionary changes in younger Heliconius lineages, that is, more positively selected genes in the cydno–melpomene–hecale group as opposed to the sara–hortense–erato group, and 2) suggest an ancient hybridization leading to speciation among Heliconius pupal-mating species.