Title: A comparative genomics multitool for scientific discovery and conservation
The Zoonomia Project is investigating the genomics of shared and specialized traits in eutherian mammals. Here we provide genome assemblies for 131 species, of which all but 9 are previously uncharacterized, and describe a whole-genome alignment of 240 species of considerable phylogenetic diversity, comprising representatives from more than 80% of mammalian families. We find that regions of reduced genetic diversity are more abundant in species at a high risk of extinction, discern signals of evolutionary selection at high resolution and provide insights from individual reference genomes. By prioritizing phylogenetic diversity and making data available quickly and without restriction, the Zoonomia Project aims to support biological discovery, medical research and the conservation of biodiversity.
Award ID(s):
2029774 1753760 1838283
NSF-PAR ID:
10248922
Author(s) / Creator(s):
Date Published:
Journal Name:
Nature
Volume:
587
Issue:
7833
ISSN:
0028-0836
Page Range / eLocation ID:
240 to 245
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Polyploidy is widely acknowledged to have played an important role in the evolution and diversification of vascular plants. However, the influence of genome duplication on population-level dynamics and its cascading effects at the community level remain unclear. In part, this is due to persistent uncertainties over the extent of polyploid phenotypic variation, and the interactions between polyploids and co-occurring species, and highlights the need to integrate polyploid research at the population and community level. Here, we investigate how community-level patterns of phylogenetic relatedness might influence escape from minority cytotype exclusion, a classic population genetics hypothesis about polyploid establishment, and population-level species interactions. Focusing on two plant families in which polyploidy has evolved multiple times, Brassicaceae and Rosaceae, we build upon the hypothesis that the greater allelic and phenotypic diversity of polyploids allows them to successfully inhabit a different geographic range compared to their diploid progenitor and close relatives. Using a phylogenetic framework, we specifically test (1) whether polyploid species are more distantly related to diploids within the same community than co-occurring diploids are to one another, and (2) whether polyploid species tend to exhibit greater ecological success than diploids, using species abundance in communities as an indicator of successful establishment. Overall, our results suggest that the effects of genome duplication on community structure are not clear-cut. We find that polyploid species tend to be more distantly related to co-occurring diploids than diploids are to each other. However, we do not find a consistent pattern of polyploid species being more abundant than diploid species, suggesting polyploids are not uniformly more ecologically successful than diploids.
While polyploidy appears to have some important influences on species co-occurrence in Brassicaceae and Rosaceae communities, our study highlights the paucity of available geographically explicit data on intraspecific ploidal variation. The increased use of high-throughput methods to identify ploidal variation, such as flow cytometry and whole genome sequencing, will greatly aid our understanding of how such a widespread, radical genomic mutation influences the evolution of species and those around them. 
  2. The subphylum Saccharomycotina is a lineage in the fungal phylum Ascomycota that exhibits levels of genomic diversity similar to those of plants and animals. The Saccharomycotina consist of more than 1,200 known species currently divided into 16 families, one order, and one class. Species in this subphylum are ecologically and metabolically diverse and include important opportunistic human pathogens, as well as species important in biotechnological applications. Many traits of biotechnological interest are found in closely related species and often restricted to single phylogenetic clades. However, the biotechnological potential of most yeast species remains unexplored. Although the subphylum Saccharomycotina has much higher rates of genome sequence evolution than its sister subphylum, Pezizomycotina, it contains only one class compared to the 16 classes in Pezizomycotina. The third subphylum of Ascomycota, the Taphrinomycotina, consists of six classes and has approximately 10 times fewer species than the Saccharomycotina. These data indicate that the current classification of all these yeasts into a single class and a single order is an underappreciation of their diversity. Our previous genome-scale phylogenetic analyses showed that the Saccharomycotina contains 12 major and robustly supported phylogenetic clades; seven of these are current families (Lipomycetaceae, Trigonopsidaceae, Alloascoideaceae, Pichiaceae, Phaffomycetaceae, Saccharomycodaceae, and Saccharomycetaceae), one comprises two current families (Dipodascaceae and Trichomonascaceae), one represents the genus Sporopachydermia, and three represent lineages that differ in their translation of the CUG codon (CUG-Ala, CUG-Ser1, and CUG-Ser2). Using these analyses in combination with relative evolutionary divergence and genome content analyses, we propose an updated classification for the Saccharomycotina, including seven classes and 12 orders that can be diagnosed by genome content.
This updated classification is consistent with the high levels of genomic diversity within this subphylum and is necessary to make the higher rank classification of the Saccharomycotina more comparable to that of other fungi, as well as to communicate efficiently on lineages that are not yet formally named. 
  3. Sethuraman, Arun (Ed.)
    Abstract

    Damselflies and dragonflies (Order: Odonata) play important roles in both aquatic and terrestrial food webs and can serve as sentinels of ecosystem health and predictors of population trends in other taxa. The habitat requirements and limited dispersal of lotic damselflies make them especially sensitive to habitat loss and fragmentation. As such, landscape genomic studies of these taxa can help focus conservation efforts on watersheds with high levels of genetic diversity, local adaptation, and even cryptic endemism. Here, as part of the California Conservation Genomics Project (CCGP), we report the first reference genome for the American rubyspot damselfly, Hetaerina americana, a species associated with springs, streams and rivers throughout California. Following the CCGP assembly pipeline, we produced two de novo genome assemblies. The primary assembly includes 1,630,044,487 base pairs, with a contig N50 of 5.4 Mb, a scaffold N50 of 86.2 Mb, and a BUSCO completeness score of 97.6%. This is the seventh Odonata genome to be made publicly available and the first for the subfamily Hetaerininae. This reference genome fills an important phylogenetic gap in our understanding of Odonata genome evolution, and provides a genomic resource for a host of interesting ecological, evolutionary, and conservation questions for which the rubyspot damselfly genus Hetaerina is an important model system.

     
  4. Ferns are the second largest clade of vascular plants with over 10,000 species, yet the generation of genomic resources for the group has lagged behind other major clades of plants. Transcriptomic data have proven to be a powerful tool to assess phylogenetic relationships, using thousands of markers that are largely conserved across the genome, and without the need to sequence entire genomes. We assembled the largest nuclear phylogenetic dataset for ferns to date, including 2884 single-copy nuclear loci from 247 transcriptomes (242 ferns, five outgroups), and investigated phylogenetic relationships across the fern tree, the placement of whole genome duplications (WGDs), and gene retention patterns following WGDs. We generated a well-supported phylogeny of ferns and identified several regions of the fern phylogeny that demonstrate high levels of gene tree–species tree conflict, which largely correspond to areas of the phylogeny that have been difficult to resolve. Using a combination of approaches, we identified 27 WGDs across the phylogeny, including 18 large-scale events (involving more than one sampled taxon) and nine small-scale events (involving only one sampled taxon). Most inferred WGDs occur within single lineages (e.g., orders, families) rather than on the backbone of the phylogeny, although two inferred events are shared by leptosporangiate ferns (excluding Osmundales) and Polypodiales (excluding Lindsaeineae and Saccolomatineae), clades which correspond to the majority of fern diversity. We further examined how retained duplicates following WGDs compared across independent events and found that functions of retained genes were largely convergent, with processes involved in binding, responses to stimuli, and certain organelles over-represented in paralogs while processes involved in transport, organelles derived from endosymbiotic events, and signaling were under-represented. 
To date, our study is the most comprehensive investigation of the nuclear fern phylogeny, though several avenues for future research remain unexplored. 
  5. Abstract

    Exon markers have a long history of use in phylogenetics of ray‐finned fishes, the most diverse clade of vertebrates with more than 35,000 species. As the number of published genomes increases, it has become easier to test exons and other genetic markers for signals of ancient duplication events and filter out paralogues that can mislead phylogenetic analysis. We present seven new probe sets for current target‐capture phylogenomic protocols that capture 1,104 exons explicitly filtered for paralogues using gene trees. These seven probe sets span the diversity of teleost fishes, including four sets that target five hyperdiverse percomorph clades which together comprise ca. 17,000 species (Carangaria, Ovalentaria, Eupercaria, and Syngnatharia + Pelagiaria combined). We additionally included probes to capture legacy nuclear exons and mitochondrial markers that have been commonly used in fish phylogenetics (despite some exons being flagged for paralogues) to facilitate integration of old and new molecular phylogenetic matrices. We tested these probes experimentally for 56 fish species (eight species per probe set) and merged new exon‐capture sequence data into an existing data matrix of 1,104 exons and 300 ray‐finned fish species. We provide an optimized bioinformatics pipeline to assemble exon capture data from raw reads to alignments for downstream analysis. We show that legacy loci with known paralogues are at risk of assembling duplicated sequences with target‐capture, but we also assembled many useful orthologous sequences that can be integrated with many PCR‐generated matrices. These probe sets are a valuable resource for advancing fish phylogenomics because targeted exons can easily be extracted from increasingly available whole genome and transcriptome data sets, and also may be integrated with existing PCR‐based exon and mitochondrial data.

     