Polyploidy, or whole-genome duplication, is expected to confound the inference of species trees with phylogenetic methods for two reasons. First, the presence of retained duplicated genes requires the reconciliation of the inferred gene trees to a proposed species tree. Second, even if the analyses are restricted to shared single-copy genes, the occurrence of reciprocal gene loss, where the surviving genes in different species are paralogs from the polyploidy rather than orthologs, will mean that such genes will not have evolved under the corresponding species tree and may not produce gene trees that allow inference of that species tree. Here we analyze three different ancient polyploidy events, using synteny-based inferences of orthology and paralogy to infer gene trees from nearly 17,000 sets of homologous genes. We find that the simple use of single-copy genes from polyploid organisms provides reasonably robust phylogenetic signals, despite the presence of reciprocal gene losses. Such gene trees are also most often in accord with the species relationships inferred from maximum likelihood models of gene loss after polyploidy, a completely distinct phylogenetic signal present in these genomes. As seen in other studies, however, we find that methods for inferring phylogenetic confidence yield high support values even in cases where the underlying data suggest meaningful conflict in the phylogenetic signals.
Convergent evolution of polyploid genomes from across the eukaryotic tree of life
By modeling the homoeologous gene losses that occurred in 50 genomes deriving from ten distinct polyploidy events, we show that the evolutionary forces acting on polyploids are remarkably similar, regardless of whether they occur in flowering plants, ciliates, fishes, or yeasts. We find that many of the events show a relative rate of duplicate gene loss before the first postpolyploidy speciation that is significantly higher than in later phases of their evolution. The relatively weak selective constraint experienced by the single-copy genes these losses produced leads us to suggest that most of the purely selectively neutral duplicate gene losses occur in the immediate postpolyploid period. Nearly all of the events show strong evidence of biases in the duplicate losses, consistent with them being allopolyploidies, with two distinct progenitors contributing to the modern species. We also find ongoing and extensive reciprocal gene losses (alternative losses of duplicated ancestral genes) between these genomes. With the exception of a handful of closely related taxa, all of these polyploid organisms are separated from each other by tens to thousands of reciprocal gene losses. As a result, it is very unlikely that viable diploid hybrid species could form between these taxa, since matings between such hybrids would tend to produce offspring lacking essential genes. It is, therefore, possible that the relatively high frequency of recurrent polyploidies in some lineages may be due to the ability of new polyploidies to bypass reciprocal gene loss barriers.
- Award ID(s): 1754142
- PAR ID: 10543350
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: G3 Genes|Genomes|Genetics
- Volume: 12
- Issue: 6
- ISSN: 2160-1836
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Zhang, Jianzhi (Ed.) Hybridization coupled to polyploidy, or allopolyploidy, has dramatically shaped the evolution of flowering plants, teleost fishes, and other lineages. Studies of recently formed allopolyploid plants have shown that the two subgenomes that merged to form that new allopolyploid do not generally express their genes equally. Instead, one of the two subgenomes expresses its paralogs more highly on average. Meanwhile, older allopolyploidy events tend to show biases in duplicate losses, with one of the two subgenomes retaining more genes than the other. Since reduced expression is a pathway to duplicate loss, understanding the origins of expression biases may help explain the origins of biased losses. Because we expect gene expression levels to experience stabilizing selection, our conceptual frameworks for how allopolyploid organisms form tend to assume that the new allopolyploid will show balanced expression between its subgenomes. It is then necessary to invoke phenomena such as differences in the suppression of repetitive elements to explain the observed expression imbalances. Here we show that, even for phenotypically identical diploid progenitors, the inherent kinetics of gene expression give rise to biases between the expression levels of the progenitor genes in the hybrid. Some of these biases are expected to be gene-specific and not give rise to global differences in progenitor gene expression. However, particularly in the case of allopolyploids formed from progenitors with different genome sizes, global expression biases favoring one subgenome are expected immediately on formation. Hence, expression biases are arguably the expectation upon allopolyploid formation rather than a phenomenon needing explanation. In the future, a deeper understanding of the kinetics of allopolyploidy may allow us to better understand both biases in duplicate losses and hybrid vigor.
-
Kolodny, Rachel (Ed.) Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences (MHGs), where each MHG is a maximal set of maximum-length sequences that are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and its overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs, while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers cause incorrect inferences as to the overall relationship between bacterial taxa.
-
Piganeau, Gwenael (Ed.) Numerous factors shape the evolution of protein-coding genes, including shifts in the strength or type of selection following gene duplications or changes in the environment. Diatoms and other silicifying organisms use a family of silicon transporters (SITs) to import dissolved silicon from the environment. Freshwaters contain higher silicon levels than oceans, and marine diatoms have more efficient uptake kinetics and less silicon in their cell walls, making them better competitors for a scarce resource. We compiled SITs from 37 diatom genomes to characterize shifts in selection following gene duplications and marine–freshwater transitions. A deep gene duplication, which coincided with a whole-genome duplication, gave rise to two gene lineages. One of them (SIT1–2) is present in multiple copies in most species and is known to actively import silicon. These SITs have evolved under strong purifying selection that was relaxed in freshwater taxa. Episodic diversifying selection was detected but not associated with gene duplications or habitat shifts. In contrast, genes in the second SIT lineage (SIT3) were present in just half the species, the result of multiple losses. Despite conservation of SIT3 in some lineages for the past 90–100 million years, repeated losses, relaxed selection, and low expression highlighted the dispensability of SIT3, consistent with a model of deterioration and eventual loss due to relaxed selection on SIT3 expression. The extensive but relatively balanced history of duplications and losses, together with paralog-specific expression patterns, suggests that diatoms continuously balance gene dosage and expression dynamics to optimize silicon transport across major environmental gradients.
-
Oxygen-deficient zones (ODZs) account for about 30% of total oceanic fixed nitrogen loss via processes including denitrification, a microbially mediated pathway proceeding stepwise from NO3− to N2. This process may be performed entirely by complete denitrifiers capable of all four enzymatic steps, but many organisms possess only partial denitrification pathways, either producing or consuming key intermediates such as the greenhouse gas N2O. Metagenomics and marker gene surveys have revealed a diversity of denitrification genes within ODZs, but whether these genes co-occur within complete or partial denitrifiers and the identities of denitrifying taxa remain open questions. We assemble genomes from metagenomes spanning the ETNP and Arabian Sea, and map these metagenome-assembled genomes (MAGs) to 56 metagenomes from all three major ODZs to reveal the predominance of partial denitrifiers, particularly single-step denitrifiers. We find niche differentiation among nitrogen-cycling organisms, with communities performing each nitrogen transformation distinct in taxonomic identity and motility traits. Our collection of 962 MAGs presents the largest collection of pelagic ODZ microorganisms and reveals a clearer picture of the nitrogen cycling community within this environment.
