Title: homologizer: Phylogenetic phasing of gene copies into polyploid subgenomes
Abstract

Organisms such as allopolyploids and F1 hybrids contain multiple distinct subgenomes, each potentially with its own evolutionary history. These organisms present a challenge for multilocus phylogenetic inference and other analyses since it is not apparent which gene copies from different loci are from the same subgenome and thus share an evolutionary history.

Here we introduce homologizer, a flexible Bayesian approach that uses a phylogenetic framework to infer the phasing of gene copies across loci into their respective subgenomes.

Through the use of simulation tests, we demonstrate that homologizer is robust to a wide range of factors, such as incomplete lineage sorting and the phylogenetic informativeness of loci. Furthermore, we establish the utility of homologizer on real data, by analysing a multilocus dataset consisting of nine diploids and 19 tetraploids from the fern family Cystopteridaceae.

Finally, we describe how homologizer may potentially be used beyond its core phasing functionality to identify non‐homologous sequences, such as hidden paralogs or contaminants.

 
Award ID(s):
1753800 1753673
NSF-PAR ID:
10401462
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Methods in Ecology and Evolution
Volume:
14
Issue:
5
ISSN:
2041-210X
Page Range / eLocation ID:
p. 1230-1244
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: Like many fern lineages comprising reticulate species complexes, Polypodium s.s. (Polypodiaceae) has a history shaped by rapid diversification, hybridization, and polyploidy that poses substantial challenges for phylogenetic inference with plastid and single-locus nuclear markers. Using target capture probes for 408 nuclear loci developed by the GoFlag project and a custom bioinformatic pipeline, SORTER, we constructed multi-locus nuclear datasets for diploid temperate and Mesoamerican species of Polypodium and five allotetraploid species belonging to the well-studied Polypodium vulgare complex. SORTER employs a clustering approach to separate putatively paralogous copies of targeted loci into orthologous matrices and haplotype phasing to infer allopolyploid haplotypes across loci, resulting in datasets amenable to both concatenated maximum likelihood and multi-species coalescent phylogenetic analyses. By comparing phylogenies derived from maximum likelihood and multi-species coalescent analyses of unphased and phased datasets, as well as evaluating discordance among gene trees and species trees, we recover support for incomplete lineage sorting within Polypodium s.s., novel relationships among diploid taxa of the Polypodium vulgare complex and its Mesoamerican sister clade, and the placement of several Polypodium species within other genera. Additionally, we were able to infer well-supported phylogenies that identified the hypothesized progenitors of the allotetraploid species, indicating that SORTER is an effective and accurate tool for reconstructing homeolog haplotypes of allopolyploids in fern taxa and other non-model organisms from target capture data.
  2. Abstract

    The paleback darter, Etheostoma pallididorsum, is considered imperilled and has recently been petitioned for listing under the Endangered Species Act. Previous allozyme‐based studies found evidence of a small effective population size, warranting conservation concern. The objective of this study was to assess the population dynamics and the phylogeographical history of the paleback darter, using a multilocus microsatellite approach and mitochondrial DNA.

    The predictions of this study were that: paleback darter populations will exhibit low genetic diversity and minimal gene flow; population structure will correspond to the river systems from which the samples are derived; reservoir dams impounding the reaches between the Caddo and Ouachita rivers would serve as effective barriers to gene flow; and the Caddo and Ouachita rivers are reciprocally monophyletic.

    Microsatellite DNA loci revealed significant structure among sampled localities (global Fst = 0.17, P < 0.001), with evidence of two distinct populations representing the Caddo and Ouachita rivers. However, Bayesian phylogeographical analyses resulted in three distinct clades: Caddo River, Ouachita River, and Mazarn Creek. Divergence from the most recent ancestor shared among the river drainages was estimated at 60 Kya. Population genetic diversity was relatively low (He = 0.65; mean alleles per locus, A = 6.26), but was comparable with the population genetic diversity found in the close relatives slackwater darter, Etheostoma boschungi (He = 0.65; A = 6.74), and Tuscumbia darter, Etheostoma tuscumbia (He = 0.57; A = 5.53).

    These results have conservation implications for paleback darter populations and can be informative for other headwater specialist species. Like other headwater species with population structuring and relatively low genetic diversity, the persistence of paleback darter populations is likely to be tied to the persistence and connectivity of local breeding and non‐breeding habitat. These results do not raise conservation concern for a population decline; however, the restricted distribution and endemic status of the species still render paleback darter populations vulnerable to extirpation or extinction.

     
  3. Summary

    Species in the genus Sphagnum create, maintain, and dominate boreal peatlands through ‘extended phenotypes’ that allow these organisms to engineer peatland ecosystems and thereby impact global biogeochemical cycles. One such phenotype is the production of peat, or incompletely decomposed biomass, that accumulates when rates of growth exceed decomposition. Interspecific variation in peat production is thought to be responsible for the establishment and maintenance of ecological gradients such as the microtopographic hummock‐hollow gradient, along which sympatric species sort within communities.

    This study investigated the mode and tempo of functional trait evolution across 15 species of Sphagnum using data from the most extensive studies of Sphagnum functional traits to date and phylogenetic comparative methods.

    We found evidence for phylogenetic conservatism of the niche descriptor height‐above‐water‐table and of traits related to growth, decay and litter quality. However, we failed to detect the influence of phylogeny on interspecific variation in other traits such as shoot density and suggest that environmental context can obscure phylogenetic signal. Trait correlations indicate possible adaptive syndromes that may relate to niche and its construction.

    This study is the first to formally test the extent to which functional trait variation among Sphagnum species is a result of shared evolutionary history.

     
  4. Abstract

    The effects of genetic introgression on species boundaries and how they affect species’ integrity and persistence over evolutionary time have received increased attention. The increasing availability of genomic data has revealed contrasting patterns of gene flow across genomic regions, which impose challenges to inferences of evolutionary relationships and of patterns of genetic admixture across lineages. By characterizing patterns of variation across thousands of genomic loci in a widespread complex of true toads (Rhinella), we assess the true extent of genetic introgression across species thought to hybridize to extreme degrees based on natural history observations and multilocus analyses. Comprehensive geographic sampling of five large‐ranged Neotropical taxa revealed multiple distinct evolutionary lineages that span large geographic areas and, at times, distinct biomes. The inferred major clades and genetic clusters largely correspond to currently recognized taxa; however, we also found evidence of cryptic diversity within taxa. While previous phylogenetic studies revealed extensive mitonuclear discordance, our genetic clustering analyses uncovered several admixed individuals within major genetic groups. Accordingly, historical demographic analyses supported that the evolutionary history of these toads involved cross‐taxon gene flow both at ancient and recent times. Lastly, ABBA‐BABA tests revealed widespread allele sharing across species boundaries, a pattern that can be confidently attributed to genetic introgression as opposed to incomplete lineage sorting. These results confirm previous assertions that the evolutionary history of Rhinella was characterized by various levels of hybridization even across environmentally heterogeneous regions, posing exciting questions about what factors prevent complete fusion of diverging yet highly interdependent evolutionary lineages.

     
  5. Summary

    The tree of life is highly reticulate, with the history of population divergence emerging from populations of gene phylogenies that reflect histories of introgression, lineage sorting and divergence. In this study, we investigate global patterns of oak diversity and test the hypothesis that there are regions of the oak genome that are broadly informative about phylogeny.

    We utilize fossil data and restriction‐site associated DNA sequencing (RAD‐seq) for 632 individuals representing nearly 250 Quercus species to infer a time‐calibrated phylogeny of the world's oaks. We use a reversible‐jump Markov chain Monte Carlo method to reconstruct shifts in lineage diversification rates, accounting for among‐clade sampling biases. We then map the > 20 000 RAD‐seq loci back to an annotated oak genome and investigate genomic distribution of introgression and phylogenetic support across the phylogeny.

    Oak lineages have diversified among geographic regions, followed by ecological divergence within regions, in the Americas and Eurasia. Roughly 60% of oak diversity traces back to four clades that experienced increases in net diversification, probably in response to climatic transitions or ecological opportunity.

    The strong support for the phylogeny contrasts with high genomic heterogeneity in phylogenetic signal and introgression. Oaks are phylogenomic mosaics, and their diversity may in fact depend on the gene flow that shapes the oak genome.

     