Title: Diversification Models Conflate Likelihood and Prior, and Cannot be Compared Using Conventional Model-Comparison Tools
Abstract Time-calibrated phylogenetic trees are a tremendously powerful tool for studying evolutionary, ecological, and epidemiological phenomena. Such trees are predominantly inferred in a Bayesian framework, with the phylogeny itself treated as a parameter with a prior distribution (a “tree prior”). However, we show that the tree “parameter” consists, in part, of data, in the form of taxon samples. Treating the tree as a parameter fails to account for these data and compromises our ability to compare among models using standard techniques (e.g., marginal likelihoods estimated using path-sampling and stepping-stone sampling algorithms). Since accuracy of the inferred phylogeny strongly depends on how well the tree prior approximates the true diversification process that gave rise to the tree, the inability to accurately compare competing tree priors has broad implications for applications based on time-calibrated trees. We outline potential remedies to this problem, and provide guidance for researchers interested in assessing the fit of tree models. [Bayes factors; Bayesian model comparison; birth-death models; divergence-time estimation; lineage diversification]
Award ID(s):
1754705
PAR ID:
10407442
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Systematic Biology
ISSN:
1063-5157
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Premise: Recent advances in generating large‐scale phylogenies enable broad‐scale estimation of species diversification. These now common approaches typically are characterized by (1) incomplete species coverage without explicit sampling methodologies and/or (2) sparse backbone representation, and usually rely on presumed phylogenetic placements to account for species without molecular data. We used empirical examples to examine the effects of incomplete sampling on diversification estimation and provide constructive suggestions to ecologists and evolutionary biologists based on those results. Methods: We used a supermatrix for rosids and one well‐sampled subclade (Cucurbitaceae) as empirical case studies. We compared results using these large phylogenies with those based on a previously inferred, smaller supermatrix and on a synthetic tree resource with complete taxonomic coverage. Finally, we simulated random and representative taxon sampling and explored the impact of sampling on three commonly used methods, both parametric (RPANDA and BAMM) and semiparametric (DR). Results: We found that the impact of sampling on diversification estimates was idiosyncratic and often strong. Compared to full empirical sampling, representative and random sampling schemes either depressed or inflated speciation rates, depending on methods and sampling schemes. No method was entirely robust to poor sampling, but BAMM was least sensitive to moderate levels of missing taxa. Conclusions: We suggest caution against uncritical modeling of missing taxa using taxonomic data for poorly sampled trees and in the use of summary backbone trees and other data sets with high representative bias, and we stress the importance of explicit sampling methodologies in macroevolutionary studies.
  2. Abstract Hillieae is a group of ∼30 florally diverse, Neotropical epiphyte species. Species richness peaks in southern Central America and taxa display bat, hawkmoth, or hummingbird pollination syndromes. A phylogenetic framework is needed to understand floral and biogeographic evolution. We used target enrichment data to infer a species tree and a Bayesian time-calibrated tree including ∼83% of the species in the group. We inferred ancestral biogeography and pollination syndromes, described species’ realized bioclimatic niches via a principal component analysis, and estimated significant niche shifts using Ornstein–Uhlenbeck models to understand how different abiotic and biotic variables have shaped Hillieae evolution. We estimated that Hillieae originated in southern Central America 19 Ma and that hawkmoth pollination is the ancestral character state. Multiple independent shifts in pollination syndrome, biogeographic distribution, and realized bioclimatic niche have occurred, though bioclimatic niche is largely conserved. Using generalized linear models, we identify two interactions—between species’ biogeographic ranges and pollination syndromes, and between phylogenetic covariance and pollination syndromes—that additively affect the degree of bioclimatic niche overlap between species. Regional variation in pollination syndrome diversity and patterns of species bioclimatic niche overlap indicate a link between biogeography and species ecology in driving Hillieae diversification and syndrome evolution. 
  3. Analyses of evolutionary dynamics depend on how phylogenetic data are time-scaled. Most analyses of extant taxa assume a purely bifurcating model, where nodes are calibrated using the daughter lineage with the older first occurrence in the fossil record. This contrasts with budding, where nodes are calibrated using the younger first occurrence. Here, we use the extensive fossil record of bivalve molluscs for a large-scale evaluation of how branching models affect macroevolutionary analyses. We time-calibrated 91% of nodes, ranging in age from 2.59 to 485 Ma, in a phylogeny of 97 extant bivalve families. Allowing budding-based calibrations minimizes conflict between the tree and observed fossil record, and reduces the summed duration of inferred ‘ghost lineages’ from 6.76 billion years (Gyr; bifurcating model) to 1.00 Gyr (budding). Adding 31 extinct paraphyletic families raises ghost lineage totals to 7.86 Gyr (bifurcating) and 1.92 Gyr (budding), but incorporates more information to date divergences between lineages. Macroevolutionary analyses under a bifurcating model conflict with other palaeontological evidence on the magnitude of the end-Palaeozoic extinction, and strongly reduce Cenozoic diversification. Consideration of different branching models is essential when node-calibrating phylogenies, and for a major clade with a robust fossil record, a budding model appears more appropriate. 
  4. Abstract Motivation: Precise time calibrations needed to estimate ages of species divergence are not always available due to fossil records' incompleteness. Consequently, clock calibrations available for Bayesian dating analyses can be few and diffused, i.e. phylogenies are calibration-poor, impeding reliable inference of the timetree of life. We examined the role of speciation birth–death (BD) tree prior on Bayesian node age estimates in calibration-poor phylogenies and tested the usefulness of an informative, data-driven tree prior to enhancing the accuracy and precision of estimated times. Results: We present a simple method to estimate parameters of the BD tree prior from the molecular phylogeny for use in Bayesian dating analyses. The use of a data-driven birth–death (ddBD) tree prior leads to improvement in Bayesian node age estimates for calibration-poor phylogenies. We show that the ddBD tree prior, along with only a few well-constrained calibrations, can produce excellent node ages and credibility intervals, whereas the use of an uninformative, uniform (flat) tree prior may require more calibrations. Relaxed clock dating with ddBD tree prior also produced better results than a flat tree prior when using diffused node calibrations. We also suggest using ddBD tree priors to improve the detection of outliers and influential calibrations in cross-validation analyses. These results have practical applications because the ddBD tree prior reduces the number of well-constrained calibrations necessary to obtain reliable node age estimates. This would help address key impediments in building the grand timetree of life, revealing the process of speciation and elucidating the dynamics of biological diversification. Availability and implementation: An R module for computing the ddBD tree prior, simulated datasets and empirical datasets are available at https://github.com/cathyqqtao/ddBD-tree-prior.
  5. Abstract Genomic data continue to advance our understanding of species limits and biogeographic patterns. However, there is still no consensus regarding appropriate methods of phylogenomic analysis that make the best use of these heterogeneous data sets. In this study, we used thousands of ultraconserved element (UCE) loci from alligator lizards in the genus Gerrhonotus to compare and contrast species trees inferred using multiple contemporary methods and provide a time frame for biological diversification across the Mexican Transition Zone (MTZ). Concatenated maximum likelihood (ML) and Bayesian analyses provided highly congruent results, with differences limited to poorly supported nodes. Similar topologies were inferred from coalescent analyses in Bayesian Phylogenetics and Phylogeography and SVDquartets, albeit with lower support for some nodes. All divergence times fell within the Miocene, linking speciation to local Neogene vicariance and/or global cooling trends following the mid-Miocene Climatic Optimum. We detected a high level of genomic divergence for a morphologically distinct species restricted to the arid mountains of north-eastern Mexico, and erected a new genus to better reflect evolutionary history. In summary, our results further advocate leveraging the strengths and weaknesses of concatenation and coalescent methods, provide evidence for old divergences for alligator lizards, and indicate that the MTZ continues to harbour substantial unrecognized diversity. 