Title: Unifying Phylogenetic Birth–Death Models in Epidemiology and Macroevolution
Abstract Birth–death stochastic processes are the foundations of many phylogenetic models and are widely used to make inferences about epidemiological and macroevolutionary dynamics. There are a large number of birth–death model variants that have been developed; these impose different assumptions about the temporal dynamics of the parameters and about the sampling process. As each of these variants was individually derived, it has been difficult to understand the relationships between them as well as their precise biological and mathematical assumptions. Without a common mathematical foundation, deriving new models is nontrivial. Here, we unify these models into a single framework, prove that many previously developed epidemiological and macroevolutionary models are all special cases of a more general model, and illustrate the connections between these variants. This unification includes both models where the process is the same for all lineages and those in which it varies across types. We also outline a straightforward procedure for deriving likelihood functions for arbitrarily complex birth–death(-sampling) models that will hopefully allow researchers to explore a wider array of scenarios than was previously possible. By rederiving existing single-type birth–death sampling models, we clarify and synthesize the range of explicit and implicit assumptions made by these models. [Birth–death processes; epidemiology; macroevolution; phylogenetics; statistical inference.]
Award ID(s):
2028986
PAR ID:
10296290
Author(s) / Creator(s):
; ; ; ;
Editor(s):
Albert, James
Date Published:
Journal Name:
Systematic Biology
ISSN:
1063-5157
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Crandall, Keith (Ed.)
    Abstract Viral phylogenies provide crucial information on the spread of infectious diseases, and many studies fit mathematical models to phylogenetic data to estimate epidemiological parameters such as the effective reproduction ratio (Re) over time. Such phylodynamic inferences often complement or even substitute for conventional surveillance data, particularly when sampling is poor or delayed. It remains generally unknown, however, how robust phylodynamic epidemiological inferences are, especially when there is uncertainty regarding pathogen prevalence and sampling intensity. Here, we use recently developed mathematical techniques to fully characterize the information that can possibly be extracted from serially collected viral phylogenetic data, in the context of the commonly used birth-death-sampling model. We show that for any candidate epidemiological scenario, there exists a myriad of alternative, markedly different, and yet plausible “congruent” scenarios that cannot be distinguished using phylogenetic data alone, no matter how large the data set. In the absence of strong constraints or rate priors across the entire study period, neither maximum-likelihood fitting nor Bayesian inference can reliably reconstruct the true epidemiological dynamics from phylogenetic data alone; rather, estimators can only converge to the “congruence class” of the true dynamics. We propose concrete and feasible strategies for making more robust epidemiological inferences from viral phylogenetic data. 
  2. Abstract

    Gene flow is increasingly recognized as an important macroevolutionary process. The many mechanisms that contribute to gene flow (e.g. introgression, hybridization, lateral gene transfer) uniquely affect the diversification dynamics of species, making it important to be able to account for these idiosyncrasies when constructing phylogenetic models. Existing phylogenetic‐network simulators for macroevolution are limited in the ways they model gene flow.

    We present SiPhyNetwork, an R package for simulating phylogenetic networks under a birth–death‐hybridization process.

    Our package unifies the existing birth–death‐hybridization models while also extending the toolkit for modelling gene flow. This tool can create patterns of reticulation such as hybridization, lateral gene transfer, and introgression.

    Specifically, we model different reticulate events by allowing events to either add, remove or keep constant the number of lineages. Additionally, we allow reticulation events to be trait dependent, creating the ability to model the expanse of isolating mechanisms that prevent gene flow. This tool makes it possible for researchers to model many of the complex biological factors associated with gene flow in a phylogenetic context.

     
  3. In a striking result, Louca and Pennell [S. Louca, M. W. Pennell, Nature 580, 502–505 (2020)] recently proved that a large class of phylogenetic birth–death models is statistically unidentifiable from lineage-through-time (LTT) data: Any pair of sufficiently smooth birth and death rate functions is “congruent” to an infinite collection of other rate functions, all of which have the same likelihood for any LTT vector of any dimension. As Louca and Pennell argue, this fact has distressing implications for the thousands of studies that have utilized birth–death models to study evolution. In this paper, we qualify their finding by proving that an alternative and widely used class of birth–death models is indeed identifiable. Specifically, we show that piecewise constant birth–death models can, in principle, be consistently estimated and distinguished from one another, given a sufficiently large extant timetree and some knowledge of the present-day population. Subject to mild regularity conditions, we further show that any unidentifiable birth–death model class can be arbitrarily closely approximated by a class of identifiable models. The sampling requirements needed for our results to hold are explicit and are expected to be satisfied in many contexts such as the phylodynamic analysis of a global pandemic. 
  4. Abstract

    Biologists have long sought to quantify the number of species on Earth. Often missing from these efforts is the contribution of microorganisms, the smallest but most abundant form of life on the planet. Despite recent large‐scale sampling efforts, estimates of global microbial diversity span many orders of magnitude. It is important to consider how speciation and extinction over the last 4 billion years constrain inventories of biodiversity. We parameterized macroevolutionary models based on birth–death processes that assume constant and universal speciation and extinction rates. The models reveal that richness beyond 10^12 species is feasible and in agreement with empirical predictions. Additional simulations suggest that mass extinction events do not place hard limits on modern‐day microbial diversity. Together, our study provides independent support for a massive global‐scale microbiome while shedding light on the upper limits of life on Earth.

     
  5. Stochastic models that incorporate birth, death and immigration (also called birth–death and innovation models) are ubiquitous and applicable to many problems such as quantifying species sizes in ecological populations, describing gene family sizes, and modeling lymphocyte evolution in the body. Many of these applications involve the immigration of new species into the system. We consider the full high-dimensional stochastic process associated with multispecies birth–death–immigration and present a number of exact and asymptotic results at steady state. We further include random mutations or interactions through a carrying capacity and find the statistics of the total number of individuals, the total number of species, the species size distribution, and various diversity indices. Our results include a rigorous analysis of the behavior of these systems in the fast immigration limit which shows that of the different diversity indices, the species richness is best able to distinguish different types of birth–death–immigration models. We also find that detailed balance is preserved in the simple noninteracting birth–death–immigration model and the birth–death–immigration model with carrying capacity implemented through death. Surprisingly, when carrying capacity is implemented through the birth rate, detailed balance is violated.