Title: Predicting parallelism and quantifying divergence in experimental evolution
The degree to which the environment determines which genes contribute to adaptation is a fundamental question in microbial evolution. Microbial populations are often experimentally passaged in different environments and sequenced to identify candidates for adaptation in a particular environment. However, an appropriate statistical framework is still needed to identify genes that acquired more mutations in one environment than in the other (i.e., divergent evolution). Here we demonstrate how the evolutionary outcomes among replicate populations in the same environment, known as parallel evolution, can be leveraged to construct an intuitive statistical test for identifying the genes that contribute to divergent evolution. To accomplish this, we examined publicly available evolve-and-resequence datasets and found that the distribution of mutation counts among genes can be predicted using an ensemble of independent Poisson random variables. Building on this result, we propose that the degree of divergent evolution at a given gene between populations from two different environments can be modeled as the difference between two Poisson random variables, whose distribution is known as the Skellam distribution. We then propose and apply a statistical test to identify specific genes that contribute to divergent evolution.
IMPORTANCE: There is currently no framework that can be leveraged to identify genes that contribute to divergent evolution in microbial evolution experiments. To address this gap, we investigated the distribution of mutation counts among genes in order to identify an appropriate null model. Our observations suggest that divergent evolution within a given gene can be modeled as the difference in the total number of mutations observed between two environments. This quantity is described by a probability distribution known as the Skellam distribution, providing an appropriate statistical test for researchers seeking to identify the set of genes that contribute to divergent evolution in evolution experiments.
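As an illustration of the idea in the abstract, the difference between two independent Poisson counts can be evaluated numerically with a short sketch. This is not the authors' implementation: the function names, the truncation parameter `tail`, and the symmetric two-sided tail P(|K| >= |k_obs|) are assumptions made here for illustration, and the rates `mu1` and `mu2` stand in for whatever expected per-gene mutation counts a null model supplies. The symmetric tail is most natural when the two rates are equal.

```python
import math


def poisson_pmf(n: int, mu: float) -> float:
    """P(N = n) for N ~ Poisson(mu), evaluated in log space for stability."""
    if n < 0:
        return 0.0
    if mu == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def skellam_pmf(k: int, mu1: float, mu2: float, tail: int = 200) -> float:
    """P(N1 - N2 = k) for independent N1 ~ Poisson(mu1), N2 ~ Poisson(mu2).

    Sums P(N1 = n2 + k) * P(N2 = n2) over n2, truncated after `tail` terms
    (an illustrative cutoff that is adequate for small rates).
    """
    start = max(0, -k)  # keep n1 = n2 + k non-negative
    return sum(
        poisson_pmf(n2 + k, mu1) * poisson_pmf(n2, mu2)
        for n2 in range(start, start + tail)
    )


def skellam_two_sided_p(k_obs: int, mu1: float, mu2: float, tail: int = 200) -> float:
    """Assumed test statistic: P(|K| >= |k_obs|) with K ~ Skellam(mu1, mu2)."""
    a = abs(k_obs)
    # Probability mass strictly inside (-a, a); the remainder is the tail.
    inside = sum(skellam_pmf(k, mu1, mu2, tail) for k in range(-a + 1, a))
    return max(0.0, min(1.0, 1.0 - inside))


if __name__ == "__main__":
    # Hypothetical gene: 9 mutations in environment A, 1 in environment B,
    # with an assumed null expectation of 3 mutations in each environment.
    p = skellam_two_sided_p(9 - 1, mu1=3.0, mu2=3.0)
    print(f"two-sided p-value: {p:.3g}")
```

In practice each gene would get its own pair of rates, and the resulting p-values would need a multiple-testing correction across genes; both choices are left open here.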
Award ID(s):
1934554
PAR ID:
10313287
Journal Name:
bioRxiv
ISSN:
2692-8205
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Imperiale, Michael J. (Ed.)
    ABSTRACT The degree to which independent populations subjected to identical environmental conditions evolve in similar ways is a fundamental question in evolution. To address this question, microbial populations are often experimentally passaged in a given environment and sequenced to examine the tendency for similar mutations to arise repeatedly. However, an appropriate statistical framework is still needed to identify genes that acquired more mutations in one environment than in another (i.e., divergent evolution) and that therefore serve as genetic candidates for adaptation. Here, we develop a mathematical model to evaluate evolutionary outcomes among replicate populations in the same environment (i.e., parallel evolution), which can then be used to identify genes that contribute to divergent evolution. Applying this approach to data sets from evolve-and-resequence experiments, we found that the distribution of mutation counts among genes can be predicted as an ensemble of independent Poisson random variables with zero free parameters. Building on this result, we propose that the degree of divergent evolution at a given gene between populations from two different environments can be modeled as the difference between two Poisson random variables, whose distribution is known as the Skellam distribution. We then propose and apply a statistical test to identify specific genes that contribute to divergent evolution. By focusing on predicting patterns among replicate populations in a given environment, we are able to identify an appropriate test for divergence between environments that is grounded in first principles. IMPORTANCE There is currently no universally accepted framework for identifying genes that contribute to molecular divergence between microbial populations in different environments. To address this absence, we developed a null model to describe the distribution of mutation counts among genes. We find that divergent evolution within a given gene can be modeled as the absolute difference in the total number of mutations observed between two environments. This quantity is effectively captured by a probability distribution known as the Skellam distribution, providing an appropriate statistical test for researchers seeking to identify the set of genes that contribute to divergent evolution in microbial evolution experiments.
  2. Wittkopp, Patricia (Ed.)
    Abstract Populations of Escherichia coli selected in constant and fluctuating environments containing lactose often adapt by substituting mutations in the lacI repressor that cause constitutive expression of the lac operon. These mutations occur at a high rate and provide a significant benefit. Despite this, eight of 24 populations evolved for 8,000 generations in environments containing lactose contained no detectable repressor mutations. We report here on the basis of this observation. We find that, given relevant mutation rates, repressor mutations are expected to have fixed in all evolved populations if they had maintained the same fitness effect they confer when introduced to the ancestor. In fact, reconstruction experiments demonstrate that repressor mutations have become neutral or deleterious in those populations in which they were not detectable. Populations not fixing repressor mutations nevertheless reached the same fitness as those that did fix them, indicating that they followed an alternative evolutionary path that made redundant the potential benefit of the repressor mutation, but involved unique mutations of equivalent benefit. We identify a mutation occurring in the promoter region of the uspB gene as a candidate for influencing the selective choice between these paths. Our results detail an example of historical contingency leading to divergent evolutionary outcomes. 
  3. Gao, Beile (Ed.)
    ABSTRACT Escherichia coli can survive for long periods in batch culture in the laboratory, where it experiences a stressful and heterogeneous environment. During this incubation, E. coli acquires mutations that are selected in response to this environment, ultimately leading to evolved populations that are better adapted to these complex conditions. Studying these populations can lead to a better understanding of evolutionary mechanisms. Mutations in regulatory genes often play a role in adapting to heterogeneous environments. To identify such mutations, we examined transcriptional differences during log phase growth in unaged cells compared to those that had been aged for 10 days and regrown. We identified expression changes in genes involved in motility and chemotaxis after adaptation to long-term cultures. We hypothesized that aged populations would also have phenotypic changes in motility and that motility may play a role in survival and adaptation to long-term cultures. While aged populations did show an increase in motility, this increase was not essential for survival in long-term cultures. We identified mutations in the regulatory gene sspA and other genes that may contribute to the observed differences in motility. Taken together, these data provide an overall picture of the role of mutations in regulatory genes for adaptation while underscoring that not all changes that occur during evolution in stressful environments are necessarily adaptive. IMPORTANCE Understanding how bacteria adapt in long-term cultures both aids the development of better treatment options for bacterial infections and gives insight into the mechanisms involved in bacterial evolution. In the past, it has been difficult to study these organisms in their natural environments. By using experimental evolution in heterogeneous and stressful laboratory conditions, we can more closely mimic natural environments and examine evolutionary mechanisms. One way to observe these mechanisms is to look at transcriptomic and genomic data from cells adapted to these complex conditions. Here, we found that although aged cells increase motility, this increase is not essential for survival in these conditions. These data emphasize that not all changes that occur due to evolutionary processes are adaptive, but these observations could still lead to hypotheses about the causative mutations. The information gained here allows us to make inferences about general mechanisms underlying phenotypic changes due to evolution.
  4. Abstract Recurrent mutation produces multiple copies of the same allele which may be co-segregating in a population. Yet, most analyses of allele-frequency or site-frequency spectra assume that all observed copies of an allele trace back to a single mutation. We develop a sampling theory for the number of latent mutations in the ancestry of a rare variant, specifically a variant observed in relatively small count in a large sample. Our results follow from the statistical independence of low-count mutations, which we show to hold for the standard neutral coalescent or diffusion model of population genetics as well as for more general coalescent trees. For populations of constant size, these counts are distributed like the number of alleles in the Ewens sampling formula. We develop a Poisson sampling model for populations of varying size and illustrate it using new results for site-frequency spectra in an exponentially growing population. We apply our model to a large data set of human SNPs and use it to explain dramatic differences in site-frequency spectra across the range of mutation rates in the human genome. 
  5. Gaut, Brandon (Ed.)
    Abstract How microbes adapt to a novel environment is a central question in evolutionary biology. Although adaptive evolution must be fueled by beneficial mutations, whether higher mutation rates increase the rate of adaptive evolution remains unclear. To address this question, we cultured Escherichia coli hypermutating populations, in which a defective methyl-directed mismatch repair pathway causes a 140-fold increase in single-nucleotide mutation rates. In parallel with wild-type E. coli, populations were cultured in tubes containing Luria-Bertani broth, a complex medium known to promote the evolution of subpopulation structure. After 900 days of evolution, in three transfer schemes with different population-size bottlenecks, hypermutators always exhibited levels of improved fitness similar to those of controls. Fluctuation tests revealed that the mutation rates of hypermutator lines converged evolutionarily on those of wild-type populations, which may have contributed to the absence of fitness differences. Further genome-sequence analysis revealed that, although hypermutator populations have higher rates of genomic evolution, this largely reflects strong genetic linkage. Despite these linkage effects, the evolved populations exhibit parallelism in fixed mutations, including those potentially related to biofilm formation, transcription regulation, and mutation-rate evolution. Together, these results are generally inconsistent with a hypothesized positive relationship between the mutation rate and the speed of adaptive evolution, and provide insight into how clonal adaptation occurs in novel environments.