Title: Neural networks enable efficient and accurate simulation-based inference of evolutionary parameters from adaptation dynamics
The rate of adaptive evolution depends on the rate at which beneficial mutations are introduced into a population and the fitness effects of those mutations. The rate of beneficial mutations and their expected fitness effects are often difficult to quantify empirically. As these 2 parameters determine the pace of evolutionary change in a population, the dynamics of adaptive evolution may enable inference of their values. Copy number variants (CNVs) are a pervasive source of heritable variation that can facilitate rapid adaptive evolution. Previously, we developed a locus-specific fluorescent CNV reporter to quantify CNV dynamics in evolving populations maintained in nutrient-limiting conditions using chemostats. Here, we use CNV adaptation dynamics to estimate the rate at which beneficial CNVs are introduced through de novo mutation and their fitness effects using simulation-based, likelihood-free inference approaches. We tested the suitability of 2 evolutionary models: a standard Wright–Fisher model and a chemostat model. We evaluated 2 likelihood-free inference algorithms: the well-established Approximate Bayesian Computation with Sequential Monte Carlo (ABC-SMC) algorithm, and the recently developed Neural Posterior Estimation (NPE) algorithm, which applies an artificial neural network to directly estimate the posterior distribution. By systematically evaluating the suitability of different inference methods and models, we show that NPE has several advantages over ABC-SMC and that a Wright–Fisher evolutionary model suffices in most cases. Using our validated inference framework, we estimate the CNV formation rate at the GAP1 locus in the yeast Saccharomyces cerevisiae to be 10^-4.7 to 10^-4 CNVs per cell division and the fitness coefficient of GAP1 CNVs in glutamine-limited chemostats to be 0.04 to 0.1 per generation.
We experimentally validated our inference-based estimates using 2 distinct experimental methods (barcode lineage tracking and pairwise fitness assays), which provide independent confirmation of the accuracy of our approach. Our results are consistent with a beneficial CNV supply rate that is 10-fold greater than the estimated rates of beneficial single-nucleotide mutations, explaining the outsized importance of CNVs in rapid adaptive evolution. More generally, our study demonstrates the utility of novel neural network-based, likelihood-free inference methods for inferring the rates and effects of evolutionary processes from empirical data, with possible applications ranging from tumor to viral evolution.
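To illustrate how a CNV formation rate and a fitness coefficient jointly shape adaptation dynamics, the sketch below iterates the deterministic (infinite-population) limit of a two-genotype Wright–Fisher model with one-way mutation and selection. It is a simplification made for illustration and is not the simulator used in the study: it omits genetic drift, competing beneficial mutations, and chemostat details. The function name, parameter names, and the example values (roughly the midpoints of the ranges quoted above) are assumptions.

```python
def cnv_frequency_trajectory(formation_rate, selection_coeff, generations,
                             initial_freq=0.0):
    """Return the expected CNV frequency at generations 0..generations.

    Assumed toy model (not the study's simulator): each generation a
    fraction `formation_rate` of non-CNV cells gains a CNV, then CNV
    cells reproduce with relative fitness 1 + selection_coeff.
    """
    x = initial_freq
    freqs = [x]
    for _ in range(generations):
        # Mutation step: one-way conversion of non-CNV cells to CNV cells.
        x_mut = x + formation_rate * (1.0 - x)
        # Selection step: reweight by relative fitness and renormalize.
        mean_fitness = 1.0 + selection_coeff * x_mut
        x = x_mut * (1.0 + selection_coeff) / mean_fitness
        freqs.append(x)
    return freqs


if __name__ == "__main__":
    # Illustrative values only, chosen inside the ranges quoted in the abstract.
    trajectory = cnv_frequency_trajectory(10 ** -4.35, 0.07, 250)
    for generation in (0, 50, 100, 150, 200, 250):
        print(generation, round(trajectory[generation], 4))
```

In a simulation-based inference setting, a stochastic version of such a simulator would be run many times with parameters drawn from a prior, and the simulated trajectories compared with (ABC-SMC) or used to train a density estimator on (NPE) the observed CNV frequencies.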
Award ID(s):
1818234
PAR ID:
10381021
Editor(s):
de Visser, J. Arjan
Journal Name:
PLOS Biology
Volume:
20
Issue:
5
ISSN:
1545-7885
Page Range / eLocation ID:
e3001633
Sponsoring Org:
National Science Foundation
More Like this
  1. Copy number variants (CNVs) are regions of the genome that vary in integer copy number. CNVs, which comprise both amplifications and deletions of DNA sequence, have been identified across all domains of life, from bacteria and archaea to plants and animals. CNVs are an important source of genetic diversity, and can drive rapid adaptive evolution and progression of heritable and somatic human diseases, such as cancer. However, despite their evolutionary importance and clinical relevance, CNVs remain understudied compared to single-nucleotide variants (SNVs). This is a consequence of the inherent difficulties in detecting CNVs at low-to-intermediate frequencies in heterogeneous populations of cells. Here, we discuss molecular methods used to detect CNVs, the limitations associated with using these techniques, and the application of new and emerging technologies that present solutions to these challenges. The goal of this short review and perspective is to highlight aspects of CNV biology that are understudied and define avenues for further research that address specific gaps in our knowledge of these complex alleles. We describe our recently developed method for CNV detection in which a fluorescent gene functions as a single-cell CNV reporter and present key findings from our evolution experiments in Saccharomyces cerevisiae. Using a CNV reporter, we found that CNVs are generated at a high rate and undergo selection with predictable dynamics across independently evolving replicate populations. Many CNVs appear to be generated through DNA replication-based processes that are mediated by the presence of short, interrupted, inverted-repeat sequences. Our results have important implications for the role of CNVs in evolutionary processes and the molecular mechanisms that underlie CNV formation. We discuss the possible extension of our method to other applications, including tracking the dynamics of CNVs in models of human tumors. 
  2. Germline copy number variants (CNVs) and single-nucleotide polymorphisms (SNPs) form the basis of inter-individual genetic variation. Although the phenotypic effects of SNPs have been extensively investigated, the effects of CNVs are less well understood. To better characterize mechanisms by which CNVs affect cellular phenotype, we tested their association with variable CpG methylation in a genome-wide manner. Using paired CNV and methylation data from the 1000 Genomes and HapMap projects, we identified genome-wide associations by methylation quantitative trait locus (mQTL) analysis. We found individual CNVs associated with methylation of multiple CpGs and vice versa. CNV-associated methylation changes were correlated with gene expression. CNV-mQTLs were enriched for regulatory regions and transcription factor-binding sites (TFBSs), and were involved in long-range physical interactions with associated CpGs. Some CNV-mQTLs were associated with methylation of imprinted genes. Several CNV-mQTLs and/or associated genes were among those previously reported by genome-wide association studies (GWASs). We demonstrate that germline CNVs in the genome are associated with CpG methylation. Our findings suggest that structural variation together with methylation may affect cellular phenotype.
  3. The mutation rate and mutations' effects on fitness are crucial to evolution. Mutation rates are under selection due to linkage between mutation rate modifiers and mutations' effects on fitness. The linkage between a higher mutation rate and more beneficial mutations selects for higher mutation rates, while the linkage between a higher mutation rate and more deleterious mutations selects for lower mutation rates. The net direction of selection on mutation rates depends on the fitness landscape, and a great deal of work has elucidated the fitness landscapes of mutations. However, tests of the effect of varying a mutation rate on evolution in a single organism in a single environment have been difficult. This has been studied using strains of antimutators and mutators, but these strains may differ in additional ways and typically do not allow for continuous variation of the mutation rate. To help investigate the effects of the mutation rate on evolution, we have genetically engineered a strain of E. coli with a point mutation rate that can be smoothly varied over two orders of magnitude. We did this by engineering a strain with inducible control of the mismatch repair proteins MutH and MutL. We used this strain in an approximately 350-generation evolution experiment with controlled variation of the mutation rate. We confirmed that the construct and the mutation rate were stable over this time. Sequencing evolved strains revealed a higher number of single-nucleotide polymorphisms at higher mutation rates, likely due to either the beneficial effects of these mutations or their linkage to beneficial mutations.
  4. Gaut, Brandon (Ed.)
    How microbes adapt to a novel environment is a central question in evolutionary biology. Although adaptive evolution must be fueled by beneficial mutations, whether higher mutation rates increase the rate of adaptive evolution remains unclear. To address this question, we cultured Escherichia coli hypermutating populations, in which a defective methyl-directed mismatch repair pathway causes a 140-fold increase in single-nucleotide mutation rates. In parallel with wild-type E. coli, populations were cultured in tubes containing Luria-Bertani broth, a complex medium known to promote the evolution of subpopulation structure. After 900 days of evolution, in three transfer schemes with different population-size bottlenecks, hypermutators always exhibited levels of improved fitness similar to those of controls. Fluctuation tests revealed that the mutation rates of hypermutator lines converged evolutionarily on those of wild-type populations, which may have contributed to the absence of fitness differences. Further genome-sequence analysis revealed that, although hypermutator populations have higher rates of genomic evolution, this largely reflects strong genetic linkage. Despite these linkage effects, the evolved populations exhibit parallelism in fixed mutations, including those potentially related to biofilm formation, transcription regulation, and mutation-rate evolution. Together, these results are generally inconsistent with a hypothesized positive relationship between the mutation rate and the adaptive speed of evolution, and provide insight into how clonal adaptation occurs in novel environments.
  5. Significance: The role of mutations of large effect in adaptive evolution is a question of enduring interest. Large-effect mutations were once seen as unlikely contributors to adaptation, but we now have numerous examples. A major shortcoming of the evidence is the lack of information on fitness effects of mutations. We conducted a quantitative trait locus study that mapped fitness in an experimental field population of stickleback to a large-effect gene, Ectodysplasin (Eda). We compared this result with allele frequency change at the gene in a young lake population, which also revealed strong natural selection and large fitness effects of the Eda gene and/or linked genes. Selection on ancient genetic variants may increase the prevalence of large-effect fitness variants in adaptive evolution.