Title: The Influence of Higher-Order Epistasis on Biological Fitness Landscape Topography
The effect of a mutation on the organism often depends on what other mutations are already present in its genome. Geneticists refer to such mutational interactions as epistasis. Pairwise epistatic effects have been recognized for over a century, and their evolutionary implications have received theoretical attention for nearly as long. However, pairwise epistatic interactions themselves can vary with genomic background. This is called higher-order epistasis, and its consequences for evolution are much less well understood. Here, we assess the influence that higher-order epistasis has on the topography of 16 published, biological fitness landscapes. We find that, on average, the influence of epistasis on fitness landscape topography declines with order, and we suggest that notable exceptions to this trend may deserve experimental scrutiny. We conclude by highlighting opportunities for further theoretical and experimental work dissecting the influence that epistasis of all orders has on fitness landscape topography and on the efficiency of evolution by natural selection.
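As an illustration of what "epistasis by order" can mean in practice, the sketch below decomposes a complete biallelic fitness landscape into Walsh coefficients and averages their magnitudes by interaction order. This is one common decomposition and is assumed here for illustration only; the abstract does not state which method the authors used. The function names (`walsh_coefficients`, `mean_abs_by_order`) and the three-locus toy landscape with its effect sizes are hypothetical and do not come from any of the 16 published landscapes.

```python
from itertools import product


def walsh_coefficients(fitness, n_loci):
    """Walsh coefficients of a complete biallelic landscape (illustrative sketch).

    fitness maps every genotype (a tuple of 0/1 of length n_loci) to a number.
    Returns a dict keyed by locus subset (also a 0/1 tuple); the number of 1s
    in the key is the interaction order of that coefficient.
    """
    genotypes = list(product((0, 1), repeat=n_loci))
    coeffs = {}
    for subset in genotypes:
        total = 0.0
        for g in genotypes:
            # The sign flips once for each mutated locus that lies in the subset.
            shared = sum(s & a for s, a in zip(subset, g))
            total += (-1 if shared % 2 else 1) * fitness[g]
        coeffs[subset] = total / len(genotypes)
    return coeffs


def mean_abs_by_order(coeffs):
    """Average absolute coefficient at each interaction order."""
    by_order = {}
    for subset, value in coeffs.items():
        by_order.setdefault(sum(subset), []).append(abs(value))
    return {order: sum(v) / len(v) for order, v in sorted(by_order.items())}


# Hypothetical toy landscapes on three loci (made-up effect sizes).
genotypes = list(product((0, 1), repeat=3))
additive = {g: 1.0 + 0.1 * g[0] + 0.2 * g[1] + 0.3 * g[2] for g in genotypes}
# Same landscape plus an extra effect present only in the triple mutant.
with_third_order = {g: additive[g] + 0.4 * g[0] * g[1] * g[2] for g in genotypes}

additive_coeffs = walsh_coefficients(additive, 3)
epistatic_coeffs = walsh_coefficients(with_third_order, 3)
print(mean_abs_by_order(additive_coeffs))
print(mean_abs_by_order(epistatic_coeffs))
```

On the purely additive toy landscape, every coefficient of order two or higher vanishes up to floating-point error, while the second landscape acquires a nonzero third-order term. Comparing the per-order averages across real landscapes is one way to ask whether influence declines with order.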
Award ID(s):
1736253
NSF-PAR ID:
10056404
Journal Name:
Journal of Statistical Physics
ISSN:
1572-9613
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Gallicchio, Emilio (Ed.)
    The rapid evolution of HIV is constrained by interactions between mutations which affect viral fitness. In this work, we explore the role of epistasis in determining the mutational fitness landscape of HIV for multiple drug target proteins, including Protease, Reverse Transcriptase, and Integrase. Epistatic interactions between residues modulate the mutation patterns involved in drug resistance, with unambiguous signatures of epistasis best seen in the comparison of the Potts model predicted and experimental HIV sequence “prevalences” expressed as higher-order marginals (beyond triplets) of the sequence probability distribution. In contrast, experimental measures of fitness such as viral replicative capacities generally probe fitness effects of point mutations in a single background, providing weak evidence for epistasis in viral systems. The detectable effects of epistasis are obscured by higher evolutionary conservation at sites. While double mutant cycles, in principle, provide one of the best ways to probe epistatic interactions experimentally without reference to a particular background, we show that the analysis is complicated by the small dynamic range of measurements. Overall, we show that global pairwise interaction Potts models are necessary for predicting the mutational landscape of viral proteins.
  2. A longstanding goal of biology is to identify the key genes and species that critically impact evolution, ecology, and health. Network analysis has revealed keystone species that regulate ecosystems and master regulators that regulate cellular genetic networks. Yet these studies have focused on pairwise biological interactions, which can be affected by the context of genetic background and other species present, generating higher-order interactions. The important regulators of higher-order interactions are unstudied. To address this, we applied a high-dimensional geometry approach that quantifies epistasis in a fitness landscape to ask how individual genes and species influence the interactions in the rest of the biological network. We then generated and also reanalyzed 5-dimensional datasets (two genetic, two microbiome). We identified key genes (e.g., the rbs locus and pykF) and species (e.g., Lactobacilli) that control the interactions of many other genes and species. These higher-order master regulators can induce or suppress evolutionary and ecological diversification by controlling the topography of the fitness landscape. Thus, we provide a method and mathematical justification for exploration of biological networks in higher dimensions.
  3. A fitness landscape is a map between the genotype and its reproductive success in a given environment. The topography of fitness landscapes largely governs adaptive dynamics, constraining evolutionary trajectories and the predictability of evolution. Theory suggests that this topography can be deformed by mutations that produce substantial changes to the environment. Despite its importance, the deformability of fitness landscapes has not been systematically studied beyond abstract models, and little is known about its reach and consequences in empirical systems. Here we have systematically characterized the deformability of the genome-wide metabolic fitness landscape of the bacterium Escherichia coli. Deformability is quantified by the noncommutativity of epistatic interactions, which we experimentally demonstrate in mutant strains on the path to an evolutionary innovation. Our analysis shows that the deformation of fitness landscapes by metabolic mutations rarely affects evolutionary trajectories in the short range. However, mutations with large environmental effects produce long-range landscape deformations in distant regions of the genotype space that affect the fitness of later descendants. Our results therefore suggest that, even in situations in which mutations have strong environmental effects, fitness landscapes may retain their power to forecast evolution over small mutational distances despite the potential attenuation of that power over longer evolutionary trajectories. Our methods and results provide an avenue for integrating adaptive and eco-evolutionary dynamics with complex genetics and genomics.
  4. Interactions between mutations (epistasis) can add substantial complexity to genotype-phenotype maps, hampering our ability to predict evolution. Yet, recent studies have shown that the fitness effect of a mutation can often be predicted from the fitness of its genetic background using simple, linear relationships. This phenomenon, termed global epistasis, has been leveraged to reconstruct fitness landscapes and infer adaptive trajectories in a wide variety of contexts. However, little attention has been paid to how patterns of global epistasis may be affected by environmental variation, despite this variation frequently being a major driver of evolution. This is particularly relevant for the evolution of drug resistance, where antimicrobial drugs may change the environment faced by pathogens and shape their adaptive trajectories in ways that can be difficult to predict. By analyzing a fitness landscape of four mutations in a gene encoding an essential enzyme of P. falciparum (a parasite that causes malaria), here we show that patterns of global epistasis can be strongly modulated by the concentration of a drug in the environment. Expanding on previous theoretical results, we demonstrate that this modulation can be quantitatively explained by how specific gene-by-gene interactions are modified by drug dose. Importantly, our results highlight the need to incorporate potential environmental variation into the global epistasis framework in order to predict adaptation in dynamic environments.
  5. Ribozymes are RNA molecules that catalyze biochemical reactions. Self-cleaving ribozymes are a common naturally occurring class of ribozymes that catalyze site-specific cleavage of their own phosphodiester backbone. In addition to their natural functions, self-cleaving ribozymes have been used to engineer control of gene expression because they can be designed to alter RNA processing and stability. However, the rational design of ribozyme activity remains challenging, and many ribozyme-based systems are engineered or improved by random mutagenesis and selection (in vitro evolution). Improving a ribozyme-based system often requires several mutations to achieve the desired function, but extensive pairwise and higher-order epistasis prevent a simple prediction of the effect of multiple mutations that is needed for rational design. Recently, high-throughput sequencing-based approaches have produced data sets on the effects of numerous mutations in different ribozymes (RNA fitness landscapes). Here we used such high-throughput experimental data from variants of the CPEB3 self-cleaving ribozyme to train a predictive model through machine learning approaches. We trained models using either a random forest or long short-term memory (LSTM) recurrent neural network approach. We found that models trained on a comprehensive set of pairwise mutant data could predict active sequences at higher mutational distances, but the correlation between predicted and experimentally observed self-cleavage activity decreased with increasing mutational distance. Adding sequences with increasingly higher numbers of mutations to the training data improved the correlation at increasing mutational distances. Systematically reducing the size of the training data set suggests that a wide distribution of ribozyme activity may be the key to accurate predictions. Because the model predictions are based only on sequence and activity data, the results demonstrate that this machine learning approach allows readily obtainable experimental data to be used for RNA design efforts even for RNA molecules with unknown structures. The accurate prediction of RNA functions will enable a more comprehensive understanding of RNA fitness landscapes for studying evolution and for guiding RNA-based engineering efforts.