Title: Epistasis Creates Invariant Sites and Modulates the Rate of Molecular Evolution
Abstract Invariant sites are a common feature of amino acid sequence evolution. The presence of invariant sites is frequently attributed to the need to preserve function through site-specific conservation of amino acid residues. Amino acid substitution models without a provision for invariant sites often fit the data significantly worse than those that allow for an excess of invariant sites beyond those predicted by models that only incorporate rate variation among sites (e.g., a Gamma distribution). An alternative is epistasis between sites to preserve residue interactions that can create invariant sites. Through computer-simulated sequence evolution, we evaluated the relative effects of site-specific preferences and site-site couplings in the generation of invariant sites and the modulation of the rate of molecular evolution. In an analysis of ten major families of protein domains with diverse sequence and functional properties, we find that the negative selection imposed by epistasis creates many more invariant sites than site-specific residue preferences alone. Further, epistasis plays an increasingly larger role in creating invariant sites over longer evolutionary periods. Epistasis also dictates rates of domain evolution over time by exerting significant additional purifying selection to preserve site couplings. These patterns illuminate the mechanistic role of epistasis in the processes underlying observed site invariance and evolutionary rates.
Award ID(s):
1934848
PAR ID:
10354816
Author(s) / Creator(s):
Editor(s):
Ozkan, Banu
Date Published:
Journal Name:
Molecular Biology and Evolution
Volume:
39
Issue:
5
ISSN:
0737-4038
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Protein sequence evolution in the presence of epistasis makes many previously acceptable amino acid residues at a site unfavorable over time. This phenomenon of entrenchment has also been observed with neutral substitutions using Potts Hamiltonian models. Here, we show that simulations using these models often evolve non-neutral proteins. We introduce a Neutral-with-Epistasis (N×E) model that incorporates purifying selection to conserve fitness, a requirement of neutral evolution. N×E protein evolution revealed a surprising lack of entrenchment, with site-specific amino-acid preferences remaining remarkably conserved, in biologically realistic time frames despite extensive residue coupling. Moreover, we found that the overdispersion of the molecular clock is caused by rate variation across sites introduced by epistasis in individual lineages, rather than by historical contingency. Therefore, substitutional entrenchment and rate contingency may indicate that adaptive and other non-neutral evolutionary processes were at play during protein evolution. 
  2. Abstract We explore sequence determinants of enzyme activity and specificity in a major enzyme family of terpene synthases. Most enzymes in this family catalyze reactions that produce cyclic terpenes—complex hydrocarbons widely used by plants and insects in diverse biological processes such as defense, communication, and symbiosis. To analyze the molecular mechanisms of emergence of terpene cyclization, we have carried out in-depth examination of mutational space around (E)-β-farnesene synthase, an Artemisia annua enzyme which catalyzes production of a linear hydrocarbon chain. Each mutant enzyme in our synthetic libraries was characterized biochemically, and the resulting reaction rate data were used as input to the Michaelis–Menten model of enzyme kinetics, in which free energies were represented as sums of one-amino-acid contributions and two-amino-acid couplings. Our model predicts measured reaction rates with high accuracy and yields free energy landscapes characterized by relatively few coupling terms. As a result, the Michaelis–Menten free energy landscapes have simple, interpretable structure and exhibit little epistasis. We have also developed biophysical fitness models based on the assumption that highly fit enzymes have evolved to maximize the output of correct products, such as cyclic products or a specific product of interest, while minimizing the output of byproducts. This approach results in nonlinear fitness landscapes that are considerably more epistatic. Overall, our experimental and computational framework provides focused characterization of evolutionary emergence of novel enzymatic functions in the context of microevolutionary exploration of sequence space around naturally occurring enzymes. 
  3. We introduce a model of amino acid sequence evolution that accounts for the statistical behavior of real sequences induced by epistatic interactions. We base the model dynamics on parameters derived from multiple sequence alignments analyzed by using direct coupling analysis methodology. Known statistical properties such as overdispersion, heterotachy, and gamma-distributed rate-across-sites are shown to be emergent properties of this model while being consistent with neutral evolution theory, thereby unifying observations from previously disjointed evolutionary models of sequences. The relationship between site restriction and heterotachy is characterized by tracking the effective alphabet dynamics of sites. We also observe an evolutionary Stokes shift in the fitness of sequences that have undergone evolution under our simulation. By analyzing the structural information of some proteins, we corroborate that the strongest Stokes shifts derive from sites that physically interact in networks near biochemically important regions. Perspectives on the implementation of our model in the context of the molecular clock are discussed.
  4. Most aspects of the molecular biology of cells involve tightly coordinated intermolecular interactions requiring specific recognition at the nucleotide and/or amino acid levels. This has led to long-standing interest in the degree to which constraints on interacting molecules result in conserved vs. accelerated rates of sequence evolution, with arguments commonly being made that molecular coevolution can proceed at rates exceeding the neutral expectation. Here, a fairly general model is introduced to evaluate the degree to which the rate of evolution at functionally interacting sites is influenced by effective population sizes (N_e), mutation rates, strength of selection, and the magnitude of recombination between sites. This theory is of particular relevance to matters associated with interactions between organelle- and nuclear-encoded proteins, as the two genomic environments often exhibit dramatic differences in the power of mutation and drift. Although genes within low-N_e environments can drive the rate of evolution of partner genes experiencing higher N_e, rates exceeding the neutral expectation require that the former also have an elevated mutation rate. Testable predictions, some counterintuitive, are presented on how patterns of coevolutionary rates should depend on the relative intensities of drift, selection, and mutation.
  5. Nearly neutral theory predicts that species with higher effective population size (N_e) are better at purging slightly deleterious mutations. We compare evolution in high-N_e vs. low-N_e vertebrates to reveal subtle selective preferences among amino acids. We take three complementary approaches. First, we fit non-stationary substitution models using maximum likelihood, comparing the high-N_e clade of rodents and lagomorphs to its low-N_e sister clade of primates and colugos. Second, we compared evolutionary outcomes across a wider range of vertebrates, via correlations between amino acid frequencies and N_e. Third, we dissected which amino acid substitutions occurred in human, chimpanzee, mouse, and rat, as scored by parsimony; this also enabled comparison to a historical paper. All methods agree on amino acid preference under more effective selection. Preferred amino acids are less costly to synthesize and use GC-rich codons, which are hard to maintain under AT-biased mutation. These factors explain 85% of the variance in amino acid preferences. Parsimony-induced bias in the historical study produces an apparent reduction in structural disorder, perhaps driven by slightly deleterious substitutions in rapidly evolving regions. Within highly exchangeable pairs of amino acids, arginine is strongly preferred over lysine, aspartate over glutamate, and valine over isoleucine, consistent with more effective selection preferring a marginally larger free energy of folding. Two of these preferences (K→R and I→V), but not a third (E→D), match differences between thermophiles and mesophilic relatives. These results reveal the biophysical consequences of mutation-selection-drift balance, and demonstrate the utility of nearly neutral theory for understanding protein evolution.