Abstract Protein sequence evolution in the presence of epistasis makes many previously acceptable amino acid residues at a site unfavorable over time. This phenomenon of entrenchment has also been observed with neutral substitutions using Potts Hamiltonian models. Here, we show that simulations using these models often evolve non-neutral proteins. We introduce a Neutral-with-Epistasis (N×E) model that incorporates purifying selection to conserve fitness, a requirement of neutral evolution. N×E protein evolution revealed a surprising lack of entrenchment, with site-specific amino acid preferences remaining remarkably conserved in biologically realistic time frames despite extensive residue coupling. Moreover, we found that the overdispersion of the molecular clock is caused by rate variation across sites introduced by epistasis in individual lineages, rather than by historical contingency. Therefore, substitutional entrenchment and rate contingency may indicate that adaptive and other non-neutral evolutionary processes were at play during protein evolution.
The effectiveness of selection in a species affects the direction of amino acid frequency evolution
Nearly neutral theory predicts that species with higher effective population size (N_e) are better at purging slightly deleterious mutations. We compare evolution in high-N_e vs. low-N_e vertebrates to reveal subtle selective preferences among amino acids. We take three complementary approaches. First, we fit non-stationary substitution models using maximum likelihood, comparing the high-N_e clade of rodents and lagomorphs to its low-N_e sister clade of primates and colugos. Second, we compared evolutionary outcomes across a wider range of vertebrates, via correlations between amino acid frequencies and N_e. Third, we dissected which amino acid substitutions occurred in human, chimpanzee, mouse, and rat, as scored by parsimony; this also enabled comparison to a historical paper. All methods agree on amino acid preference under more effective selection. Preferred amino acids are less costly to synthesize and use GC-rich codons, which are hard to maintain under AT-biased mutation. These factors explain 85% of the variance in amino acid preferences. Parsimony-induced bias in the historical study produces an apparent reduction in structural disorder, perhaps driven by slightly deleterious substitutions in rapidly evolving regions. Within highly exchangeable pairs of amino acids, arginine is strongly preferred over lysine, aspartate over glutamate, and valine over isoleucine, consistent with more effective selection preferring a marginally larger free energy of folding. Two of these preferences (K→R and I→V), but not a third (E→D), match differences between thermophiles and mesophilic relatives. These results reveal the biophysical consequences of mutation-selection-drift balance, and demonstrate the utility of nearly neutral theory for understanding protein evolution.
- Award ID(s): 2333243
- PAR ID: 10621344
- Publisher / Repository: bioRxiv
- Date Published:
- Format(s): Medium: X
- Institution: bioRxiv
- Sponsoring Org: National Science Foundation
More Like this
Johnson, Patricia J (Ed.) ABSTRACT Analyses of codon usage in eukaryotes suggest that amino acid usage responds to GC pressure so AT-biased substitutions drive higher usage of amino acids with AT-ending codons. Here, we combine single-cell transcriptomics and phylogenomics to explore codon usage patterns in foraminifera, a diverse and ancient clade of predominantly uncultivable microeukaryotes. We curate data from 1,044 gene families in 49 individuals representing 28 genera, generating perhaps the largest existing dataset of data from a predominantly uncultivable clade of protists, to analyze compositional bias and codon usage. We find extreme variation in composition, with a median GC content at fourfold degenerate silent sites below 3% in some species and above 75% in others. The most AT-biased species are distributed among diverse non-monophyletic lineages. Surprisingly, despite the extreme variation in compositional bias, amino acid usage is highly conserved across all foraminifera. By analyzing nucleotide, codon, and amino acid composition within this diverse clade of amoeboid eukaryotes, we expand our knowledge of patterns of genome evolution across the eukaryotic tree of life.
IMPORTANCE Patterns of molecular evolution in protein-coding genes reflect trade-offs between substitution biases and selection on both codon and amino acid usage. Most analyses of these factors in microbial eukaryotes focus on model species such as Acanthamoeba, Plasmodium, and yeast, where substitution bias is a primary contributor to patterns of amino acid usage. Foraminifera, an ancient clade of single-celled eukaryotes, present a conundrum, as we find highly conserved amino acid usage underlain by divergent nucleotide composition, including extreme AT-bias at silent sites among multiple non-sister lineages. We speculate that these paradoxical patterns are enabled by the dynamic genome structure of foraminifera, whose life cycles can include genome endoreplication and chromatin extrusion.
-
Most aspects of the molecular biology of cells involve tightly coordinated intermolecular interactions requiring specific recognition at the nucleotide and/or amino acid levels. This has led to long-standing interest in the degree to which constraints on interacting molecules result in conserved vs. accelerated rates of sequence evolution, with arguments commonly being made that molecular coevolution can proceed at rates exceeding the neutral expectation. Here, a fairly general model is introduced to evaluate the degree to which the rate of evolution at functionally interacting sites is influenced by effective population sizes (N_e), mutation rates, strength of selection, and the magnitude of recombination between sites. This theory is of particular relevance to matters associated with interactions between organelle- and nuclear-encoded proteins, as the two genomic environments often exhibit dramatic differences in the power of mutation and drift. Although genes within low-N_e environments can drive the rate of evolution of partner genes experiencing higher N_e, rates exceeding the neutral expectation require that the former also have an elevated mutation rate. Testable predictions, some counterintuitive, are presented on how patterns of coevolutionary rates should depend on the relative intensities of drift, selection, and mutation.
-
Many SNPs are predicted to encode deleterious amino acid variants. These slightly deleterious mutations can provide unique insights into population history, the dynamics of selection, and the genetic bases of phenotypes. This is especially true for domesticated species, where a history of bottlenecks and selection may affect the frequency of deleterious variants and signal a “cost of domestication”. Here, we investigated the numbers and frequencies of deleterious variants in Asian rice (Oryza sativa), focusing on two varieties (japonica and indica) and their wild relative (O. rufipogon). We investigated three signals of a potential cost of domestication in Asian rice relative to O. rufipogon: an increase in the frequency of deleterious SNPs (dSNPs), an enrichment of dSNPs compared with synonymous SNPs (sSNPs), and an increased number of deleterious variants. We found evidence for all three signals, and domesticated individuals contained 3–4% more deleterious alleles than wild individuals. Deleterious variants were enriched within low recombination regions of the genome and experienced frequency increases similar to sSNPs within regions of putative selective sweeps. A characteristic feature of rice domestication was a shift in mating system from outcrossing to predominantly selfing. Forward simulations suggest that this shift in mating system may have been the dominant factor in shaping both deleterious and neutral diversity in rice.
-
Abstract Sexual selection has a rich history of mathematical models that consider why preferences favor one trait phenotype over another (for population genetic models) or what specific trait value is preferred (for quantitative genetic models). Less common is exploration of the evolution of choosiness or preference strength: i.e., by how much a trait is preferred. We examine both population and quantitative genetic models of the evolution of preferences, specifically developing “baseline models” of the evolution of preference strength during the Fisher process. Using a population genetic approach, we find selection for stronger and stronger preferences when trait variation is maintained by mutation. However, this force is quite weak and likely to be swamped by drift in moderately-sized populations. In a quantitative genetic model, unimodal preferences will generally not evolve to be increasingly strong without bounds when male traits are under stabilizing viability selection, but evolve to extreme values when viability selection is directional. Our results highlight that different shapes of fitness and preference functions lead to qualitatively different trajectories for preference strength evolution ranging from no evolution to extreme evolution of preference strength.

