Title: Detecting Signatures of Positive Selection against a Backdrop of Compensatory Processes
Abstract There are known limitations in methods of detecting positive selection. Common methods do not enable differentiation between positive selection and compensatory covariation, a major limitation. Further, the traditional method of calculating the ratio of nonsynonymous to synonymous substitutions (dN/dS) does not take into account the 3D structure of biomacromolecules nor differences between amino acids. It also does not account for saturation of synonymous mutations (dS) over long evolutionary time that renders codon-based methods ineffective for older divergences. This work aims to address these shortcomings for detecting positive selection through the development of a statistical model that examines clusters of substitutions in clusters of variable radii. Additionally, it uses a parametric bootstrapping approach to differentiate positive selection from compensatory processes. A previously reported case of positive selection in the leptin protein of primates was reexamined using this methodology.
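As a rough illustration of the idea of testing for substitutions that cluster within a given radius in 3D structure, the following minimal Python sketch counts substituted sites inside a sphere around each residue and compares the largest count with a simulated null distribution. It is not the model described in the abstract: the null here places substitutions uniformly at random (a simple Monte Carlo stand-in, not the parametric bootstrap the abstract refers to), and the function names, the toy coordinates, and the choice of test statistic are all assumptions made for illustration.

```python
import math
import random


def max_cluster_count(coords, substituted, radius):
    """Largest number of substituted sites within `radius` of any residue.

    coords: list of (x, y, z) residue positions (hypothetical input).
    substituted: indices into coords for sites that carry a substitution.
    """
    best = 0
    for center in coords:
        count = sum(
            1 for i in substituted if math.dist(center, coords[i]) <= radius
        )
        best = max(best, count)
    return best


def clustering_p_value(coords, substituted, radius, n_sim=1000, seed=0):
    """Monte Carlo p-value for the observed clustering statistic.

    Assumption: under the null, the same number of substitutions fall on
    sites chosen uniformly at random. This is a simplification, not the
    parametric bootstrap mentioned in the abstract.
    """
    rng = random.Random(seed)
    observed = max_cluster_count(coords, substituted, radius)
    n_sites = len(coords)
    k = len(substituted)
    hits = 0
    for _ in range(n_sim):
        simulated = rng.sample(range(n_sites), k)
        if max_cluster_count(coords, simulated, radius) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_sim + 1)


# Toy example: 20 residues on a line, 4 units apart (not real protein data).
toy_coords = [(4.0 * i, 0.0, 0.0) for i in range(20)]
toy_observed, toy_p = clustering_p_value(toy_coords, [3, 4, 5], radius=5.0)
```

Repeating the call over several values of `radius` would mimic scanning clusters of variable radii, although a real analysis would also need to correct for testing multiple radii.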
Award ID(s):
1817413
PAR ID:
10277412
Author(s) / Creator(s):
; ;
Editor(s):
Crandall, Keith
Date Published:
Journal Name:
Molecular Biology and Evolution
Volume:
37
Issue:
11
ISSN:
0737-4038
Page Range / eLocation ID:
3353 to 3362
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Marine microorganisms inhabiting nutrient-depleted waters play critical roles in global biogeochemical cycles due to their abundance and broad distribution. Many of these microbes share similar genomic features including small genome size, low % G + C content, short intergenic regions, and low nitrogen content in encoded amino acid residue side chains (N-ARSC), but the evolutionary drivers of these characteristics are unclear. Here, we compared the strength of purifying selection across the Marinimicrobia, a candidate phylum which encompasses a broad range of phylogenetic groups with disparate genomic features, by estimating the ratio of nonsynonymous and synonymous substitutions (dN/dS) in conserved marker genes. Our analysis reveals that epipelagic Marinimicrobia that exhibit features consistent with genome streamlining have significantly lower dN/dS values when compared with their mesopelagic counterparts. We also found a significant positive correlation between median dN/dS values and % G + C content, N-ARSC, and intergenic region length. We did not identify a significant correlation between dN/dS ratios and estimated genome size, suggesting the strength of selection is not a primary factor shaping genome size in this group. Our findings are generally consistent with genome streamlining theory, which postulates that many genomic features of abundant epipelagic bacteria are the result of adaptation to oligotrophic nutrient conditions. Our results are also in agreement with previous findings that genome streamlining is common in epipelagic waters, suggesting that microbes inhabiting this region of the ocean have been shaped by strong selection together with prevalent nutritional constraints characteristic of this environment. 
  2. Abstract Following a duplication, the resulting paralogs tend to diverge. While mutation and natural selection can accelerate this process, they can also slow it. Here, we quantify the paralog homogenization that is caused by point mutations and interlocus gene conversion (IGC). Among 164 duplicated teleost genes, the median percentage of postduplication codon substitutions that arise from IGC rather than point mutation is estimated to be between 7% and 8%. By differentiating between the nonsynonymous codon substitutions that homogenize the protein sequences of paralogs and the nonhomogenizing nonsynonymous substitutions, we estimate the homogenizing nonsynonymous rates to be higher for 163 of the 164 teleost data sets as well as for all 14 data sets of duplicated yeast ribosomal protein-coding genes that we consider. For all 14 yeast data sets, the estimated homogenizing nonsynonymous rates exceed the synonymous rates. 
  3. Fay, Justin C. (Ed.)
    Patterns of non-uniform usage of synonymous codons vary across genes in an organism and between species across all domains of life. This codon usage bias (CUB) is due to a combination of non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most models quantify the effects of mutation bias and selection on CUB assuming uniform mutational and other non-adaptive forces across the genome. However, non-adaptive nucleotide biases can vary within a genome due to processes such as biased gene conversion (BGC), potentially obfuscating signals of selection on codon usage. Moreover, genome-wide estimates of non-adaptive nucleotide biases are lacking for non-model organisms. We combine an unsupervised learning method with a population genetics model of synonymous coding sequence evolution to assess the impact of intragenomic variation in non-adaptive nucleotide bias on quantification of natural selection on synonymous codon usage across 49 Saccharomycotina yeasts. We find that in the absence of a priori information, unsupervised learning can be used to identify genes evolving under different non-adaptive nucleotide biases. We find that the impact of intragenomic variation in non-adaptive nucleotide bias varies widely, even among closely-related species. We show that the overall strength and direction of translational selection can be underestimated by failing to account for intragenomic variation in non-adaptive nucleotide biases. Interestingly, genes falling into clusters identified by machine learning are also physically clustered across chromosomes. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable non-adaptive nucleotide biases on codon frequencies. 
  4. Abstract Animals that engage in long-distance seasonal migration experience strong selective pressures on their metabolic performance and life history, with potential consequences for molecular evolution. Species with slow life histories typically show lower rates of synonymous substitution (dS) than “fast” species. Previous research suggests long-distance seasonal migrants have a slower life history strategy than short-distance migrants, raising the possibility that rates of molecular evolution may covary with migration distance. Additionally, long-distance migrants may face strong selection on metabolically-important mitochondrial genes due to their long-distance flights. Using over 1,000 mitochondrial genomes, we assessed the relationship between migration distance and mitochondrial molecular evolution in 39 boreal-breeding migratory bird species. We show that migration distance correlates negatively with dS, suggesting that the slow life history associated with long-distance migration is reflected in rates of molecular evolution. Mitochondrial genes in every study species exhibited evidence of purifying selection, but the strength of selection was greater in short-distance migrants, contrary to our predictions. This result may indicate effects of selection for cold tolerance on mitochondrial evolution among species overwintering at high latitudes. Our study demonstrates that the pervasive correlation between life history and molecular evolutionary rates exists in the context of differential adaptations to seasonality. 
  5. ‘Candidatus Liberibacter’ is a group of bacterial species that are obligate intracellular plant pathogens and cause Huanglongbing disease of citrus trees and Zebra Chip in potatoes. Here, we examined the extent of intra- and interspecific genetic diversity across the genus using comparative genomics. Our approach examined a wide set of Liberibacter genome sequences including five pathogenic species and one species not known to cause disease. By performing comparative genomics analyses, we sought to understand the evolutionary history of this genus and to identify genes or genome regions that may affect pathogenicity. With a set of 52 genomes, we performed comparative genomics, measured genome rearrangement, and completed statistical tests of positive selection. We explored markers of genetic diversity across the genus, such as average nucleotide identity across the whole genome. These analyses revealed the highest intraspecific diversity amongst the ‘Ca. Liberibacter solanacearum’ species, which also has the largest plant host range. We identified sets of core and accessory genes across the genus and within each species and measured the ratio of nonsynonymous to synonymous mutations (dN/dS) across genes. We identified ten genes with evidence of a history of positive selection in the Liberibacter genus, including genes in the Tad complex, which have been previously implicated as being highly divergent in the ‘Ca. L. capsica’ species based on high values of dN.