Title: Interlocus Gene Conversion, Natural Selection, and Paralog Homogenization
Abstract Following a duplication, the resulting paralogs tend to diverge. While mutation and natural selection can accelerate this process, they can also slow it. Here, we quantify the paralog homogenization that is caused by point mutations and interlocus gene conversion (IGC). Among 164 duplicated teleost genes, the median percentage of postduplication codon substitutions that arise from IGC rather than point mutation is estimated to be between 7% and 8%. By differentiating between the nonsynonymous codon substitutions that homogenize the protein sequences of paralogs and the nonhomogenizing nonsynonymous substitutions, we estimate the homogenizing nonsynonymous rates to be higher for 163 of the 164 teleost data sets as well as for all 14 data sets of duplicated yeast ribosomal protein-coding genes that we consider. For all 14 yeast data sets, the estimated homogenizing nonsynonymous rates exceed the synonymous rates.
Award ID(s):
2241312 1754142
PAR ID:
10463098
Publisher / Repository:
Oxford University Press
Journal Name:
Molecular Biology and Evolution
Volume:
40
Issue:
9
ISSN:
0737-4038
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Van De Peer, Yves (Ed.)
    Abstract Analyses in a number of organisms have shown that duplicated genes are less likely to be essential than singletons. This implies that genes can often compensate for the loss of their paralogs. However, it is unclear why the loss of some duplicates can be compensated by their paralogs, whereas the loss of other duplicates cannot. Surprisingly, initial analyses in mice did not detect differences in the essentiality of duplicates and singletons. Only subsequent analyses, using larger gene knockout data sets and controlling for a number of confounding factors, did detect significant differences. Previous studies have not taken into account the tissues in which duplicates are expressed. We hypothesized that in complex organisms, in order for a gene’s loss to be compensated by one or more of its paralogs, such paralogs need to be expressed in at least the same set of tissues as the lost gene. To test our hypothesis, we classified mouse duplicates into two categories based on the expression patterns of their paralogs: “compensable duplicates” (those with paralogs expressed in all the tissues in which the gene is expressed) and “noncompensable duplicates” (those whose paralogs are not expressed in all the tissues where the gene is expressed). In agreement with our hypothesis, the essentiality of noncompensable duplicates is similar to that of singletons, whereas compensable duplicates exhibit a substantially lower essentiality. Our results imply that duplicates can often compensate for the loss of their paralogs, but only if they are expressed in the same tissues. Indeed, the compensation ability is more dependent on expression patterns than on protein sequence similarity. The existence of these two kinds of duplicates with different essentialities, which has been overlooked by prior studies, may have hindered the detection of differences between singletons and duplicates. 
  2. Johnson, Patricia J (Ed.)
    ABSTRACT Analyses of codon usage in eukaryotes suggest that amino acid usage responds to GC pressure so AT-biased substitutions drive higher usage of amino acids with AT-ending codons. Here, we combine single-cell transcriptomics and phylogenomics to explore codon usage patterns in foraminifera, a diverse and ancient clade of predominantly uncultivable microeukaryotes. We curate data from 1,044 gene families in 49 individuals representing 28 genera, generating perhaps the largest existing dataset of data from a predominantly uncultivable clade of protists, to analyze compositional bias and codon usage. We find extreme variation in composition, with a median GC content at fourfold degenerate silent sites below 3% in some species and above 75% in others. The most AT-biased species are distributed among diverse non-monophyletic lineages. Surprisingly, despite the extreme variation in compositional bias, amino acid usage is highly conserved across all foraminifera. By analyzing nucleotide, codon, and amino acid composition within this diverse clade of amoeboid eukaryotes, we expand our knowledge of patterns of genome evolution across the eukaryotic tree of life. IMPORTANCE Patterns of molecular evolution in protein-coding genes reflect trade-offs between substitution biases and selection on both codon and amino acid usage. Most analyses of these factors in microbial eukaryotes focus on model species such as Acanthamoeba, Plasmodium, and yeast, where substitution bias is a primary contributor to patterns of amino acid usage. Foraminifera, an ancient clade of single-celled eukaryotes, present a conundrum, as we find highly conserved amino acid usage underlain by divergent nucleotide composition, including extreme AT-bias at silent sites among multiple non-sister lineages. We speculate that these paradoxical patterns are enabled by the dynamic genome structure of foraminifera, whose life cycles can include genome endoreplication and chromatin extrusion.
  3. Villanueva, Laura (Ed.)
    ABSTRACT Experimental evolution provides a powerful tool for examining how Bdellovibrio evolves in response to unique selective pressures associated with its predatory lifestyle. We tested how Bdellovibrio sp. NC01 adapts to long-term coculture with Pseudomonas sp. NC02, which is less susceptible to predation compared to other Gram-negative bacteria. Analyzing six replicate Bdellovibrio populations across six time points spanning 40 passages and 2,880 h of coculture, we detected 30 to 40 new mutations in each population that exceeded a frequency of 5%. Nonsynonymous substitutions were the most abundant type of new mutation, followed by small indels and synonymous substitutions. After completing the final passage, we detected 20 high-frequency (>75%) mutations across all six evolved Bdellovibrio populations. Eighteen of these alter protein sequences, and most increased in frequency rapidly. Four genes acquired a high-frequency mutation in two or more evolved Bdellovibrio populations, reflecting parallel evolution and positive selection. The genes encode a sodium/phosphate cotransporter family protein (Bd2221), a metallophosphoesterase (Bd0054), a TonB family protein (Bd0396), and a hypothetical protein (Bd1601). Tested prey range and predation efficiency phenotypes did not differ significantly between evolved Bdellovibrio populations and the ancestor; however, all six evolved Bdellovibrio populations demonstrated enhanced starvation survival compared to the ancestor. These results suggest that, instead of evolving improved killing of Pseudomonas sp. NC02, Bdellovibrio evolved to better withstand nutrient limitation in the presence of this prey strain. The mutations identified here point to genes and functions that may be important for Bdellovibrio adaptation to the different selective pressures of long-term coculture with Pseudomonas. IMPORTANCE Bdellovibrio attack and kill Gram-negative bacteria, including drug-resistant pathogens of animals and plants. This lifestyle is unusual among bacteria, and it imposes unique selective pressures on Bdellovibrio. Determining how Bdellovibrio evolve in response to these pressures is valuable for understanding the mechanisms that govern predation. We applied experimental evolution to test how Bdellovibrio sp. NC01 evolved in response to long-term coculture with a single Pseudomonas strain, which NC01 can kill, but with low efficiency. Our experimental design imposed different selective pressures on the predatory bacteria and tracked the evolutionary trajectories of replicate Bdellovibrio populations. Using genome sequencing, we identified Bdellovibrio genes that acquired high-frequency mutations in two or more populations. Using phenotype assays, we determined that evolved Bdellovibrio populations did not improve their ability to kill Pseudomonas, but rather are better able to survive starvation. Overall, our results point to functions that may be important for Bdellovibrio adaptation.
  4. Abstract Gene duplication is a fundamental part of evolutionary innovation. While single-gene duplications frequently exhibit asymmetric evolutionary rates between paralogs, the extent to which this applies to multi-gene duplications remains unclear. In this study, we investigate the role of genetic context in shaping evolutionary divergence within multi-gene duplications, leveraging microsynteny to differentiate source and target copies. Using a dataset of 193 mammalian genome assemblies and a bird outgroup, we systematically analyze patterns of sequence divergence between duplicated genes and reference orthologs. We find that target copies, those relocated to new genomic environments, exhibit elevated evolutionary rates compared to source copies in the ancestral location. This asymmetry is influenced by the distance between copies and the size of the target copy. We also demonstrate that the polarization of rate asymmetry in paralogs, the “choice” of the slowly evolving copy, is biased towards collective, block-wise polarization in multi-gene duplications. Our findings highlight the importance of genetic context in modulating post-duplication divergence, where differences in cis-regulatory elements and co-expressed gene clusters between source and target copies may be responsible. This study presents a large-scale test of asymmetric evolution in multi-gene duplications, offering new insight into how genome architecture shapes functional diversification of paralogs. Significance statement: After a gene is duplicated, reduced selective constraints can lead the two copies to rapidly diverge, with one copy often evolving faster and occasionally gaining a new function. We quantify the influence of genetic context in choosing which copy of a duplicated gene has an elevated substitution rate. In a representative dataset of 193 mammalian genomes, we found strong evidence that gene copies pasted into new genomic locations tend to evolve faster than the corresponding copies in ancestral locations, suggesting an important role for the regulatory environment. The asymmetry in evolutionary rates of duplicated genes persists even for very large multigenic duplications, up to the scale of megabases, indicating that regulatory interactions frequently reach farther than previously thought.
  5. Crandall, Keith (Ed.)
    Abstract There are known limitations in methods of detecting positive selection. Common methods do not enable differentiation between positive selection and compensatory covariation, a major limitation. Further, the traditional method of calculating the ratio of nonsynonymous to synonymous substitutions (dN/dS) does not take into account the 3D structure of biomacromolecules nor differences between amino acids. It also does not account for saturation of synonymous mutations (dS) over long evolutionary time that renders codon-based methods ineffective for older divergences. This work aims to address these shortcomings for detecting positive selection through the development of a statistical model that examines clusters of substitutions in clusters of variable radii. Additionally, it uses a parametric bootstrapping approach to differentiate positive selection from compensatory processes. A previously reported case of positive selection in the leptin protein of primates was reexamined using this methodology. 