Abstract Gene duplication is a source of evolutionary novelty. DNA methylation may play a role in the evolution of duplicate genes (paralogs) through its association with gene expression. While this relationship has been examined to varying extents in a few individual species, the generalizability of these results at either a broad phylogenetic scale with species of differing duplication histories or across a population remains unknown. We applied a comparative epigenomic approach to 43 angiosperm species across the phylogeny and a population of 928 Arabidopsis (Arabidopsis thaliana) accessions, examining the association of DNA methylation with paralog evolution. Genic DNA methylation was differentially associated with duplication type, the age of duplication, sequence evolution, and gene expression. Whole-genome duplicates were typically enriched for CG-only gene body methylated or unmethylated genes, while single-gene duplications were typically enriched for non-CG methylated or unmethylated genes. Non-CG methylation, in particular, was a characteristic of more recent single-gene duplicates. Core angiosperm gene families were differentiated into those which preferentially retain paralogs and “duplication-resistant” families, which convergently reverted to singletons following duplication. Duplication-resistant families that still have paralogous copies were, uncharacteristically for core angiosperm genes, enriched for non-CG methylation. Non-CG methylated paralogs had higher rates of sequence evolution, higher frequency of presence–absence variation, and more limited expression. This suggests that silencing by non-CG methylation may be important to maintaining dosage following duplication and may be a precursor to fractionation. Our results indicate that genic methylation marks differing evolutionary trajectories and fates between paralogous genes and has a role in maintaining dosage following duplication.
Problems with Paralogs: The Promise and Challenges of Gene Duplicates in Evo-Devo Research
Synopsis Gene duplicates, or paralogs, serve as a major source of new genetic material and comprise seeds for evolutionary innovation. While originally thought to be quickly lost or nonfunctionalized following duplication, now a vast number of paralogs are known to be retained in a functional state. Daughter paralogs can provide robustness through redundancy, specialize via sub-functionalization, or neo-functionalize to play new roles. Indeed, the duplication and divergence of developmental genes have played a monumental role in the evolution of animal forms (e.g., Hox genes). Still, despite their prevalence and evolutionary importance, the precise detection of gene duplicates in newly sequenced genomes remains technically challenging and often overlooked. This presents an especially pertinent problem for evolutionary developmental biology, where hypothesis testing requires accurate detection of changes in gene expression and function, often in nontraditional model species. Frequently, these analyses rely on molecular reagents designed within coding sequences that may be highly similar in recently duplicated paralogs, leading to cross-reactivity and spurious results. Thus, care is needed to avoid erroneously assigning diverged functions of paralogs to a single gene, and potentially misinterpreting evolutionary history. This perspective aims to overview the prevalence and importance of paralogs and to shed light on the difficulty of their detection and analysis while offering potential solutions.
- PAR ID: 10542418
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Integrative and Comparative Biology
- Volume: 64
- Issue: 2
- ISSN: 1540-7063
- Format(s): Medium: X Size: p. 556-564
- Size(s): p. 556-564
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract The duplication of genes has long been recognized as a substrate for evolutionary novelty and adaptation, but the factors that govern fixation of paralogs soon after duplication are only partially understood. Duplication often leads to an increase in gene dosage, or the amount of functional gene product. For genes for which an increased dosage is harmful (i.e., triplosensitive genes), a dosage balancing mechanism needs to be present immediately after duplication if the duplicate is to evade negative selection. Previous research in vertebrates has demonstrated a potential role for epigenetic factors in allowing triplosensitive genes to increase in copy number by regulating their expression post-duplication. Here we expand this research by investigating the epigenetic landscape of duplicate genes in D. discoideum, a basal lineage separated from humans by over a billion years. We found that activating histone modifications are quickly lost in duplicate genes before gradually increasing in enrichment as paralogs age. The repressive modification H3K9me3 was enriched in the youngest paralogs, and this enrichment was likely mediated by heterochromatin spread from transposable elements. We similarly found enrichment of H3K9me3 in young human duplicates, and again found transposable elements as a potential mediator. Finally, we leveraged recent genome-wide estimates of triplosensitivity in human genes to directly examine the relationship between this kind of dosage sensitivity and enrichment for repressive histone modifications. Interestingly, while we found no significant link between enrichment for the repressive mark H3K9me3 and triplosensitivity in human paralogs, we did find a significant association between triplosensitivity and transposon proximity. Our findings suggest that transposons may contribute to the epigenetic regulatory environment associated with dosage balancing of young duplicates in both protists and humans.
-
Van De Peer, Yves (Ed.) Abstract Analyses in a number of organisms have shown that duplicated genes are less likely to be essential than singletons. This implies that genes can often compensate for the loss of their paralogs. However, it is unclear why the loss of some duplicates can be compensated by their paralogs, whereas the loss of other duplicates cannot. Surprisingly, initial analyses in mice did not detect differences in the essentiality of duplicates and singletons. Only subsequent analyses, using larger gene knockout data sets and controlling for a number of confounding factors, did detect significant differences. Previous studies have not taken into account the tissues in which duplicates are expressed. We hypothesized that in complex organisms, in order for a gene’s loss to be compensated by one or more of its paralogs, such paralogs need to be expressed in at least the same set of tissues as the lost gene. To test our hypothesis, we classified mouse duplicates into two categories based on the expression patterns of their paralogs: “compensable duplicates” (those with paralogs expressed in all the tissues in which the gene is expressed) and “noncompensable duplicates” (those whose paralogs are not expressed in all the tissues where the gene is expressed). In agreement with our hypothesis, the essentiality of noncompensable duplicates is similar to that of singletons, whereas compensable duplicates exhibit a substantially lower essentiality. Our results imply that duplicates can often compensate for the loss of their paralogs, but only if they are expressed in the same tissues. Indeed, the compensation ability is more dependent on expression patterns than on protein sequence similarity. The existence of these two kinds of duplicates with different essentialities, which has been overlooked by prior studies, may have hindered the detection of differences between singletons and duplicates.
-
A fundamental focus of evolutionary-developmental biology is uncovering the genetic mechanisms responsible for the gain and loss of characters. One approach to this question is to investigate changes in the coordinated expression of a group of genes important for the development of a character of interest (a gene regulatory network). Here we consider the possibility that modifications to the wing gene regulatory network (wGRN), as defined by work primarily done in Drosophila melanogaster, were involved in the evolution of wing dimorphisms of the pea aphid (Acyrthosiphon pisum). We hypothesize that this may have occurred via changes in expression levels or duplication followed by sub-functionalization of wGRN components. To test this, we annotated members of the wGRN in the pea aphid genome and assessed their expression levels in first and third nymphal instars of winged and wingless morphs of males and asexual females. We find that only two of the 32 assessed genes exhibit morph-biased expression. We also find that three wing genes (apterous (ap), warts (wts), and decapentaplegic (dpp)) have undergone gene duplication. In each case, the resulting paralogs show signs of functional divergence, exhibiting either sex-, morph-, or stage-specific expression. Two gene duplicates, wts2 and dpp3, are of particular interest with respect to wing dimorphism, as they exhibit a wingless male-specific isoform and wingless male-biased expression, respectively. These results supplement our understanding of trends in developmental gene network evolution, such as side-stepping pleiotropic constraint via duplication and sub-functionalization, underlying the emergence of novel phenotypes.
-
Abstract Gene duplication is a fundamental part of evolutionary innovation. While single-gene duplications frequently exhibit asymmetric evolutionary rates between paralogs, the extent to which this applies to multi-gene duplications remains unclear. In this study, we investigate the role of genetic context in shaping evolutionary divergence within multi-gene duplications, leveraging microsynteny to differentiate source and target copies. Using a dataset of 193 mammalian genome assemblies and a bird outgroup, we systematically analyze patterns of sequence divergence between duplicated genes and reference orthologs. We find that target copies, those relocated to new genomic environments, exhibit elevated evolutionary rates compared to source copies in the ancestral location. This asymmetry is influenced by the distance between copies and the size of the target copy. We also demonstrate that the polarization of rate asymmetry in paralogs, the “choice” of the slowly evolving copy, is biased towards collective, block-wise polarization in multi-gene duplications. Our findings highlight the importance of genetic context in modulating post-duplication divergence, where differences in cis-regulatory elements and co-expressed gene clusters between source and target copies may be responsible. This study presents a large-scale test of asymmetric evolution in multi-gene duplications, offering new insight into how genome architecture shapes functional diversification of paralogs.
Significance statement After a gene is duplicated, reduced selective constraints can lead the two copies to rapidly diverge, with one copy often evolving faster and occasionally gaining a new function. We quantify the influence of genetic context in choosing which copy of a duplicated gene has an elevated substitution rate. In a representative dataset of 193 mammalian genomes, we found strong evidence that gene copies pasted into new genomic locations tend to evolve faster than the corresponding copies in ancestral locations, suggesting an important role for the regulatory environment. The asymmetry in evolutionary rates of duplicated genes persists even for very large multigenic duplications, up to the scale of megabases, indicating that regulatory interactions frequently reach farther than previously thought.
