

Title: DNA methylation signatures of duplicate gene evolution in angiosperms
Abstract

Gene duplication is a source of evolutionary novelty. DNA methylation may play a role in the evolution of duplicate genes (paralogs) through its association with gene expression. While this relationship has been examined to varying extents in a few individual species, the generalizability of these results at either a broad phylogenetic scale with species of differing duplication histories or across a population remains unknown. We applied a comparative epigenomic approach to 43 angiosperm species across the phylogeny and a population of 928 Arabidopsis (Arabidopsis thaliana) accessions, examining the association of DNA methylation with paralog evolution. Genic DNA methylation was differentially associated with duplication type, the age of duplication, sequence evolution, and gene expression. Whole-genome duplicates were typically enriched for CG-only gene body methylated or unmethylated genes, while single-gene duplications were typically enriched for non-CG methylated or unmethylated genes. Non-CG methylation, in particular, was a characteristic of more recent single-gene duplicates. Core angiosperm gene families were differentiated into those which preferentially retain paralogs and “duplication-resistant” families, which convergently reverted to singletons following duplication. Duplication-resistant families that still have paralogous copies were, uncharacteristically for core angiosperm genes, enriched for non-CG methylation. Non-CG methylated paralogs had higher rates of sequence evolution, higher frequency of presence–absence variation, and more limited expression. This suggests that silencing by non-CG methylation may be important to maintaining dosage following duplication and may be a precursor to fractionation. Our results indicate that genic methylation marks differing evolutionary trajectories and fates between paralogous genes and has a role in maintaining dosage following duplication.

 
Award ID(s):
2029959
NSF-PAR ID:
10411858
Publisher / Repository:
Oxford University Press
Journal Name:
Plant Physiology
ISSN:
0032-0889
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    Processes affecting rates of sequence polymorphism are fundamental to the evolution of gene duplicates. The relationship between gene activity and sequence polymorphism can influence the likelihood that functionally redundant gene copies are co‐maintained in stable evolutionary equilibria vs other outcomes such as neofunctionalization.

    Here, we investigate genic variation in epigenome‐associated polymorphism rates in Arabidopsis thaliana and consider whether these affect the evolution of gene duplicates. We compared the frequency of sequence polymorphism and patterns of genetic differentiation between genes classified by exon methylation patterns: unmethylated (unM), gene‐body methylated (gbM), and transposon‐like methylated (teM) states, which reflect divergence in gene expression.

    We found that the frequency of polymorphism was higher in teM (transcriptionally repressed, tissue‐specific) genes and lower in gbM (active, constitutively expressed) genes. Comparisons of gene duplicates were largely consistent with genome‐wide patterns – gene copies that exhibit teM accumulate more variation, evolve faster, and are in chromatin states associated with reduced DNA repair.

    This relationship between expression, the epigenome, and polymorphism may lead to the breakdown of equilibrium states that would otherwise maintain genetic redundancies. Epigenome‐mediated polymorphism rate variation may facilitate the evolution of novel gene functions in duplicate paralogs maintained over evolutionary time.

     
  2. Abstract

    A signaling complex comprising members of the LORELEI (LRE)-LIKE GPI-anchored protein (LLG) and Catharanthus roseus RECEPTOR-LIKE KINASE 1-LIKE (CrRLK1L) families perceives RAPID ALKALINIZATION FACTOR (RALF) peptides and regulates growth, reproduction, immunity, and stress responses in Arabidopsis (Arabidopsis thaliana). Genes encoding these proteins are members of multigene families in most angiosperms and could generate thousands of signaling complex variants. However, the links between expansion of these gene families and the functional diversification of this critical signaling complex, as well as the evolutionary factors underlying the maintenance of gene duplicates, remain unknown. Here, we investigated LLG gene family evolution by sampling land plant genomes and explored the function and expression of angiosperm LLGs. We found that LLG diversity within major land plant lineages is primarily due to lineage-specific duplication events, and that these duplications occurred both early in the history of these lineages and more recently. Our complementation and expression analyses showed that expression divergence (i.e. regulatory subfunctionalization), rather than functional divergence, explains the retention of LLG paralogs. Interestingly, all but one monocot and all eudicot species examined had an LLG copy with preferential expression in male reproductive tissues, while the other duplicate copies showed highest levels of expression in female or vegetative tissues. The single LLG copy in Amborella trichopoda is expressed at far higher levels in male reproductive tissues than in female reproductive or vegetative tissues. We propose that expression divergence plays an important role in retention of LLG duplicates in angiosperms.

     
  3. In plants and mammals, DNA methylation plays a critical role in transcriptional silencing by delineating heterochromatin from transcriptionally active euchromatin. A homeostatic balance between heterochromatin and euchromatin is essential to genomic stability. This is evident in many diseases and mutants for heterochromatin maintenance, which are characterized by global losses of DNA methylation coupled with localized ectopic gains of DNA methylation that alter transcription. Furthermore, we have shown that genome-wide methylation patterns in Arabidopsis thaliana are highly stable over generations, with the exception of rare epialleles. However, the extent to which natural variation in the robustness of targeting DNA methylation to heterochromatin exists, and the phenotypic consequences of such variation, remain to be fully explored. Here we describe the finding that heterochromatin and genic DNA methylation are highly variable among 725 A. thaliana accessions. We found that genic DNA methylation is inversely correlated with that in heterochromatin, suggesting that certain methylation pathway(s) may be redirected to genes upon the loss of heterochromatin. This redistribution likely involves a feedback loop involving the DNA methyltransferase, CHROMOMETHYLASE 3 (CMT3), H3K9me2, and histone turnover, as highly expressed, long genes with a high density of CMT3-preferred CWG sites are more likely to be methylated. Importantly, although the presence of CG methylation in genes alone may not affect transcription, genes containing CG methylation are more likely to become methylated at non-CG sites and silenced. These findings are consistent with the hypothesis that natural variation in DNA methylation homeostasis may underlie the evolution of epialleles that alter phenotypes.

     
  4. Wright, S (Ed.)
    Abstract In plants, mammals and insects, some genes are methylated in the CG dinucleotide context, a phenomenon called gene body methylation (gbM). It has been controversial whether this phenomenon has any functional role. Here, we took advantage of the availability of 876 leaf methylomes in Arabidopsis thaliana to characterize the population frequency of methylation at the gene level and to estimate the site-frequency spectrum of allelic states. Using a population genetics model specifically designed for epigenetic data, we found that genes with ancestral gbM are under significant selection to remain methylated. Conversely, ancestrally unmethylated genes were under selection to remain unmethylated. Repeating the analyses at the level of individual cytosines confirmed these results. Estimated selection coefficients were small, on the order of 4Nₑs = 1.4, which is similar to the magnitude of selection acting on codon usage. We also estimated that A. thaliana is losing gbM threefold more rapidly than gaining it, which could be due to a recent reduction in the efficacy of selection after a switch to selfing. Finally, we investigated the potential function of gbM through its link with gene expression. Across genes with polymorphic methylation states, the expression of gene body methylated alleles was consistently and significantly higher than that of unmethylated alleles. Although it is difficult to disentangle genetic from epigenetic effects, our work suggests that gbM has a small but measurable effect on fitness, perhaps due to its association with a phenotype such as gene expression.
  5. The Arabidopsis DEMETER (DME) DNA glycosylase demethylates the maternal genome in the central cell prior to fertilization and is essential for seed viability. DME preferentially targets small transposons that flank coding genes, influencing their expression and initiating plant gene imprinting. DME also targets intergenic and heterochromatic regions, but how it is recruited to these differing chromatin landscapes is unknown. The C-terminal half of DME consists of 3 conserved regions required for catalysis in vitro. We show that this catalytic core guides active demethylation at endogenous targets, rescuing dme developmental and genomic hypermethylation phenotypes. However, without the N terminus, heterochromatin demethylation is significantly impeded, and abundant CG-methylated genic sequences are ectopically demethylated. Comparative analysis revealed that the conserved DME N-terminal domains are present only in flowering plants, whereas the domain architecture of DME-like proteins in nonvascular plants mainly resembles the catalytic core, suggesting that it might represent the ancestral form of the 5mC DNA glycosylase found in plant lineages. We propose a bipartite model for DME protein action and suggest that the DME N terminus was acquired late during land plant evolution to improve specificity and facilitate demethylation at heterochromatin targets. 