Title: Problems with Paralogs: The Promise and Challenges of Gene Duplicates in Evo-Devo Research
Synopsis

Gene duplicates, or paralogs, serve as a major source of new genetic material and provide seeds for evolutionary innovation. Although paralogs were originally thought to be quickly lost or nonfunctionalized following duplication, a vast number are now known to be retained in a functional state. Daughter paralogs can provide robustness through redundancy, specialize via sub-functionalization, or neo-functionalize to play new roles. Indeed, the duplication and divergence of developmental genes have played a monumental role in the evolution of animal forms (e.g., Hox genes). Still, despite their prevalence and evolutionary importance, the precise detection of gene duplicates in newly sequenced genomes remains technically challenging and is often overlooked. This presents an especially pertinent problem for evolutionary developmental biology, where hypothesis testing requires accurate detection of changes in gene expression and function, often in nontraditional model species. Frequently, these analyses rely on molecular reagents designed within coding sequences that may be highly similar in recently duplicated paralogs, leading to cross-reactivity and spurious results. Thus, care is needed to avoid erroneously assigning diverged functions of paralogs to a single gene, and potentially misinterpreting evolutionary history. This perspective aims to provide an overview of the prevalence and importance of paralogs and to shed light on the difficulty of their detection and analysis while offering potential solutions.

 
Award ID(s):
2305817
PAR ID:
10542418
Publisher / Repository:
Oxford University Press
Journal Name:
Integrative And Comparative Biology
Volume:
64
Issue:
2
ISSN:
1540-7063
Size(s):
p. 556-564
Sponsoring Org:
National Science Foundation
More Like this
  1. Van De Peer, Yves (Ed.)
    Abstract

    Analyses in a number of organisms have shown that duplicated genes are less likely to be essential than singletons. This implies that genes can often compensate for the loss of their paralogs. However, it is unclear why the loss of some duplicates can be compensated by their paralogs, whereas the loss of other duplicates cannot. Surprisingly, initial analyses in mice did not detect differences in the essentiality of duplicates and singletons. Only subsequent analyses, using larger gene knockout data sets and controlling for a number of confounding factors, did detect significant differences. Previous studies have not taken into account the tissues in which duplicates are expressed. We hypothesized that in complex organisms, in order for a gene’s loss to be compensated by one or more of its paralogs, such paralogs need to be expressed in at least the same set of tissues as the lost gene. To test our hypothesis, we classified mouse duplicates into two categories based on the expression patterns of their paralogs: “compensable duplicates” (those with paralogs expressed in all the tissues in which the gene is expressed) and “noncompensable duplicates” (those whose paralogs are not expressed in all the tissues where the gene is expressed). In agreement with our hypothesis, the essentiality of noncompensable duplicates is similar to that of singletons, whereas compensable duplicates exhibit a substantially lower essentiality. Our results imply that duplicates can often compensate for the loss of their paralogs, but only if they are expressed in the same tissues. Indeed, the compensation ability is more dependent on expression patterns than on protein sequence similarity. The existence of these two kinds of duplicates with different essentialities, which has been overlooked by prior studies, may have hindered the detection of differences between singletons and duplicates.

     
  2. Wittkopp, Patricia (Ed.)
    Abstract

    Whole-genome duplications (WGDs) have occurred in many eukaryotic lineages. However, the underlying evolutionary forces and molecular mechanisms responsible for the long-term retention of gene duplicates created by WGDs are not well understood. We employ a population-genomic approach to understand the selective forces acting on paralogs and investigate ongoing duplicate-gene loss in multiple species of Paramecium that share an ancient WGD. We show that mutations that abolish protein function are more likely to be segregating in retained WGD paralogs than in single-copy genes, most likely because of ongoing nonfunctionalization post-WGD. This relaxation of purifying selection occurs in only one WGD paralog, accompanied by the gradual fixation of nonsynonymous mutations and reduction in levels of expression, and occurs over a long period of evolutionary time, “marking” one locus for future loss. Concordantly, the fitness effects of new nonsynonymous mutations and frameshift-causing indels are significantly more deleterious in the highly expressed copy than in its paralog with lower expression. Our results provide a novel mechanistic model of gene duplicate loss following WGDs, wherein selection acts on the sum of functional activity of both duplicate genes, allowing the two to wander in expression and functional space, until one duplicate locus eventually degenerates enough in functional efficiency or expression that its contribution to total activity is too insignificant to be retained by purifying selection. Retention of duplicates by such mechanisms predicts long times to duplicate-gene loss, which should not be falsely attributed to retention due to gain/change in function.
  3. Summary

    Gene duplication is a powerful source of biological innovation giving rise to paralogous genes that undergo diverse fates. Redundancy between paralogous genes is an intriguing outcome of duplicate gene evolution, and its maintenance over evolutionary time has long been considered a paradox. Redundancy can also be dubbed ‘a geneticist's nightmare’: It hinders the predictability of genome editing outcomes and limits our ability to link genotypes to phenotypes. Genetic studies in yeast and plants have suggested that the ability of ancient redundant duplicates to compensate for dosage perturbations resulting from a loss of function depends on the reprogramming of gene expression, a phenomenon known as active compensation. Starting from considerations on the stoichiometric constraints that drive the evolutionary stability of redundancy, this review aims to provide insights into the mechanisms of active compensation between duplicates that could be targeted for breaking paralog dependencies – the next frontier in plant functional studies.

     
  4. Abstract

    A whole‐genome duplication (WGD) doubles the entire genomic content of a species and is thought to have catalysed adaptive radiation in some polyploid‐origin lineages. However, little is known about general consequences of a WGD because gene duplicates (i.e., paralogs) are commonly filtered in genomic studies; such filtering may remove substantial portions of the genome in data sets from polyploid‐origin species. We demonstrate a new method that enables genome‐wide scans for signatures of selection at both nonduplicated and duplicated loci by taking locus‐specific copy number into account. We apply this method to RAD sequence data from different ecotypes of a polyploid‐origin salmonid (Oncorhynchus nerka) and reveal signatures of divergent selection that would have been missed if duplicated loci were filtered. We also find conserved signatures of elevated divergence at pairs of homeologous chromosomes with residual tetrasomic inheritance, suggesting that joint evolution of some nondiverged gene duplicates may affect the adaptive potential of these genes. These findings illustrate that including duplicated loci in genomic analyses enables novel insights into the evolutionary consequences of WGDs and local segmental gene duplications.

     
  5. Synopsis

    The proliferation of genomic resources for Chelicerata in the past 10 years has revealed that the evolution of chelicerate genomes is more dynamic than previously thought, with multiple waves of ancient whole genome duplications affecting separate lineages. Such duplication events are fascinating from the perspective of evolutionary history because the burst of new gene copies associated with genome duplications facilitates the acquisition of new gene functions (neofunctionalization), which may in turn lead to morphological novelties and spur net diversification. While neofunctionalization has been invoked in several contexts with respect to the success and diversity of spiders, the overall impact of whole genome duplications on chelicerate evolution and development remains imperfectly understood. The purpose of this review is to examine critically the role of whole genome duplication on the diversification of the extant arachnid orders, as well as assess functional datasets for evidence of subfunctionalization or neofunctionalization in chelicerates. This examination focuses on functional data from two focal model taxa: the spider Parasteatoda tepidariorum, which exhibits evidence for an ancient duplication, and the harvestman Phalangium opilio, which exhibits an unduplicated genome. I show that there is no evidence that taxa with genome duplications are more successful than taxa with unduplicated genomes. I contend that evidence for sub- or neofunctionalization of duplicated developmental patterning genes in spiders is indirect or fragmentary at present, despite the appeal of this postulate for explaining the success of groups like spiders. Available expression data suggest that the condition of duplicated Hox modules may have played a role in promoting body plan disparity in the posterior tagma of some orders, such as spiders and scorpions, but functional data substantiating this postulate are critically missing. Spatiotemporal dynamics of duplicated transcription factors in spiders may represent cases of developmental system drift, rather than neofunctionalization. Developmental system drift may represent an important, but overlooked, null hypothesis for studies of paralogs in chelicerate developmental biology. To distinguish between subfunctionalization, neofunctionalization, and developmental system drift, concomitant establishment of comparative functional datasets from taxa exhibiting the genome duplication, as well as those that lack the paralogy, is sorely needed.

     