Title: Detecting genetic epistasis by differential departure from independence
Contrary to prior beliefs that epistasis is rare, advances in genomics suggest otherwise. Current practice often filters out genomic loci with low variant counts before detecting epistasis. We argue that this practice is far from optimal because it can throw away strong epistatic patterns. Instead, we present the compensated Sharma–Song test to infer genetic epistasis in genome-wide association studies by differential departure from independence. The test does not require a minimum number of replicates for each variant. We also introduce algorithms to simulate epistatic patterns that differentially depart from independence. In evaluations with two simulators, the test performed comparably to the original Sharma–Song test when variant frequencies at a locus were marginally uniform; encouragingly, it had a marked advantage over alternatives when variant frequencies were marginally nonuniform. The test further revealed uniquely clean epistatic variants associated with chicken abdominal fat content that are not prioritized by other methods. The genes involved in the largest numbers of inferred epistatic interactions between single nucleotide polymorphisms (SNPs) belong to pathways known for obesity regulation; many top SNPs are located on chromosome 20 and in intergenic regions. By measuring differential departure from independence, the compensated Sharma–Song test offers a practical choice for studying epistasis that is robust to nonuniform genetic variant frequencies.
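To make "differential departure from independence" concrete, the following is a minimal Python sketch and not the compensated Sharma–Song test itself: it computes no test statistic or p-value. It only assumes that departure from independence for a contingency table is the joint distribution minus the product of its marginals, and that the quantity of interest is how that departure differs between two conditions. The function names and the 2×2 counts are hypothetical, chosen for illustration.

```python
def departure_from_independence(table):
    """Joint probabilities minus the product of the marginal probabilities."""
    total = sum(sum(row) for row in table)
    row_p = [sum(row) / total for row in table]
    col_p = [sum(row[j] for row in table) / total for j in range(len(table[0]))]
    return [
        [table[i][j] / total - row_p[i] * col_p[j] for j in range(len(col_p))]
        for i in range(len(row_p))
    ]


def max_departure_gap(table_a, table_b):
    """Largest cell-wise difference between the two departure matrices."""
    dep_a = departure_from_independence(table_a)
    dep_b = departure_from_independence(table_b)
    return max(
        abs(x - y)
        for row_a, row_b in zip(dep_a, dep_b)
        for x, y in zip(row_a, row_b)
    )


# Hypothetical counts: rows are variants at one locus, columns at another,
# tabulated separately under each condition.
independent = [[10, 20], [20, 40]]   # joint equals product of marginals
shifted = [[40, 20], [20, 10]]       # marginals changed, still independent
dependent = [[30, 0], [0, 30]]       # strong association between the loci

# A change in marginals alone yields no differential departure.
print(max_departure_gap(independent, shifted))
# A change in the association between loci does.
print(max_departure_gap(independent, dependent))
```

In this toy setting the first comparison gives a gap of essentially zero even though the variant frequencies differ, while the second gives a clearly nonzero gap. This illustrates why a measure of this kind does not have to confuse nonuniform marginal frequencies with epistasis.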
Award ID(s): 1661331
NSF-PAR ID: 10334201
Journal Name: Molecular Genetics and Genomics
ISSN: 1617-4615
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Kelso, Janet (Ed.)
    Abstract
    Motivation: Genetic or epigenetic events can rewire molecular networks to induce extraordinary phenotypical divergences. Among the many network rewiring approaches, no model-free statistical methods can differentiate gene-gene pattern changes not attributed to marginal changes. This may obscure fundamental rewiring from superficial changes.
    Results: Here we introduce a model-free Sharma-Song test to determine if patterns differ in the second order, meaning that the deviation of the joint distribution from the product of marginal distributions is unequal across conditions. We prove an asymptotic chi-squared null distribution for the test statistic. Simulation studies demonstrate its advantage over alternative methods in detecting second-order differential patterns. Applying the test to three independent mammalian developmental transcriptome datasets, we report a lower frequency of co-expression network rewiring between human and mouse for the same tissue group than the frequency of rewiring between tissue groups within the same species. We also find second-order differential patterns between microRNA promoters and genes contrasting cerebellum and liver development in mice. These patterns are enriched in the spliceosome pathway regulating tissue specificity. Complementary to previous mammalian comparative studies mostly driven by first-order effects, our findings contribute an understanding of system-wide second-order gene network rewiring within and across mammalian systems. Second-order differential patterns constitute evidence for fundamentally rewired biological circuitry due to evolution, environment, or disease.
    Availability: The generic Sharma-Song test is available from the R package ‘DiffXTables’ at https://cran.r-project.org/package=DiffXTables. Other code and data are described in Methods.
    Supplementary information: Supplementary data are available at Bioinformatics online.
  2. Abstract

    The Omicron BA.1 variant emerged in late 2021 and quickly spread across the world. Compared to the earlier SARS-CoV-2 variants, BA.1 has many mutations, some of which are known to enable antibody escape. Many of these antibody-escape mutations individually decrease the spike receptor-binding domain (RBD) affinity for ACE2, but BA.1 still binds ACE2 with high affinity. The fitness and evolution of the BA.1 lineage are therefore driven by the combined effects of numerous mutations. Here, we systematically map the epistatic interactions between the 15 mutations in the RBD of BA.1 relative to the Wuhan Hu-1 strain. Specifically, we measure the ACE2 affinity of all possible combinations of these 15 mutations (2^15 = 32,768 genotypes), spanning all possible evolutionary intermediates from the ancestral Wuhan Hu-1 strain to BA.1. We find that immune escape mutations in BA.1 individually reduce ACE2 affinity but are compensated by epistatic interactions with other affinity-enhancing mutations, including Q498R and N501Y. Thus, the ability of BA.1 to evade immunity while maintaining ACE2 affinity is contingent on acquiring multiple interacting mutations. Our results implicate compensatory epistasis as a key factor driving substantial evolutionary change for SARS-CoV-2 and are consistent with Omicron BA.1 arising from a chronic infection.
  3. Abstract

    Variation in the metabolic costs associated with organismal maintenance may play a key role in determining fitness, and thus these differences among individuals are likely to be subject to natural selection. Although the evolvability of maintenance metabolism depends on its underlying genetic architecture, relatively little is known about the nature of genetic variation that underlies this trait. To address this, we measured variation in routine metabolic rate (O2routine), an index of maintenance metabolism, within and among three populations of Atlantic killifish, Fundulus heteroclitus, including a population from a region of genetic admixture between two subspecies. Polygenic association tests among individuals from the admixed population identified 54 single nucleotide polymorphisms (SNPs) that were associated with O2routine, and these SNPs accounted for 43% of interindividual variation in this trait. However, genetic associations with O2routine involved different SNPs if females and males were analysed separately, and there was a sex‐dependent effect of mitochondrial genotype on variation in routine metabolism. These results imply that there are sex‐specific genetic mechanisms, and potential mitonuclear interactions, that underlie variation in O2routine. Additionally, there was evidence for epistatic interactions between 17% of the possible pairs of trait‐associated SNPs, suggesting that epistatic effects on O2routine are common. These data demonstrate not only that phenotypic variation in this ecologically important trait has a polygenic basis with considerable epistasis among loci, but also that these underlying genetic mechanisms, and particularly the role of mitochondrial genotype, may be sex‐specific.
  4. Abstract

    Epistasis is an evolutionary phenomenon whereby the fitness effect of a mutation depends on the genetic background in which it arises. A key source of epistasis in an RNA molecule is its secondary structure, which contains functionally important topological motifs held together by hydrogen bonds between Watson–Crick (WC) base pairs. Here we study epistasis in the secondary structure of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by examining properties of derived alleles arising from substitution mutations at ancestral WC base-paired and unpaired (UP) sites in 15 conserved topological motifs across the genome. We uncover fewer derived alleles and lower derived allele frequencies at WC than at UP sites, supporting the hypothesis that modifications to the secondary structure are often deleterious. At WC sites, we also find lower derived allele frequencies for mutations that abolish base pairing than for those that yield G·U “wobbles,” illustrating that weak base pairing can partially preserve the integrity of the secondary structure. Last, we show that WC sites under the strongest epistatic constraint reside in a three-stemmed pseudoknot motif that plays an essential role in programmed ribosomal frameshifting, whereas those under the weakest epistatic constraint are located in 3’ UTR motifs that regulate viral replication and pathogenicity. Our findings demonstrate the importance of epistasis in the evolution of the SARS-CoV-2 secondary structure, as well as highlight putative structural and functional targets of different forms of natural selection.
  5. Abstract

    Understanding the consequences of exotic diseases on native forests is important to evolutionary ecology and conservation biology because exotic pathogens have drastically altered US eastern deciduous forests. Cornus florida L. (flowering dogwood tree) is one such species facing heavy mortality. Characterizing the genetic structure of C. florida populations and identifying the genetic signature of adaptation to dogwood anthracnose (an exotic pathogen responsible for high mortality) remain vital for conservation efforts. By integrating genetic data from genotype by sequencing (GBS) of 289 trees across the host species range and distribution of disease, we evaluated the spatial patterns of genetic variation and population genetic structure of C. florida and compared the pattern to the distribution of dogwood anthracnose. Using genome‐wide association study and gradient forest analysis, we identified genetic loci under selection and associated with ecological and diseased regions. The results revealed signals of weak genetic differentiation of three or more subgroups nested within two clusters, explaining up to 2%–6% of genetic variation. The groups largely corresponded to the regions within and outside the eastern Hot‐Continental ecoregion, which also overlapped with areas within and outside the main distribution of dogwood anthracnose. The fungal sequences contained in the GBS data of sampled trees bolstered visual records of disease at sampled locations and were congruent with the reported range of Discula destructiva, suggesting that fungal sequences within‐host genomic data were informative for detecting or predicting disease. The genetic diversity between populations at diseased vs. disease‐free sites across the range of C. florida showed no significant difference. We identified 72 single‐nucleotide polymorphisms (SNPs) from 68 loci putatively under selection, some of which exhibited abrupt turnover in allele frequencies along the borders of the Hot‐Continental ecoregion and the range of dogwood anthracnose. One such candidate SNP was independently identified in two prior studies as a possible L‐type lectin‐domain containing receptor kinase. Although diseased and disease‐free areas do not significantly differ in genetic diversity, overall there are slight trends to indicate marginally smaller amounts of genetic diversity in disease‐affected areas. Our results were congruent with previous studies that were based on a limited number of genetic markers in revealing high genetic variation and weak population structure in C. florida.