Title: The lingering effects of Neanderthal introgression on human complex traits
The genetic variants introduced into the ancestors of modern humans from interbreeding with Neanderthals have been suggested to contribute to an unexpected extent to complex human traits. However, testing this hypothesis has been challenging due to the idiosyncratic population genetic properties of introgressed variants. We developed rigorous methods to assess the contribution of introgressed Neanderthal variants to heritable trait variation and applied these methods to analyze 235,592 introgressed Neanderthal variants and 96 distinct phenotypes measured in about 300,000 unrelated white British individuals in the UK Biobank. Introgressed Neanderthal variants make a significant contribution to trait variation (explaining 0.12% of trait variation on average). However, the contribution of introgressed variants tends to be significantly depleted relative to modern human variants matched for allele frequency and linkage disequilibrium (about 59% depletion on average), consistent with purifying selection on introgressed variants. Unlike previous studies (McArthur et al., 2021), we find no evidence for elevated heritability across the phenotypes examined. We identified 348 independent significant associations of introgressed Neanderthal variants with 64 phenotypes. Previous work (Skov et al., 2020) has suggested that a majority of such associations are likely driven by statistical association with nearby modern human variants that are the true causal variants. Applying a customized fine-mapping approach, we identified 112 regions across 47 phenotypes containing 4303 unique genetic variants where introgressed variants are highly likely to have a phenotypic effect. Examination of these variants reveals their substantial impact on genes that are important for the immune system, development, and metabolism.
Award ID(s):
1943497
PAR ID:
10423504
Journal Name:
eLife
Volume:
12
ISSN:
2050-084X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Falush, Daniel (Ed.)
    Abstract Although some variation introgressed from Neanderthals has undergone selective sweeps, little is known about its functional significance. We used a Massively Parallel Reporter Assay (MPRA) to assay 5,353 high-frequency introgressed variants for their ability to modulate the gene expression within 170 bp of endogenous sequence. We identified 2,548 variants in active putative cis-regulatory elements (CREs) and 292 expression-modulating variants (emVars). These emVars are predicted to alter the binding motifs of important immune transcription factors, are enriched for associations with neutrophil and white blood cell count, and are associated with the expression of genes that function in innate immune pathways including inflammatory response and antiviral defense. We combined the MPRA data with other data sets to identify strong candidates to be driver variants of positive selection including an emVar that may contribute to protection against severe COVID-19 response. We endogenously deleted two CREs containing expression-modulation variants linked to immune function, rs11624425 and rs80317430, identifying their primary genic targets as ELMSAN1, and PAN2 and STAT2, respectively, three genes differentially expressed during influenza infection. Overall, we present the first database of experimentally identified expression-modulating Neanderthal-introgressed alleles contributing to potential immune response in modern humans. 
  2. Abstract Classical genetic studies have identified many cases of pleiotropy where mutations in individual genes alter many different phenotypes. Quantitative genetic studies of natural genetic variants frequently examine one or a few traits, limiting their potential to identify pleiotropic effects of natural genetic variants. Widely adopted community association panels have been employed by plant genetics communities to study the genetic basis of naturally occurring phenotypic variation in a wide range of traits. High-density genetic marker data—18M markers—from 2 partially overlapping maize association panels comprising 1,014 unique genotypes grown in field trials across at least 7 US states and scored for 162 distinct trait data sets enabled the identification of 2,154 suggestive marker-trait associations and 697 confident associations in the maize genome using a resampling-based genome-wide association strategy. The precision of individual marker-trait associations was estimated to be 3 genes based on a reference set of genes with known phenotypes. Examples were observed of both genetic loci associated with variation in diverse traits (e.g., above-ground and below-ground traits), as well as individual loci associated with the same or similar traits across diverse environments. Many significant signals are located near genes whose functions were previously entirely unknown or estimated purely via functional data on homologs. This study demonstrates the potential of mining community association panel data using new higher-density genetic marker sets combined with resampling-based genome-wide association tests to develop testable hypotheses about gene functions, identify potential pleiotropic effects of natural genetic variants, and study genotype-by-environment interaction.
  3. Introduction: Autoimmune disorders (ADs) are a group of about 80 disorders that occur when self-attacking autoantibodies are produced due to failure in the self-tolerance mechanisms. ADs are polygenic disorders and associations with genes both in the human leukocyte antigen (HLA) region and outside of it have been described. Previous studies have shown that they are highly comorbid with shared genetic risk factors, while epidemiological studies revealed associations between various lifestyle and health-related phenotypes and ADs. Methods: Here, for the first time, we performed a comparative polygenic risk score (PRS) - Phenome Wide Association Study (PheWAS) for 11 different ADs (Juvenile Idiopathic Arthritis, Primary Sclerosing Cholangitis, Celiac Disease, Multiple Sclerosis, Rheumatoid Arthritis, Psoriasis, Myasthenia Gravis, Type 1 Diabetes, Systemic Lupus Erythematosus, Vitiligo Late Onset, Vitiligo Early Onset) and 3,254 phenotypes available in the UK Biobank that include a wide range of socio-demographic, lifestyle and health-related outcomes. Additionally, we investigated the genetic relationships of the studied ADs, calculating their genetic correlation and conducting cross-disorder GWAS meta-analyses for the observed AD clusters. Results: In total, we identified 508 phenotypes significantly associated with at least one AD PRS. 272 phenotypes were significantly associated after excluding variants in the HLA region from the PRS estimation. Through genetic correlation and genetic factor analyses, we identified four genetic factors that run across studied ADs. Cross-trait meta-analyses within each factor revealed pleiotropic genome-wide significant loci. Discussion: Overall, our study confirms the association of different factors with genetic susceptibility for ADs and reveals novel observations that need to be further explored.
  4. Our knowledge of the contribution of genetic interactions (epistasis) to variation in human complex traits remains limited, partly due to the lack of efficient, powerful, and interpretable algorithms to detect interactions. Recently proposed approaches for set-based association tests show promise in improving the power to detect epistasis by examining the aggregated effects of multiple variants. Nevertheless, these methods either do not scale to large Biobank data sets or lack interpretability. We propose QuadKAST, a scalable algorithm focused on testing pairwise interaction effects (quadratic effects) within small to medium-sized sets of genetic variants (window size ≤100) on a trait and provide quantified interpretation of these effects. Comprehensive simulations show that QuadKAST is well-calibrated. Additionally, QuadKAST is highly sensitive in detecting loci with epistatic signals and accurate in its estimation of quadratic effects. We applied QuadKAST to 52 quantitative phenotypes measured in ≈300,000 unrelated white British individuals in the UK Biobank to test for quadratic effects within each of 9515 protein-coding genes. We detect 32 trait-gene pairs across 17 traits and 29 genes that demonstrate statistically significant signals of quadratic effects (accounting for the number of genes and traits tested). Across these trait-gene pairs, the proportion of trait variance explained by quadratic effects is comparable to additive effects, with five pairs having a ratio >1. Our method enables the detailed investigation of epistasis on a large scale, offering new insights into its role and importance. 
  5. Context: Stalk lodging causes up to 43 % of yield losses in maize (Zea mays L.) worldwide, significantly worsening food and feed shortages. Stalk lodging resistance is a complex trait specified by several structural, material, and geometric phenotypes. However, the identity, relative contribution, and genetic tractability of these intermediate phenotypes remain unknown. Objective: The study is designed to identify and evaluate plant-, organ-, and tissue-level intermediate phenotypes associated with stalk lodging resistance following standardized phenotyping protocols and to understand the variation and genetic tractability of these intermediate phenotypes. Methods: We examined 16 diverse maize hybrids in two environments to identify and evaluate intermediate phenotypes associated with stalk flexural stiffness, a reliable indicator of stalk lodging resistance, at physiological maturity. Engineering-informed and machine learning models were employed to understand relationships among intermediate phenotypes and stalk flexural stiffness. Results: Stalk flexural stiffness showed significant genetic variation and high heritability (0.64) in the evaluated hybrids. Significant genetic variation and comparable heritability for the cross-sectional moment of inertia and Young’s modulus indicated that geometric and material properties are under tight genetic control and play a combinatorial role in determining stalk lodging resistance. Among the twelve internode-level traits measured on the bottom and the ear internode, most traits exhibited significant genetic variation among hybrids, moderate to high heritability, and considerable effect of genotype × environment interaction. The marginal statistical model based on structural engineering beam theory revealed that 74–80 % of the phenotypic variation for flexural stiffness was explained by accounting for the major diameter, minor diameter, and rind thickness of the stalks. 
     The machine learning model explained a relatively modest proportion (58–62 %) of the variation for flexural stiffness.