Title: Regulatory dissection of the severe COVID-19 risk locus introgressed by Neanderthals
Individuals infected with the SARS-CoV-2 virus present with a wide variety of symptoms ranging from asymptomatic to severe and even lethal outcomes. Past research has revealed a genetic haplotype on chromosome 3 that entered the human population via introgression from Neanderthals as the strongest genetic risk factor for the severe response to COVID-19. However, the specific variants along this introgressed haplotype that contribute to this risk and the biological mechanisms that are involved remain unclear. Here, we assess the variants present on the risk haplotype for their likelihood of driving the genetic predisposition to severe COVID-19 outcomes. We do this by first exploring their impact on the regulation of genes involved in COVID-19 infection using a variety of population genetics and functional genomics tools. We then perform a locus-specific massively parallel reporter assay to individually assess the regulatory potential of each allele on the haplotype in a multipotent immune-related cell line. We ultimately reduce the set of over 600 linked genetic variants to identify four introgressed alleles that are strong functional candidates for driving the association between this locus and severe COVID-19. Using reporter assays in the presence/absence of SARS-CoV-2, we find evidence that these variants respond to viral infection. These variants likely drive the locus’ impact on severity by modulating the regulation of two critical chemokine receptor genes: CCR1 and CCR5. These alleles are ideal targets for future functional investigations into the interaction between host genomics and COVID-19 outcomes.
Award ID(s):
2020205
NSF-PAR ID:
10451390
Journal Name:
eLife
Volume:
12
ISSN:
2050-084X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Falush, Daniel (Ed.)
    Although some variation introgressed from Neanderthals has undergone selective sweeps, little is known about its functional significance. We used a Massively Parallel Reporter Assay (MPRA) to assay 5,353 high-frequency introgressed variants for their ability to modulate gene expression within 170 bp of endogenous sequence. We identified 2,548 variants in active putative cis-regulatory elements (CREs) and 292 expression-modulating variants (emVars). These emVars are predicted to alter the binding motifs of important immune transcription factors, are enriched for associations with neutrophil and white blood cell count, and are associated with the expression of genes that function in innate immune pathways including inflammatory response and antiviral defense. We combined the MPRA data with other data sets to identify strong candidates to be driver variants of positive selection, including an emVar that may contribute to protection against severe COVID-19 response. We endogenously deleted two CREs containing expression-modulating variants linked to immune function, rs11624425 and rs80317430, identifying their primary genic targets as ELMSAN1 for the former and PAN2 and STAT2 for the latter, three genes differentially expressed during influenza infection. Overall, we present the first database of experimentally identified expression-modulating Neanderthal-introgressed alleles contributing to potential immune response in modern humans.
  2. SARS-CoV-2 has caused symptomatic COVID-19 and widespread death across the globe. We sought to determine genetic variants contributing to COVID-19 susceptibility and hospitalization in a large biobank linked to a national United States health system. We identified 19,168 (3.7%) lab-confirmed COVID-19 cases among Million Veteran Program participants between March 1, 2020, and February 2, 2021, including 11,778 Whites, 4,893 Blacks, and 2,497 Hispanics. A multi-population genome-wide association study (GWAS) for COVID-19 outcomes identified four independent genetic variants (rs8176719, rs73062389, rs60870724, and rs73910904) contributing to COVID-19 positivity, including one novel locus found exclusively among Hispanics. We replicated eight of nine previously reported genetic associations at an alpha of 0.05 in at least one population-specific or the multi-population meta-analysis for one of the four MVP COVID-19 outcomes. We used rs8176719 and three additional variants to accurately infer ABO blood types. We found that A, AB, and B blood types were associated with testing positive for COVID-19 compared with O blood type with the highest risk for the A blood group. We did not observe any genome-wide significant associations for COVID-19 severity outcomes among those testing positive. Our study replicates prior GWAS findings associated with testing positive for COVID-19 among mostly White samples and extends findings at three loci to Black and Hispanic individuals. We also report a new locus among Hispanics requiring further investigation. These findings may aid in the identification of novel therapeutic agents to decrease the morbidity and mortality of COVID-19 across all major ancestral populations. 
  3. The human angiotensin-converting enzyme 2 (ACE2) and transmembrane serine protease 2 (TMPRSS2) proteins play key roles in the cellular internalization of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the coronavirus responsible for the coronavirus disease of 2019 (COVID-19) pandemic. We set out to functionally characterize the ACE2 and TMPRSS2 protein abundance for variant alleles encoding these proteins that contained non-synonymous single-nucleotide polymorphisms (nsSNPs) in their open reading frames (ORFs). Specifically, a high-throughput assay, deep mutational scanning (DMS), was employed to test the functional implications of nsSNPs, which are variants of uncertain significance in these two genes. We used a ‘landing pad’ system designed to quantify the protein expression for 433 nsSNPs that have been observed in the ACE2 and TMPRSS2 ORFs and found that 8 of 127 ACE2, 19 of 157 TMPRSS2 isoform 1 and 13 of 149 TMPRSS2 isoform 2 variant proteins displayed less than ~25% of the wild-type protein expression, whereas 4 ACE2 variants displayed 25% or greater increases in protein expression. As a result, we concluded that nsSNPs in genes encoding ACE2 and TMPRSS2 might influence SARS-CoV-2 infectivity. These results can now be applied to DNA sequence data for patients infected with SARS-CoV-2 to determine the possible impact of patient-based DNA sequence variation on the clinical course of SARS-CoV-2 infection.
  4. The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of COVID-19. The main receptor of SARS-CoV-2, angiotensin I converting enzyme 2 (ACE2), is now undergoing extensive scrutiny to understand the routes of transmission and sensitivity in different species. Here, we utilized a unique dataset of ACE2 sequences from 410 vertebrate species, including 252 mammals, to study the conservation of ACE2 and its potential to be used as a receptor by SARS-CoV-2. We designed a five-category binding score based on the conservation properties of 25 amino acids important for the binding between ACE2 and the SARS-CoV-2 spike protein. Only mammals fell into the medium to very high categories and only catarrhine primates into the very high category, suggesting that they are at high risk for SARS-CoV-2 infection. We employed a protein structural analysis to qualitatively assess whether amino acid changes at variable residues would be likely to disrupt ACE2/SARS-CoV-2 spike protein binding and found the number of predicted unfavorable changes significantly correlated with the binding score. Extending this analysis to human population data, we found only rare (frequency <0.001) variants in 10/25 binding sites. In addition, we found significant signals of selection and accelerated evolution in the ACE2 coding sequence across all mammals, and specific to the bat lineage. Our results, if confirmed by additional experimental data, may lead to the identification of intermediate host species for SARS-CoV-2, guide the selection of animal models of COVID-19, and assist the conservation of animals both in native habitats and in human care.
  5. The COVID-19 pandemic, caused by the coronavirus SARS-CoV-2, has resulted in the loss of millions of lives and severe global economic consequences. Every time SARS-CoV-2 replicates, the virus can acquire new mutations in its genome. Mutations in SARS-CoV-2 genomes have led to increased transmissibility, severe disease outcomes, evasion of the immune response, changes in clinical manifestations and reduced efficacy of vaccines or treatments. To date, multiple resources provide lists of detected mutations without key functional annotations. There is a lack of research examining the relationship between mutations and various factors such as disease severity, pathogenicity, patient age, patient gender, cross-species transmission, viral immune escape, immune response level, viral transmission capability, viral evolution, host adaptability, viral protein structure, viral protein function, viral protein stability and concurrent mutations. A deep understanding of the relationship between mutation sites and these factors is crucial for advancing our knowledge of SARS-CoV-2 and for developing effective responses. To fill this gap, we built COV2Var, a function annotation database of SARS-CoV-2 genetic variation, available at http://biomedbdc.wchscu.cn/COV2Var/. COV2Var aims to identify common mutations in SARS-CoV-2 variants and assess their effects, providing a valuable resource for intensive functional annotations of common mutations among SARS-CoV-2 variants.