
Title: Epidemiological associations with genomic variation in SARS-CoV-2

SARS-CoV-2 (CoV) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. We divided the CoV genome into 29 constituent regions and applied novel analytical approaches to identify associations between CoV genomic features and epidemiological metadata. Our results show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and greatest correlation with the viral whole-genome variation. S protein variation is correlated with nsp3, nsp6, and 3′-to-5′ exonuclease variation. Country of origin and time since the start of the pandemic were the most influential metadata associated with genomic variation, while host sex and age were the least influential. We define a novel statistic—coherence—and show its utility in identifying geographic regions (populations) with unusually high (many new variants) or low (isolated) viral phylogenetic diversity. Interestingly, at both global and regional scales, we identify geographic locations with high coherence neighboring regions of low coherence; this emphasizes the utility of this metric to inform public health measures for disease spread. Our results provide a direction to prioritize genes associated with outcome predictors (e.g., health, therapeutic, and vaccine outcomes) and to improve DNA tests for predicting disease status.

Award ID(s):
2028280 2109688
Publication Date:
Journal Name:
Scientific Reports
Nature Publishing Group
Sponsoring Org:
National Science Foundation
More Like this
  1. Cimarelli, Andrea (Ed.)
    The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection causes Coronavirus Disease 2019 (COVID-19), a pandemic that seriously threatens global health. SARS-CoV-2 propagates by packaging its RNA genome into membrane enclosures in host cells. The packaging of the viral genome into the nascent virion is mediated by the nucleocapsid (N) protein, but the underlying mechanism remains unclear. Here, we show that the N protein forms biomolecular condensates with viral genomic RNA both in vitro and in mammalian cells. While the N protein forms spherical assemblies with homopolymeric RNA substrates that do not form base pairing interactions, it forms asymmetric condensates with viral RNA strands. Cross-linking mass spectrometry (CLMS) identified a region that drives interactions between N proteins in condensates, and deletion of this region disrupts phase separation. We also identified small molecules that alter the size and shape of N protein condensates and inhibit the proliferation of SARS-CoV-2 in infected cells. These results suggest that the N protein may utilize biomolecular condensation to package the SARS-CoV-2 RNA genome into a viral particle.
  2. Abstract

    The ongoing COVID-19 pandemic highlights the necessity for a more fundamental understanding of the coronavirus life cycle. The causative agent of the disease, SARS-CoV-2, is being studied extensively from a structural standpoint in order to gain insight into key molecular mechanisms required for its survival. Contained within the untranslated regions of the SARS-CoV-2 genome are various conserved stem-loop elements that are believed to function in RNA replication, viral protein translation, and discontinuous transcription. While the majority of these regions are variable in sequence, a 41-nucleotide s2m element within the genome 3′ untranslated region is highly conserved among coronaviruses and three other viral families. In this study, we demonstrate that the SARS-CoV-2 s2m element dimerizes by forming an intermediate homodimeric kissing complex structure that is subsequently converted to a thermodynamically stable duplex conformation. This process is aided by the viral nucleocapsid protein, potentially indicating a role in mediating genome dimerization. Furthermore, we demonstrate that the s2m element interacts with multiple copies of host cellular microRNA (miRNA) 1307-3p. Taken together, our results highlight the potential significance of the dimer structures formed by the s2m element in key biological processes and implicate the motif as a possible therapeutic drug target for COVID-19 and other coronavirus-related diseases.
  3. Lee, Benhur (Ed.)
    ABSTRACT Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has infected over 40 million people worldwide, with over 1 million deaths as of October 2020 and with multiple efforts in the development and testing of antiviral drugs and vaccines under way. In order to gain insights into SARS-CoV-2 evolution and drug targets, we investigated how and to what extent the SARS-CoV-2 genome sequence differs from those of other well-characterized human and animal coronavirus genomes, as well as how polymorphic SARS-CoV-2 genomes are generally. We ultimately sought to identify features in the SARS-CoV-2 genome that may contribute to its viral replication, host pathogenicity, and vulnerabilities. Our analyses suggest the presence of unique sequence signatures in the 3′ untranslated region (3′-UTR) of betacoronavirus lineage B, which phylogenetically encompasses SARS-CoV-2 and SARS-CoV as well as multiple groups of bat and animal coronaviruses. In addition, we identified genome-wide patterns of variation across different SARS-CoV-2 strains that likely reflect the effects of selection. Finally, we provide evidence for a possible host-microRNA-mediated interaction between the 3′-UTR and human microRNA hsa-miR-1307-3p based on the results of multiple computational target prediction analyses and an assessment of similar interactions involving the influenza A H1N1 virus. This interaction also suggests a possible survival mechanism, whereby a mutation in the SARS-CoV-2 3′-UTR leads to a weakened host immune response. The potential roles of host microRNAs in SARS-CoV-2 replication and infection and the exploitation of conserved features in the 3′-UTR as therapeutic targets warrant further investigation. IMPORTANCE The coronavirus disease 2019 (COVID-19) outbreak is having a dramatic global effect on public health and the economy. As of October 2020, SARS-CoV-2 has been detected in over 189 countries, has infected over 40 million people, and is responsible for more than 1 million deaths. The genome of SARS-CoV-2 is small but complex, and its functions and interactions with human host factors are being studied extensively. The significance of our study is that, using extensive SARS-CoV-2 genome analysis techniques, we identified potential interacting human host microRNA targets that share similarity with those of influenza A virus H1N1. Our study results will allow the development of virus-host interaction models that will enhance our understanding of SARS-CoV-2 pathogenesis and motivate the exploitation of both the interacting viral and host factors as therapeutic targets.
  4. Prasad, Vinayaka R. (Ed.)
    ABSTRACT The ongoing coronavirus disease 2019 (COVID-19) pandemic demonstrates the threat posed by novel coronaviruses to human health. Coronaviruses share a highly conserved cell entry mechanism mediated by the spike protein, the sole product of the S gene. The structural dynamics by which the spike protein orchestrates infection illuminate how antibodies neutralize virions and how S mutations contribute to viral fitness. Here, we review the process by which spike engages its proteinaceous receptor, angiotensin converting enzyme 2 (ACE2), and how host proteases prime and subsequently enable efficient membrane fusion between virions and target cells. We highlight mutations common among severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern and discuss implications for cell entry. Ultimately, we provide a model by which sarbecoviruses are activated for fusion competency and offer a framework for understanding the interplay between humoral immunity and the molecular evolution of the SARS-CoV-2 Spike. In particular, we emphasize the relevance of the Canyon Hypothesis (M. G. Rossmann, J Biol Chem 264:14587–14590, 1989) for understanding evolutionary trajectories of viral entry proteins during sustained intraspecies transmission of a novel viral pathogen.
  5. The transmission and evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are of paramount importance in controlling and combating the coronavirus disease 2019 (COVID-19) pandemic. Currently, over 15,000 SARS-CoV-2 single mutations have been recorded, which have a great impact on the development of diagnostics, vaccines, antibody therapies, and drugs. However, little is known about SARS-CoV-2’s evolutionary characteristics and general trend. In this work, we present a comprehensive genotyping analysis of existing SARS-CoV-2 mutations. We reveal that host immune response via APOBEC and ADAR gene editing gives rise to near 65% of recorded mutations. Additionally, we show that children under age five and the elderly may be at high risk from COVID-19 because of their overreaction to the viral infection. Moreover, we uncover that populations of Oceania and Africa react significantly more intensively to SARS-CoV-2 infection than those of Europe and Asia, which may explain why African Americans were shown to be at increased risk of dying from COVID-19, in addition to their high risk of COVID-19 infection caused by systemic health and social inequities. Finally, our study indicates that for two viral genome sequences of the same origin, their evolution order may be determined from the ratio of mutation type, C > T over T > C.