

Title: TOWARDS INTERPRETING ZOONOTIC POTENTIAL OF BETACORONAVIRUS SEQUENCES WITH ATTENTION
Current methods for viral discovery target evolutionarily conserved proteins that accurately identify virus families but remain unable to distinguish the zoonotic potential of newly discovered viruses. Here, we apply an attention-enhanced long short-term memory (LSTM) deep neural net classifier to a highly conserved viral protein target to predict zoonotic potential across betacoronaviruses. The classifier achieves 94% accuracy. Analysis and visualization of attention over sequence- and structure-level features indicate a possible association between important protein-protein interactions governing viral replication in zoonotic betacoronaviruses and zoonotic transmission.
Award ID(s): 1717282
NSF-PAR ID: 10350220
Journal Name: ICLR
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. The s2m, a highly conserved 41-nt hairpin structure in the SARS-CoV-2 genome, serves as an attractive therapeutic target that may have important roles in the virus life cycle or interactions with the host. However, the conserved s2m in Delta SARS-CoV-2, a previously dominant variant characterized by high infectivity and disease severity, has received less attention than that of the original SARS-CoV-2 virus. The focus of this work is to identify and define the s2m changes between Delta and the original SARS-CoV-2 and the subsequent impact of those changes upon s2m dimerization and interactions with the host microRNA miR-1307-3p. Bioinformatics analysis of the GISAID database targeting the s2m element reveals a >99% correlation of a single nucleotide mutation at the 15th position (G15U) in Delta SARS-CoV-2. Based on ¹H NMR spectroscopy assignments comparing the imino proton resonance region of s2m and s2m G15U at 19°C, we show that the U15–A29 base pair closes, resulting in a stabilization of the upper stem without overall secondary structure deviation. Increased stability of the upper stem did not affect the chaperone activity of the viral N protein, as it was still able to convert the kissing dimers formed by s2m G15U into a stable duplex conformation, consistent with the s2m reference. However, we show that the s2m G15U mutation drastically impacts the binding of host miR-1307-3p. These findings demonstrate that the observed G15U mutation alters the secondary structure of s2m with subsequent impact on viral binding of host miR-1307-3p, with potential consequences on immune responses.

     
  2. The bovine immune system is known for its unusual traits relating to immunoglobulin and antiviral responses. Peptidylarginine deiminases (PADs) are phylogenetically conserved enzymes that cause post-translational deimination, contributing to protein moonlighting in health and disease. PADs also regulate extracellular vesicle (EV) release, forming a critical part of cellular communication. As PAD-mediated mechanisms in bovine immunology and physiology remain to be investigated, this study profiled deimination signatures in serum and serum-EVs in Bos taurus. Bos EVs were poly-dispersed in a 70–500 nm size range and showed differences in deiminated protein cargo, compared with whole sera. Key immune, metabolic and gene regulatory proteins were identified to be post-translationally deiminated with some overlapping hits in sera and EVs (e.g., immunoglobulins), while some were unique to either serum or serum-EVs (e.g., histones). Protein–protein interaction network analysis of deiminated proteins revealed KEGG pathways common for serum and serum-EVs, including complement and coagulation cascades, viral infection (enveloped viruses), viral myocarditis, bacterial and parasitic infections, autoimmune disease, immunodeficiency, intestinal IgA production, B-cell receptor signalling, natural killer cell mediated cytotoxicity, platelet activation and hematopoiesis, alongside metabolic pathways including ferroptosis, vitamin digestion and absorption, cholesterol metabolism and mineral absorption. KEGG pathways specific to EVs related to HIF-1 signalling, oestrogen signalling and biosynthesis of amino acids. KEGG pathways specific to serum related to Epstein–Barr virus infection, transcription mis-regulation in cancer, bladder cancer, Rap1 signalling pathway, calcium signalling pathway and ECM-receptor interaction. This indicates differences in physiological and pathological pathways for deiminated proteins in serum-EVs, compared with serum. Our findings may shed light on a number of pathological and anti-pathogenic (viral, bacterial, parasitic) pathways, with putative translatable value to human pathologies, zoonotic diseases and development of therapies for infections, including anti-viral therapies.
  3. The ongoing COVID-19 pandemic highlights the necessity for a more fundamental understanding of the coronavirus life cycle. The causative agent of the disease, SARS-CoV-2, is being studied extensively from a structural standpoint in order to gain insight into key molecular mechanisms required for its survival. Contained within the untranslated regions of the SARS-CoV-2 genome are various conserved stem-loop elements that are believed to function in RNA replication, viral protein translation, and discontinuous transcription. While the majority of these regions are variable in sequence, a 41-nucleotide s2m element within the genome 3′ untranslated region is highly conserved among coronaviruses and three other viral families. In this study, we demonstrate that the SARS-CoV-2 s2m element dimerizes by forming an intermediate homodimeric kissing complex structure that is subsequently converted to a thermodynamically stable duplex conformation. This process is aided by the viral nucleocapsid protein, potentially indicating a role in mediating genome dimerization. Furthermore, we demonstrate that the s2m element interacts with multiple copies of host cellular microRNA (miRNA) 1307-3p. Taken together, our results highlight the potential significance of the dimer structures formed by the s2m element in key biological processes and implicate the motif as a possible therapeutic drug target for COVID-19 and other coronavirus-related diseases.

     
  4. Replication of the coronavirus genome starts with the formation of viral RNA-containing double-membrane vesicles (DMV) following viral entry into the host cell. The multi-domain nonstructural protein 3 (nsp3) is the largest protein encoded by the known coronavirus genome and serves as a central component of the viral replication and transcription machinery. Previous studies demonstrated that the highly conserved C-terminal region of nsp3 is essential for subcellular membrane rearrangement, yet the underlying mechanisms remain elusive. Here we report the crystal structure of the CoV-Y domain, the most C-terminal domain of the SARS-CoV-2 nsp3, at 2.4 Å resolution. CoV-Y adopts a previously uncharacterized V-shaped fold featuring three distinct subdomains. Sequence alignment and structure prediction suggest that this fold is likely shared by the CoV-Y domains from closely related nsp3 homologs. NMR-based fragment screening combined with molecular docking identifies surface cavities in CoV-Y for interaction with potential ligands and other nsps. These studies provide the first structural view of a complete nsp3 CoV-Y domain, and the molecular framework for understanding the architecture, assembly and function of the nsp3 C-terminal domains in coronavirus replication. Our work illuminates nsp3 as a potential target for therapeutic interventions to aid in the ongoing battle against the COVID-19 pandemic and diseases caused by other coronaviruses.

     
  5. Back-and-forth transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) between humans and animals will establish wild reservoirs of virus that endanger long-term efforts to control COVID-19 in people and to protect vulnerable animal populations. Better targeting of surveillance and laboratory experiments to validate zoonotic potential requires predicting high-risk host species. A major bottleneck to this effort is the small number of species with available sequences for the angiotensin-converting enzyme 2 (ACE2) receptor, a key receptor required for viral cell entry. We overcome this bottleneck by combining species' ecological and biological traits with three-dimensional modelling of host-virus protein–protein interactions using machine learning. This approach enables predictions about the zoonotic capacity of SARS-CoV-2 for more than 5000 mammals, an order of magnitude more species than previously possible. Our predictions are strongly corroborated by in vivo studies. The predicted zoonotic capacity and proximity to humans suggest enhanced transmission risk from several common mammals, and priority areas of geographic overlap between these species and global COVID-19 hotspots. With molecular data available for only a small fraction of potential animal hosts, linking data across biological scales offers a conceptual advance that may expand our predictive modelling capacity for zoonotic viruses with similarly unknown host ranges.