Title: Predicting binding affinities of emerging variants of SARS-CoV-2 using spike protein sequencing data: observations, caveats and recommendations
Abstract Predicting protein properties from amino acid sequences is an important problem in biology and pharmacology. Protein–protein interactions among SARS-CoV-2 spike protein, human receptors and antibodies are key determinants of the potency of this virus and its ability to evade the human immune response. As a rapidly evolving virus, SARS-CoV-2 has already developed into many variants with considerable variation in virulence among these variants. Utilizing the proteomic data of SARS-CoV-2 to predict its viral characteristics will, therefore, greatly aid in disease control and prevention. In this paper, we review and compare recent successful prediction methods based on long short-term memory (LSTM), transformer, convolutional neural network (CNN) and a similarity-based topological regression (TR) model, and offer recommendations about appropriate predictive methodology depending on the similarity between training and test datasets. We compare the effectiveness of these models in predicting the binding affinity and expression of SARS-CoV-2 spike protein sequences. We also explore how effective these predictive methods are when trained on laboratory-created data and tasked with predicting the binding affinity of the in-the-wild SARS-CoV-2 spike protein sequences obtained from the GISAID datasets. We observe that TR is a better method when the sample size is small and test protein sequences are sufficiently similar to the training sequences. However, when the training sample size is sufficiently large and prediction requires extrapolation, LSTM-embedding and CNN-based predictive models show superior performance.
Award ID(s):
2007903
PAR ID:
10356369
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Briefings in Bioinformatics
Volume:
23
Issue:
3
ISSN:
1467-5463
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The glycosylation on the spike (S) protein of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, modulates the viral infection by altering conformational dynamics, receptor interaction and host immune responses. Several variants of concern (VOCs) of SARS-CoV-2 have evolved during the pandemic, and crucial mutations on the S protein of the virus have led to increased transmissibility and immune escape. In this study, we compare the site-specific glycosylation and overall glycomic profiles of the wild type Wuhan-Hu-1 strain (WT) S protein and five VOCs of SARS-CoV-2: Alpha, Beta, Gamma, Delta and Omicron. Interestingly, both N- and O-glycosylation sites on the S protein are highly conserved among the spike mutant variants, particularly at the sites on the receptor-binding domain (RBD). The conservation of glycosylation sites is noteworthy, as over 2 million SARS-CoV-2 S protein sequences have been reported with various amino acid mutations. Our detailed profiling of the glycosylation at each of the individual sites of the S protein across the variants revealed an intriguing possible association between the glycosylation pattern of the variants and their previously reported infectivity. While the sites are conserved, we observed changes in the N- and O-glycosylation profile across the variants. The newly emerged variants, which showed higher resistance to neutralizing antibodies and vaccines, displayed a decrease in the overall abundance of complex-type glycans with both fucosylation and sialylation and an increase in the oligomannose-type glycans across the sites. Among the variants, the glycosylation sites with significant changes in glycan profile were observed at both the N-terminal domain and RBD of S protein, with Omicron showing the highest deviation. The increase in oligomannose-type glycans happens sequentially from Alpha through Delta. Interestingly, Omicron does not contain more oligomannose-type glycans compared to Delta but does contain more compared to the WT and other VOCs. O-glycosylation at the RBD showed lower occupancy in the VOCs in comparison to the WT. Our study on the sites and pattern of glycosylation on the SARS-CoV-2 S proteins across the VOCs may help to understand how the virus evolved to trick the host immune system. Our study also highlights how the SARS-CoV-2 virus has conserved both N- and O-glycosylation sites on the S protein of the most successful variants even after undergoing extensive mutations, suggesting a correlation between infectivity/transmissibility and glycosylation.
  2. Abstract SARS-CoV-2, especially B.1.1.529/omicron and its sublineages, continues to mutate to evade monoclonal antibodies and antibodies elicited by vaccination. Affinity-enhanced soluble ACE2 (sACE2) is an alternative strategy that works by binding the SARS-CoV-2 S protein, acting as a ‘decoy’ to block the interaction between the S protein and human ACE2. Using a computational design strategy, we designed an affinity-enhanced ACE2 decoy, FLIF, that exhibited tight binding to SARS-CoV-2 delta and omicron variants. Our computationally calculated absolute binding free energies (ABFE) between sACE2:SARS-CoV-2 S proteins and their variants showed excellent agreement with binding experiments. FLIF displayed robust therapeutic utility against a broad range of SARS-CoV-2 variants and sarbecoviruses, and neutralized omicron BA.5 in vitro and in vivo. Furthermore, we directly compared the in vivo therapeutic efficacy of wild-type ACE2 (non-affinity enhanced ACE2) against FLIF. A few wild-type sACE2 decoys have been shown to be effective against early circulating variants such as Wuhan in vivo. Our data suggest that moving forward, affinity-enhanced ACE2 decoys like FLIF may be required to combat evolving SARS-CoV-2 variants. The approach described herein emphasizes how computational methods have become sufficiently accurate for the design of therapeutics against viral protein targets. Affinity-enhanced ACE2 decoys remain highly effective at neutralizing omicron subvariants.
  3. While the COVID-19 pandemic continues to worsen, effective medicines that target the life cycle of SARS-CoV-2 are still under development. As more highly infective and dangerous variants of the coronavirus emerge, the protective power of vaccines will decrease or vanish. Thus, the development of drugs that are free of drug resistance is direly needed. The aim of this study is to identify allosteric binding modulators from a large compound library to inhibit the binding between the Spike protein of the SARS-CoV-2 virus and human angiotensin-converting enzyme 2 (hACE2). The binding of the Spike protein to hACE2 is the first step of the infection of host cells by the coronavirus. We first built a compound library containing 77 448 antiviral compounds. Molecular docking was then conducted to preliminarily screen compounds that can potently bind to the Spike protein at two allosteric binding sites. Next, molecular dynamics simulations were performed to accurately calculate the binding affinity between the Spike protein and a compound identified from docking screening and to investigate whether the compound can interfere with the binding between the Spike protein and hACE2. We successfully identified two possible drug binding sites on the Spike protein and discovered a series of antiviral compounds that can weaken the interaction between the Spike protein and the hACE2 receptor through conformational changes of the key Spike residues at the Spike–hACE2 binding interface, induced by the binding of the ligand at the allosteric binding site. We also applied our screening protocol to another compound library consisting of 3407 compounds for which the inhibitory activities of Spike/hACE2 binding were measured. Encouragingly, in vitro data support that the identified compounds can inhibit the Spike–ACE2 binding. Thus, we developed a promising computational protocol to discover allosteric inhibitors of the binding of the Spike protein of SARS-CoV-2 to the hACE2 receptor, and several promising allosteric modulators were discovered.
  4. Although COVID-19 transmission has been reduced by the advent of vaccinations and a variety of rapid monitoring techniques, the SARS-CoV-2 virus itself has shown a remarkable ability to mutate and persist. With this long track record of immune escape, researchers are still exploring prophylactic treatments to curtail future SARS-CoV-2 variants. Specifically, much focus has been placed on the antiviral lectin Griffithsin in preventing spike protein-mediated infection via the hACE2 receptor (direct infection). However, an oft-overlooked aspect of SARS-CoV-2 infection is viral capture by attachment receptors such as DC-SIGN, which is thought to facilitate the initial stages of COVID-19 infection in the lung tissue (called trans-infection). In addition, while immune escape is dictated by mutations in the spike protein, coronaviral virions also incorporate M, N, and E structural proteins within the particle. In this paper, we explored how several structural facets of both the SARS-CoV-2 virion and the antiviral lectin Griffithsin can affect and attenuate the infectivity of SARS-CoV-2 pseudovirus. We found that Griffithsin was a better inhibitor of hACE2-mediated direct infection when the coronaviral M protein is present compared to when it is absent (possibly providing an explanation regarding why Griffithsin shows better inhibition against authentic SARS-CoV-2 as opposed to pseudotyped viruses, which generally do not contain M) and that Griffithsin was not an effective inhibitor of DC-SIGN-mediated trans-infection. Furthermore, we found that DC-SIGN appeared to mediate trans-infection exclusively via binding to the SARS-CoV-2 spike protein, with no significant effect observed when other viral proteins (M, N, and/or E) were present. These results provide etiological data that may help to direct the development of novel antiviral treatments, either by leveraging Griffithsin binding to the M protein as a novel strategy to prevent SARS-CoV-2 infection or by narrowing efforts to inhibit trans-infection to focus on DC-SIGN binding to SARS-CoV-2 spike protein.
  5. Abstract SARS-CoV-2 variants with spike (S)-protein D614G mutations now predominate globally. We therefore compare the properties of the mutated S protein (S G614) with the original (S D614). We report here that pseudoviruses carrying S G614 enter ACE2-expressing cells more efficiently than those with S D614. This increased entry correlates with less S1-domain shedding and higher S-protein incorporation into the virion. Similar results are obtained with virus-like particles produced with SARS-CoV-2 M, N, E, and S proteins. However, D614G does not alter S-protein binding to ACE2 or neutralization sensitivity of pseudoviruses. Thus, D614G may increase infectivity by assembling more functional S protein into the virion.