Title: Evolution of the SARS‐CoV‐2 proteome in three dimensions (3D) during the first 6 months of the COVID‐19 pandemic
Abstract

Understanding the molecular evolution of the SARS‐CoV‐2 virus as it continues to spread in communities around the globe is important for mitigation and future pandemic preparedness. Three‐dimensional structures of SARS‐CoV‐2 proteins and those of other coronaviruses archived in the Protein Data Bank were used to analyze viral proteome evolution during the first 6 months of the COVID‐19 pandemic. Analyses of spatial locations, chemical properties, and structural and energetic impacts of the observed amino acid changes in >48 000 viral isolates revealed how each of the 29 viral proteins has undergone amino acid changes. Catalytic residues in active sites and binding residues in protein–protein interfaces showed modest, but significant, numbers of substitutions, highlighting the mutational robustness of the viral proteome. Energetics calculations showed that the impact of substitutions on the thermodynamic stability of the proteome follows a universal bi‐Gaussian distribution. Detailed results are presented for potential drug discovery targets and the four structural proteins that comprise the virion, highlighting substitutions with the potential to impact protein structure, enzyme activity, and protein–protein and protein–nucleic acid interfaces. Characterizing the evolution of the virus in three dimensions provides testable insights into viral protein function and should aid in structure‐based drug discovery efforts as well as the prospective identification of amino acid substitutions with potential for drug resistance.
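
As an illustration of the bi‐Gaussian distribution mentioned above, the sketch below assumes that the term means a weighted sum of two normal densities over stability changes. That reading, the function names, and every parameter value are assumptions made for illustration; none of them are fitted values or methods taken from the paper.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def bi_gaussian_pdf(x, weight, mu1, sigma1, mu2, sigma2):
    """Weighted sum of two normal densities (assumed form of 'bi-Gaussian').

    weight is the fraction of probability mass in the first component.
    """
    return (weight * normal_pdf(x, mu1, sigma1)
            + (1.0 - weight) * normal_pdf(x, mu2, sigma2))


def integrate(f, lo, hi, n=20000):
    """Trapezoidal rule, used here only to check normalization."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Hypothetical parameters, chosen only to make the example concrete.
PARAMS = dict(weight=0.6, mu1=0.5, sigma1=1.0, mu2=3.0, sigma2=2.0)

if __name__ == "__main__":
    for ddg in (-1.0, 0.0, 1.0, 3.0, 6.0):
        print(f"{ddg:5.1f}  {bi_gaussian_pdf(ddg, **PARAMS):.4f}")
```

Under this assumed form, one component would describe substitutions with a small effect on stability and the other a broader tail of more destabilizing substitutions; the actual fitted shape is reported in the paper itself.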

 
Award ID(s):
1709170 1709278 1832184 1929237
NSF-PAR ID:
10445500
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Proteins: Structure, Function, and Bioinformatics
Volume:
90
Issue:
5
ISSN:
0887-3585
Page Range / eLocation ID:
p. 1054-1080
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. SARSNTdb offers a curated, nucleotide-centric database for users with varying levels of SARS-CoV-2 knowledge. Its user-friendly interface enables querying coding regions and coordinate intervals to find the functional and selective constraints that act upon the corresponding nucleotides and amino acids. Users can easily obtain information about viral genes and proteins, functional domains, repeats, secondary structure formation, intragenomic interactions, and mutation prevalence. Currently, many databases focus on phylogeny and amino acid substitutions, mainly in the spike protein. We took a novel, more nucleotide-focused approach, as RNA does more than just code for proteins and many insights can be gleaned from its study. For example, RNA-targeted drug therapies for SARS-CoV-2 are currently being developed, and it is essential to understand the features visible only at that level. This database enables the user to identify regions that are more prone to forming secondary structures that drugs can target. SARSNTdb also provides illustrative mutation data from a subset of ~25,000 patient samples with reliable read coverage across the whole genome (from different locations and time points in the pandemic). Finally, the database allows for comparing SARS-CoV-2 and SARS-CoV domains and sequences. SARSNTdb can serve the research community as a curated repository of information that gives a jump start on analyzing a mutation’s effect far beyond just determining synonymous/non-synonymous substitutions in protein sequences.
  2. The rapid spread of SARS-CoV-2 required immediate action to control the transmission of the virus and minimize its impact on humanity. The high mutation rate of the viral genome contributes to the virus’ ability to adapt quickly to environmental changes, impacts transmissibility and antigenicity, and may facilitate immune escape. Therefore, it is of great interest for researchers working in vaccine development and drug design to consider the impact of mutations on virus-drug interactions. Here, we propose a multitarget drug discovery pipeline for identifying potential drug candidates that can efficiently inhibit the Receptor Binding Domain (RBD) of spike glycoproteins from different variants of SARS-CoV-2. Eight homology models of RBDs for selected variants were created and validated using reference crystal structures. We then investigated interactions between the host receptor ACE2 and RBDs from nine variants of SARS-CoV-2. This led us to conclude that efficient multi-variant targeting drugs should be capable of blocking residues Q(R)493 and N487 in RBDs. Using molecular docking, molecular mechanics, and molecular dynamics, we identified three lead compounds (hesperidin, narirutin, and neohesperidin) suitable for multitarget SARS-CoV-2 inhibition. These compounds are flavanone glycosides found in citrus fruits – an active ingredient of Traditional Chinese Medicines. The developed pipeline can be further used to (1) model mutants for which crystal structures are not yet available and (2) scan a more extensive library of compounds against other mutated viral proteins.
  3. Abstract

    The glycosylation on the spike (S) protein of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, modulates the viral infection by altering conformational dynamics, receptor interaction and host immune responses. Several variants of concern (VOCs) of SARS-CoV-2 have evolved during the pandemic, and crucial mutations on the S protein of the virus have led to increased transmissibility and immune escape. In this study, we compare the site-specific glycosylation and overall glycomic profiles of the wild type Wuhan-Hu-1 strain (WT) S protein and five VOCs of SARS-CoV-2: Alpha, Beta, Gamma, Delta and Omicron. Interestingly, both N- and O-glycosylation sites on the S protein are highly conserved among the spike mutant variants, particularly at the sites on the receptor-binding domain (RBD). The conservation of glycosylation sites is noteworthy, as over 2 million SARS-CoV-2 S protein sequences have been reported with various amino acid mutations. Our detailed profiling of the glycosylation at each of the individual sites of the S protein across the variants revealed an intriguing possible association between the glycosylation pattern of the variants and their previously reported infectivity. While the sites are conserved, we observed changes in the N- and O-glycosylation profile across the variants. The newly emerged variants, which showed higher resistance to neutralizing antibodies and vaccines, displayed a decrease in the overall abundance of complex-type glycans with both fucosylation and sialylation and an increase in the oligomannose-type glycans across the sites. Among the variants, the glycosylation sites with significant changes in glycan profile were observed at both the N-terminal domain and RBD of S protein, with Omicron showing the highest deviation. The increase in oligomannose-type glycans happens sequentially from Alpha through Delta. Interestingly, Omicron does not contain more oligomannose-type glycans than Delta but does contain more than the WT and other VOCs. O-glycosylation at the RBD showed lower occupancy in the VOCs in comparison to the WT. Our study on the sites and pattern of glycosylation on the SARS-CoV-2 S proteins across the VOCs may help to understand how the virus evolved to trick the host immune system. Our study also highlights how the SARS-CoV-2 virus has conserved both N- and O-glycosylation sites on the S protein of the most successful variants even after undergoing extensive mutations, suggesting a correlation between infectivity/transmissibility and glycosylation.

     
  4. SARS-CoV-2, the cause of COVID-19, is a new, highly pathogenic coronavirus, which is the third coronavirus to emerge in the past 2 decades and the first to become a global pandemic. The virus has demonstrated itself to be extremely transmissible and deadly. Recent data suggest that a targeted approach is key to mitigating infectivity. Owing to the proliferation of cataloged protein and nucleic acid sequences in databases, the function of nucleic acids and genetically encoded proteins can be predicted by simply aligning sequences and exploring their homology. Thus, similar amino acid sequences in a protein usually confer similar biochemical function, even from distal or unrelated organisms. To understand viral transmission and adhesion, it is key to elucidate the structural, surface, and functional properties of each viral protein. This is typically first modeled in highly pathogenic species by exploring folding, hydrophobicity, and isoelectric point (IEP). Recent evidence from viral RNA sequence modeling and protein crystals has been inadequate, which prevents full understanding of the IEP and other viral properties of SARS-CoV-2. We have thus experimentally determined the IEP of SARS-CoV-2. Our findings suggest that for enveloped viruses, such as SARS-CoV-2, estimates of IEP by the amino acid sequence alone may be unreliable. We compared the experimental IEP of SARS-CoV-2 to variants of interest (VOIs) using their amino acid sequence, thus providing a qualitative comparison of the IEP of VOIs.
  5. Abstract

    The continued emergence of new SARS‐CoV‐2 variants has accentuated the growing need for fast and reliable methods for the design of potentially neutralizing antibodies (Abs) to counter immune evasion by the virus. Here, we report on the de novo computational design of high‐affinity Ab variable regions (Fv) through the recombination of VDJ genes targeting the most solvent‐exposed hACE2‐binding residues of the SARS‐CoV‐2 spike receptor binding domain (RBD) protein using the software tool OptMAVEn‐2.0. Subsequently, we carried out computational affinity maturation of the designed variable regions through amino acid substitutions for improved binding with the target epitope. Immunogenicity of designs was restricted by preferring designs that match sequences from a 9‐mer library of “human Abs” based on a human string content score. We generated 106 different antibody designs and reported in detail on the top five that trade‐off the greatest computational binding affinity for the RBD with human string content scores. We further describe computational evaluation of the top five designs produced by OptMAVEn‐2.0 using a Rosetta‐based approach. We used Rosetta SnugDock for local docking of the designs to evaluate their potential to bind the spike RBD and performed “forward folding” with DeepAb to assess their potential to fold into the designed structures. Ultimately, our results identified one designed Ab variable region, P1.D1, as a particularly promising candidate for experimental testing. This effort puts forth a computational workflow for the de novo design and evaluation of Abs that can quickly be adapted to target spike epitopes of emerging SARS‐CoV‐2 variants or other antigenic targets.

     