Title: Evolution of the SARS‐CoV‐2 proteome in three dimensions (3D) during the first 6 months of the COVID‐19 pandemic
Abstract: Understanding the molecular evolution of the SARS‐CoV‐2 virus as it continues to spread in communities around the globe is important for mitigation and future pandemic preparedness. Three‐dimensional structures of SARS‐CoV‐2 proteins and those of other coronaviruses archived in the Protein Data Bank were used to analyze viral proteome evolution during the first 6 months of the COVID‐19 pandemic. Analyses of spatial locations, chemical properties, and structural and energetic impacts of the observed amino acid changes in >48 000 viral isolates revealed how each one of 29 viral proteins has undergone amino acid changes. Catalytic residues in active sites and binding residues in protein–protein interfaces showed modest, but significant, numbers of substitutions, highlighting the mutational robustness of the viral proteome. Energetics calculations showed that the impact of substitutions on the thermodynamic stability of the proteome follows a universal bi‐Gaussian distribution. Detailed results are presented for potential drug discovery targets and the four structural proteins that comprise the virion, highlighting substitutions with the potential to impact protein structure, enzyme activity, and protein–protein and protein–nucleic acid interfaces. Characterizing the evolution of the virus in three dimensions provides testable insights into viral protein function and should aid in structure‐based drug discovery efforts as well as the prospective identification of amino acid substitutions with potential for drug resistance.
Award ID(s):
1709170 1709278 1832184 1929237
PAR ID:
10445500
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Proteins: Structure, Function, and Bioinformatics
Volume:
90
Issue:
5
ISSN:
0887-3585
Page Range / eLocation ID:
p. 1054-1080
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. SARS-CoV-2, the cause of COVID-19, is a new, highly pathogenic coronavirus, the third coronavirus to emerge in the past 2 decades and the first to cause a global pandemic. The virus has demonstrated itself to be extremely transmissible and deadly. Recent data suggest that a targeted approach is key to mitigating infectivity. Because of the proliferation of cataloged protein and nucleic acid sequences in databases, we can predict the function of nucleic acids and genetically encoded proteins by simply aligning sequences and exploring their homology. Thus, similar amino acid sequences in a protein usually confer similar biochemical function, even from distal or unrelated organisms. To understand viral transmission and adhesion, it is key to elucidate the structural, surface, and functional properties of each viral protein. This is typically first modeled in highly pathogenic species by exploring folding, hydrophobicity, and isoelectric point (IEP). Recent evidence from viral RNA sequence modeling and protein crystals has been inadequate, which prevents full understanding of the IEP and other viral properties of SARS-CoV-2. We have thus experimentally determined the IEP of SARS-CoV-2. Our findings suggest that for enveloped viruses, such as SARS-CoV-2, estimates of IEP by the amino acid sequence alone may be unreliable. We compared the experimental IEP of SARS-CoV-2 to variants of interest (VOIs) using their amino acid sequence, thus providing a qualitative comparison of the IEP of VOIs.
  2. SARSNTdb offers a curated, nucleotide-centric database for users of varying levels of SARS-CoV-2 knowledge. Its user-friendly interface enables querying coding regions and coordinate intervals to find out the various functional and selective constraints that act upon the corresponding nucleotides and amino acids. Users can easily obtain information about viral genes and proteins, functional domains, repeats, secondary structure formation, intragenomic interactions, and mutation prevalence. Currently, many databases are focused on the phylogeny and amino acid substitutions, mainly in the spike protein. We took a novel, more nucleotide-focused approach as RNA does more than just code for proteins and many insights can be gleaned from its study. For example, RNA-targeted drug therapies for SARS-CoV-2 are currently being developed and it is essential to understand the features only visible at that level. This database enables the user to identify regions that are more prone to forming secondary structures that drugs can target. SARSNTdb also provides illustrative mutation data from a subset of ~25,000 patient samples with a reliable read coverage across the whole genome (from different locations and time points in the pandemic). Finally, the database allows for comparing SARS-CoV-2 and SARS-CoV domains and sequences. SARSNTdb can serve the research community by being a curated repository for information that gives a jump start to analyzing a mutation’s effect far beyond just determining synonymous/non-synonymous substitutions in protein sequences.
  3. The COVID-19 pandemic caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has spurred unprecedented and concerted worldwide research to curtail and eradicate this pathogen. SARS-CoV-2 has four structural proteins: Envelope (E), Membrane (M), Nucleocapsid (N), and Spike (S), which self-assemble along with its RNA into the infectious virus by budding from intracellular lipid membranes. In this paper, we develop a model to explore the mechanisms of RNA condensation by structural proteins, protein oligomerization and cellular membrane–protein interactions that control the budding process and the ultimate virus structure. Using molecular dynamics simulations, we have deciphered how the positively charged N proteins interact with and condense the very long genomic RNA, resulting in its packaging by a lipid envelope decorated with structural proteins inside a host cell. Furthermore, considering the length of RNA and the size of the virus, we find that the intrinsic curvature of M proteins is essential for virus budding. While most current research has focused on the S protein, which is responsible for viral entry, and has been motivated by the need to develop efficacious vaccines, the development of resistance through mutations in this crucial protein makes it essential to elucidate the details of the viral life cycle to identify other drug targets for future therapy. Our simulations will provide insight into the viral life cycle through the assembly of viral particles de novo and potentially identify therapeutic targets for future drug development.
  4. The rapid spread of SARS-CoV-2 required immediate actions to control the transmission of the virus and minimize its impact on humanity. The high mutation rate of this viral genome contributes to the virus’s ability to quickly adapt to environmental changes, impacts transmissibility and antigenicity, and may facilitate immune escape. Therefore, it is of great interest for researchers working in vaccine development and drug design to consider the impact of mutations on virus-drug interactions. Here, we propose a multitarget drug discovery pipeline for identifying potential drug candidates which can efficiently inhibit the Receptor Binding Domain (RBD) of spike glycoproteins from different variants of SARS-CoV-2. Eight homology models of RBDs for selected variants were created and validated using reference crystal structures. We then investigated interactions between host receptor ACE2 and RBDs from nine variants of SARS-CoV-2. This led us to conclude that efficient multi-variant targeting drugs should be capable of blocking residues Q(R)493 and N487 in RBDs. Using methods of molecular docking, molecular mechanics, and molecular dynamics, we identified three lead compounds (hesperidin, narirutin, and neohesperidin) suitable for multitarget SARS-CoV-2 inhibition. These compounds are flavanone glycosides found in citrus fruits – an active ingredient of Traditional Chinese Medicines. The developed pipeline can be further used to (1) model mutants for which crystal structures are not yet available and (2) scan a more extensive library of compounds against other mutated viral proteins.
  5. Structure-based drug design targeting the SARS-CoV-2 virus has been greatly facilitated by available virus-related protein structures. However, there is an urgent need for effective, safe small-molecule drugs to control the spread of the virus and variants. While many efforts are devoted to searching for compounds that selectively target individual proteins, we investigated the potential interactions between eight proteins related to SARS-CoV-2 and more than 600 compounds from a traditional Chinese medicine which has proven effective at treating the viral infection. Our original ensemble docking and cooperative docking approaches, followed by more than 16 microseconds of molecular simulations in total, have identified at least 9 compounds that may generally bind to key SARS-CoV-2 proteins. Further, we found evidence that some of these compounds can simultaneously bind to the same target, potentially leading to cooperative inhibition of SARS-CoV-2 proteins like the Spike protein and the RNA-dependent RNA polymerase. These results not only present a useful computational methodology to systematically assess the anti-viral potential of small molecules, but also point out a new avenue to seek cooperative compounds toward cocktail therapeutics to target more SARS-CoV-2-related proteins.