
Title: An in silico deep learning approach to multi-epitope vaccine design: a SARS-CoV-2 case study

The rampant spread of COVID-19, an infectious disease caused by SARS-CoV-2, has led to millions of deaths worldwide and devastated social, financial and political entities around the world. Without an existing effective medical therapy, vaccines are urgently needed to curb the spread of this disease. In this study, we propose an in silico deep learning approach for prediction and design of a multi-epitope vaccine (DeepVacPred). By combining in silico immunoinformatics and deep neural network strategies, the DeepVacPred computational framework directly predicts 26 potential vaccine subunits from the available SARS-CoV-2 spike protein sequence. We further use in silico methods to investigate the linear B-cell epitopes, Cytotoxic T Lymphocyte (CTL) epitopes and Helper T Lymphocyte (HTL) epitopes in the 26 subunit candidates and identify the best 11 of them to construct a multi-epitope vaccine for the SARS-CoV-2 virus. The human population coverage, antigenicity, allergenicity, toxicity, physicochemical properties and secondary structure of the designed vaccine are evaluated via state-of-the-art bioinformatic approaches, showing good quality of the designed vaccine. The 3D structure of the designed vaccine is predicted, refined and validated by in silico tools. Finally, we optimize the codon sequence and insert it into a plasmid to ensure the cloning and expression efficiency. In conclusion, the proposed artificial intelligence (AI) based vaccine discovery framework accelerates the vaccine design process and constructs a 694aa multi-epitope vaccine containing 16 B-cell epitopes, 82 CTL epitopes and 89 HTL epitopes, which is a promising candidate against SARS-CoV-2 infection and can be further evaluated in clinical studies. Moreover, we trace the RNA mutations of SARS-CoV-2 and ensure that the designed vaccine can tackle the recent RNA mutations of the virus.

Publisher / Repository:
Nature Publishing Group
Journal Name:
Scientific Reports
Sponsoring Org:
National Science Foundation
More Like this
  1. The COVID-19 pandemic caused by SARS-CoV-2 sparked intensive research into the development of effective vaccines, 50 of which have been approved thus far, including the novel mRNA-based vaccines developed by Pfizer and Moderna. Although limiting the severity of the disease, the mRNA-based vaccines presented drawbacks, such as the cold chain requirement. Moreover, antibody levels generated by these vaccines decline significantly after 6 months. These vaccines deliver mRNA encoding the full-length spike (S) glycoprotein of SARS-CoV-2, but must be updated as new strains and variants of concern emerge, creating a demand for adjusted formulations and booster campaigns. To overcome these challenges, we have developed COVID-19 vaccine candidates based on the highly conserved SARS-CoV-2 809-826 B-cell peptide epitope (denoted 826) conjugated to cowpea mosaic virus (CPMV) nanoparticles and bacteriophage Qβ virus-like particles; both platforms have exceptional thermal stability and facilitate epitope delivery with inbuilt adjuvant activity. We evaluated two administration methods: subcutaneous injection and an implantable polymeric scaffold. Mice received a prime–boost regimen of 100 μg per dose (2 weeks apart) or a single dose of 200 μg administered as a liquid formulation or a polymer implant. Antibody titers were evaluated longitudinally over 50 weeks. The vaccine candidates generally elicited an early Th2-biased immune response, which stimulates the production of SARS-CoV-2 neutralizing antibodies, followed by a switch to a Th1-biased response for most formulations. Exceptionally, vaccine candidate 826-CPMV (administered as prime-boost, soluble injection) elicited a balanced Th1/Th2 immune response, which is necessary to prevent pulmonary immunopathology associated with Th2 bias extremes. While the Qβ-based vaccine elicited overall higher antibody titers, the CPMV-induced antibodies had higher avidity. Regardless of the administration route and formulation, our vaccine candidates maintained high antibody titers for more than 50 weeks, confirming a potent and durable immune response against SARS-CoV-2 even after a single dose.
  2. Abstract SARS-CoV-2 worldwide spread and evolution has resulted in variants containing mutations resulting in immune evasive epitopes that decrease vaccine efficacy. We acquired SARS-CoV-2 positive clinical samples, compared the spike mutations that emerged worldwide in Variants of Concern/Interest, and developed an algorithm for monitoring the evolution of SARS-CoV-2 in the context of vaccines and monoclonal antibodies. The algorithm partitions logarithmic-transformed prevalence data monthly, and Pearson’s correlation determines exponential emergence of amino acid substitutions (AAS) and lineages. The SARS-CoV-2 genome evaluation indicated 49 mutations, with 44 resulting in AAS. Nine of the ten most prevalent spike protein changes worldwide (>70%) have a Pearson’s coefficient r > 0.9. The tenth, D614G, has a prevalence >99% and an r-value of 0.67. The resulting algorithm is based on the patterns these ten substitutions elucidated. The strong positive correlation of the emerged spike protein changes and algorithmic predictive value can be harnessed in designing vaccines with relevant immunogenic epitopes. Monitoring, next-generation vaccine design, and mAb clinical efficacy must keep up with SARS-CoV-2 evolution, as the virus is predicted to remain endemic.
  3. Ozkan, Banu (Ed.)
    Abstract Evaluation of immunogenic epitopes for universal vaccine development in the face of ongoing SARS-CoV-2 evolution remains a challenge. Herein, we investigate the genetic and structural conservation of an immunogenically relevant epitope (C662–C671) of the spike (S) protein across SARS-CoV-2 variants to determine its potential utility as a broad-spectrum vaccine candidate against coronavirus diseases. Comparative sequence analysis, structural assessment, and molecular dynamics simulations of the C662–C671 epitope were performed. Mathematical tools were employed to determine its mutational cost. We found that the amino acid sequence of the C662–C671 epitope is entirely conserved across the observed major variants of SARS-CoV-2 in addition to SARS-CoV. Its conformation and accessibility are predicted to be conserved, even in the highly mutated Omicron variant. The high cost of mutation, in terms of energy expenditure in genome replication and translation, can explain this strict conservation. These observations may herald an approach to developing vaccine candidates for universal protection against emergent variants of coronavirus.
  4. Abstract

    As the SARS-CoV-2 pandemic is rapidly progressing, the need for the development of an effective vaccine is critical. A promising approach for vaccine development is to generate, through codon pair deoptimization, an attenuated virus. This approach carries the advantage that it requires only limited knowledge specific to the virus in question, other than its genome sequence. Therefore, it is well suited for emerging viruses, for which we may not have extensive data. We performed comprehensive in silico analyses of several features of the SARS-CoV-2 genomic sequence (e.g., codon usage, codon pair usage, dinucleotide/junction dinucleotide usage, RNA structure around the frameshift region) in comparison with other members of the Coronaviridae family of viruses, the overall human genome, and the transcriptome of specific human tissues such as lung, which are primarily targeted by the virus. Our analysis identified the spike (S) and nucleocapsid (N) proteins as promising targets for deoptimization and suggests a roadmap for SARS-CoV-2 vaccine development, which can be generalized to other viruses.
  5. Abstract

    The emergence of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) variants of concern (VOC) has raised questions regarding vaccine protection against SARS‐CoV‐2 infection, transmission, and ongoing virus evolution. Twenty‐three mildly symptomatic “vaccination breakthrough” infections were identified as early as January 2021 in Alachua County, Florida, among individuals fully vaccinated with either the BNT162b2 (Pfizer) or the Ad26 (Janssen/J&J) vaccines. SARS‐CoV‐2 genomes were successfully generated for 11 of the vaccine breakthroughs and 878 individuals in the surrounding area, and were included for reference‐based phylogenetic investigation. These 11 individuals were characterized by infection with VOCs, but also low‐frequency variants present within the surrounding population. Low‐frequency mutations were observed, which have been more recently identified as mutations of interest owing to their location within targeted immune epitopes (P812L) and association with increased replicative capacity (L18F). We present these results to posit the nature of the efficacy of vaccines in reducing symptoms as both a blessing and a curse: as vaccination becomes more widespread and self‐motivated testing is reduced owing to the absence of severe symptoms, we face the challenge of early recognition of novel mutations of potential concern. This case study highlights the critical need for continued testing and monitoring of infection and transmission among individuals regardless of vaccination status.
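
The monitoring idea summarized in item 2 of this list (log-transformed monthly prevalence scored with Pearson's correlation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the base-10 logarithm, the sample prevalence values, and the reuse of the reported r > 0.9 figure as a decision threshold are all assumptions made here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def emergence_r(monthly_prevalence):
    """Correlate month index with log10(prevalence).

    Exponential growth is linear on a log scale, so a value of r close
    to 1 suggests exponential emergence of a substitution.
    """
    if any(p <= 0 for p in monthly_prevalence):
        raise ValueError("prevalence values must be positive to take a logarithm")
    months = list(range(len(monthly_prevalence)))
    log_prev = [math.log10(p) for p in monthly_prevalence]
    return pearson_r(months, log_prev)


def is_exponentially_emerging(monthly_prevalence, threshold=0.9):
    # Assumption: the 0.9 cutoff mirrors the r > 0.9 figure quoted in the
    # abstract; the abstract does not state the actual decision rule.
    return emergence_r(monthly_prevalence) > threshold


# Hypothetical monthly prevalence fractions for one substitution
# (doubling each month), not data from the study.
doubling = [0.001, 0.002, 0.004, 0.008, 0.016]
print(round(emergence_r(doubling), 3))
print(is_exponentially_emerging(doubling))
```

A substitution that is already near saturation changes little from month to month, so its log-prevalence trend is weak; that is one plausible reading of why the abstract reports a lower r for D614G despite its prevalence above 99%.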