Title: Combining iCn3D and NextStrain to create a novel undergraduate research experience around SARS-CoV-2 variants and commercial antibodies
Undergraduate research experiences are increasingly important in biology education, with efforts underway to provide more projects by embedding them in a course. The shift to online learning at the beginning of the pandemic presented a challenge. How could biology instructors provide research experiences to students who were unable to attend in-person labs? During the 2021 ISMB (Intelligent Systems for Molecular Biology) iCn3D Hackathon, Collaborative Tools for Protein Analysis, we learned about new capabilities in iCn3D for analyzing the interactions between amino acids in the paratopes of antibodies and amino acids in the epitopes of antigens, and for predicting the effects of mutations on binding. Additionally, new sequence alignment tools in iCn3D support aligning protein sequences with sequences in structure models. We used these methods to create a new undergraduate research project that students could perform online as part of a course, combining new features in iCn3D with analysis tools in NextStrain and a data set of anti-SARS-CoV-2 antibodies. We present results from an example project to illustrate how students would investigate the likelihood of SARS-CoV-2 variants escaping from commercial antibodies and use chemical interaction data to support their hypotheses. We also demonstrate that online tools (iCn3D, NextStrain, and the NCBI databases) can be used to carry out the necessary steps and that this work satisfies the requirements for course-based undergraduate research. This project reinforces major concepts in undergraduate biology: evolution and the relationship between the sequence of a protein, its three-dimensional structure, and its function.
Award ID(s): 2055036
PAR ID: 10422525
Author(s) / Creator(s):
Date Published:
Journal Name: Frontiers in Genetics
Volume: 14
ISSN: 1664-8021
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Antibodies are proteins that can protect against disease using a variety of mechanisms, including binding to pathogens and targeting them for destruction. Structural modeling of antibody binding to the SARS-CoV-2 spike protein, and of how mutations might allow viruses to escape antibody neutralization, has been previously investigated in Antibody Engineering Hackathons. The procedure for investigating immune escape can be used by students in affordable and accessible Course-Based Undergraduate Research Experiences (CUREs). In this work, we adapted and expanded the SARS-CoV-2 protocol to address new pathogens, including hookworms, Respiratory Syncytial Virus (RSV), Influenza, and Enterovirus D68. We found that each presented unique challenges; however, these challenges present opportunities for student research. We describe how modifications to the protocol designed for SARS-CoV-2 could allow students to investigate the impact of mutations in each of these pathogens on antibody binding.
  2. Predicting protein properties from amino acid sequences is an important problem in biology and pharmacology. Protein–protein interactions among SARS-CoV-2 spike protein, human receptors and antibodies are key determinants of the potency of this virus and its ability to evade the human immune response. As a rapidly evolving virus, SARS-CoV-2 has already developed into many variants, with considerable variation in virulence among them. Utilizing the proteomic data of SARS-CoV-2 to predict its viral characteristics will, therefore, greatly aid in disease control and prevention. In this paper, we review and compare recent successful prediction methods based on long short-term memory (LSTM), transformer, convolutional neural network (CNN) and a similarity-based topological regression (TR) model and offer recommendations about appropriate predictive methodology depending on the similarity between training and test datasets. We compare the effectiveness of these models in predicting the binding affinity and expression of SARS-CoV-2 spike protein sequences. We also explore how effective these predictive methods are when trained on laboratory-created data and tasked with predicting the binding affinity of in-the-wild SARS-CoV-2 spike protein sequences obtained from the GISAID datasets. We observe that TR is a better method when the sample size is small and test protein sequences are sufficiently similar to the training sequence. However, when the training sample size is sufficiently large and prediction requires extrapolation, LSTM embedding and CNN-based predictive models show superior performance.
  3. Disparities in undergraduate STEM degree completions across the United States are a national concern. Undergraduate-level research opportunities are vital for developing future researchers and building their scientific identity. These experiences can help students in community colleges acquire 21st-century skills and build confidence in their ability to do science [1-3]. The development and implementation of guided research experiences provide users with a topic they are familiar with but not necessarily experts in, like SARS-CoV-2 infections. In this particular study, the Immune Epitope Database (IEDB) was used to identify amino acid residues located on the immunogenic regions of the spike glycoprotein of SARS-CoV-2 variants: Alpha, Beta, Gamma, Delta, and Omicron. IEDB is a web-based bioinformatics tool that contains published epitope information and prediction aids that can be used as a research platform for studying infectious diseases. This study aims to map the immunogenic regions on the spike glycoproteins of the SARS-CoV-2 variants and predict the immune evasion of these variants [4-6]. Identifying the antigenic determinants that bind to antibodies is essential for designing future candidates for peptide-based vaccines. This research identifies regions where mutations have occurred in the virus, which are important to study as they can affect the virus's immune evasion and impact available vaccines. Multiple immunogenic regions unaffected by mutations can serve as potential targets for new vaccines, providing better protection against different variants.
  4. Kazarinoff, P. (Ed.)
    Disparities in undergraduate STEM degree completions across the United States are a national concern. Undergraduate-level research opportunities are vital for developing future researchers and building their scientific identity. These experiences can help students in community colleges acquire 21st-century skills and build confidence in their ability to do science [1-3]. The development and implementation of guided research experiences provide users with a topic they are familiar with but not necessarily experts in, like SARS-CoV-2 infections. In this particular study, the Immune Epitope Database (IEDB) was used to identify amino acid residues located on the immunogenic regions of the spike glycoprotein of SARS-CoV-2 variants: Alpha, Beta, Gamma, Delta, and Omicron. IEDB is a web-based bioinformatics tool that contains published epitope information and prediction aids that can be used as a research platform for studying infectious diseases. This study aims to map the immunogenic regions on the spike glycoproteins of the SARS-CoV-2 variants and predict the immune evasion of these variants [4-6]. Identifying the antigenic determinants that bind to antibodies is essential for designing future candidates for peptide-based vaccines. This research identifies regions where mutations have occurred in the virus, which are important to study as they can affect the virus’s immune evasion and impact available vaccines. Multiple immunogenic regions unaffected by mutations can serve as potential targets for new vaccines, providing better protection against different variants.
  5. SARS-CoV-2, the cause of COVID-19, is a new, highly pathogenic coronavirus, which is the third coronavirus to emerge in the past 2 decades and the first to become a global pandemic. The virus has demonstrated itself to be extremely transmissible and deadly. Recent data suggest that a targeted approach is key to mitigating infectivity. Because databases now catalog a large number of protein and nucleic acid sequences, along with the functions of the nucleic acids and the proteins they encode, we make predictions by simply aligning sequences and exploring their homology. Thus, similar amino acid sequences in a protein usually confer similar biochemical function, even in distantly related or unrelated organisms. To understand viral transmission and adhesion, it is key to elucidate the structural, surface, and functional properties of each viral protein. This is typically first modeled in highly pathogenic species by exploring folding, hydrophobicity, and isoelectric point (IEP). Recent evidence from viral RNA sequence modeling and protein crystals has been inadequate, which prevents a full understanding of the IEP and other viral properties of SARS-CoV-2. We have thus experimentally determined the IEP of SARS-CoV-2. Our findings suggest that for enveloped viruses, such as SARS-CoV-2, estimates of IEP from the amino acid sequence alone may be unreliable. We compared the experimental IEP of SARS-CoV-2 to that of variants of interest (VOIs) using their amino acid sequences, thus providing a qualitative comparison of the IEP of VOIs.