Abstract The ability to accurately identify peptide ligands for a given major histocompatibility complex class I (MHC-I) molecule has immense value for targeted anticancer therapeutics. However, the highly polymorphic nature of the MHC-I protein makes universal prediction of peptide ligands challenging, because experimental data are lacking for most MHC-I variants. To address this challenge, we have developed a deep convolutional neural network, HLA-Inception, capable of predicting MHC-I peptide binding motifs using electrostatic properties of the MHC-I binding pocket. By approaching this immunological issue using molecular biophysics, we measure the impact of sidechain arrangement and topology on peptide binding, features not captured by sequence-based MHC-I prediction methods. Through a combination of molecular modeling and simulation, 5821 MHC-I alleles were modeled, providing extensive coverage across human populations. Predicted peptide binding motifs fell into distinct clusters, each characterized by a different degree of submotif heterogeneity. Peptide binding scores generated by HLA-Inception are strongly correlated with quantitative MHC-I binding data, indicating that predicted peptides can be ranked both within and between alleles. HLA-Inception also showed high precision when predicting naturally presented peptides and can be used for rapid proteome-scale MHC-I peptide binding predictions. Finally, we show that the binding pocket diversity measured by HLA-Inception predicts response to checkpoint blockade. Citation Format: Eric A. Wilson, John Kevin Cava, Diego Chowell, Abhishek Singharoy, Karen S. Anderson. Protein structure-based modeling to improve MHC class I epitope predictions. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5376.
A Paired Database of Predicted and Experimental Protein Peptide Binding Information
Abstract Peptides are important biomolecules, and their interactions with proteins make them useful in sensing and therapeutic applications. Computational peptide design methods can benefit from high-quality peptide-protein structures paired with thermodynamic data. The Predicted and Experimental Peptide Binding Information (PEPBI) database provides 329 predicted peptide-protein complexes, each based on an experimentally determined structure, with corresponding experimental measurements of changes in Gibbs free energy, enthalpy, and entropy. For each complex, 40 properties calculated using Rosetta’s Interface Analyzer are included. Complexes were selected for inclusion in PEPBI using eight stringent structural criteria, including peptide length (5–20 residues), structure resolution (≤2.0 Å), less than 30% sequence identity between complexes, and having a corresponding unbound protein structure in the Protein Data Bank with at least 90% sequence identity to the bound form with minimal changes in the binding pocket. PEPBI is expected to be of use for the development of computational methods for peptide design with desired binding properties to protein targets.
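As a rough illustration of the inclusion criteria named in the abstract, the following Python sketch filters candidate complexes by peptide length, resolution, redundancy, and availability of an unbound structure. It is not the PEPBI authors' pipeline: the `CandidateComplex` class, its field names, and the toy entries are hypothetical, only the four criteria stated in the abstract are encoded (the abstract mentions eight), and the "minimal changes in the binding pocket" condition is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateComplex:
    """Hypothetical record for one candidate peptide-protein complex."""
    label: str                         # placeholder label, not a real PDB ID
    peptide_length: int                # number of peptide residues
    resolution_angstrom: float         # structure resolution in angstroms
    max_identity_to_accepted: float    # % identity to the closest accepted complex
    unbound_identity: Optional[float]  # % identity of the unbound structure, None if absent


def passes_stated_criteria(c: CandidateComplex) -> bool:
    """Apply only the four thresholds quoted in the abstract."""
    return (
        5 <= c.peptide_length <= 20              # peptide length of 5-20 residues
        and c.resolution_angstrom <= 2.0         # resolution of 2.0 angstroms or better
        and c.max_identity_to_accepted < 30.0    # less than 30% identity between complexes
        and c.unbound_identity is not None       # an unbound structure must exist
        and c.unbound_identity >= 90.0           # with at least 90% identity to the bound form
    )


# Toy, made-up candidates used purely to exercise the filter.
candidates = [
    CandidateComplex("toy_a", 9, 1.8, 22.0, 97.5),
    CandidateComplex("toy_b", 25, 1.5, 10.0, 99.0),  # peptide too long
    CandidateComplex("toy_c", 12, 2.4, 15.0, 95.0),  # resolution too coarse
    CandidateComplex("toy_d", 8, 1.9, 45.0, 92.0),   # too similar to an accepted complex
    CandidateComplex("toy_e", 7, 1.6, 12.0, None),   # no unbound structure
]

kept = [c.label for c in candidates if passes_stated_criteria(c)]
print(kept)
```

Keeping the thresholds in one predicate makes each criterion easy to audit; a real workflow would also need the remaining criteria and a binding-pocket comparison, which the abstract does not detail.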
- Award ID(s): 2119237
- PAR ID: 10664076
- Publisher / Repository: Springer Nature
- Date Published:
- Journal Name: Scientific Data
- Volume: 12
- Issue: 1
- ISSN: 2052-4463
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Structural, regulatory and enzymatic proteins interact with DNA to maintain a healthy and functional genome. Yet, our structural understanding of how proteins interact with DNA is limited. We present MELD-DNA, a novel computational approach to predict the structures of protein–DNA complexes. The method combines molecular dynamics simulations with general knowledge or experimental information through Bayesian inference. The physical model is sensitive to sequence-dependent properties and conformational changes required for binding, while information accelerates sampling of bound conformations. MELD-DNA can: (i) sample multiple binding modes; (ii) identify the preferred binding mode from the ensembles; and (iii) provide qualitative binding preferences between DNA sequences. We first assess performance on a dataset of 15 protein–DNA complexes and compare it with state-of-the-art methodologies. Furthermore, for three selected complexes, we show sequence dependence effects of binding in MELD predictions. We expect that the results presented herein, together with the freely available software, will impact structural biology (by complementing DNA structural databases) and molecular recognition (by bringing new insights into aspects governing protein–DNA interactions).
-
Cowen, Lenore (Ed.) Abstract Motivation: Metal-binding proteins have a central role in maintaining life processes. Nearly one-third of known protein structures contain metal ions that are used for a variety of needs, such as catalysis, DNA/RNA binding, protein structure stability, etc. Identifying metal-binding proteins is thus crucial for understanding the mechanisms of cellular activity. However, experimental annotation of protein metal-binding potential is severely lacking, while computational techniques are often imprecise and of limited applicability. Results: We developed a novel machine learning-based method, mebipred, for identifying metal-binding proteins from sequence-derived features. This method is over 80% accurate in recognizing proteins that bind metal ion-containing ligands; the specific identity of 11 ubiquitously present metal ions can also be annotated. mebipred is reference-free, i.e. no sequence alignments are involved, and is thus faster than alignment-based methods; it is also more accurate than other sequence-based prediction methods. Additionally, mebipred can identify protein metal-binding capabilities from short sequence stretches, e.g. translated sequencing reads, and thus may be useful for the annotation of metal requirements of metagenomic samples. We performed an analysis of available microbiome data and found that ocean, hot spring sediments and soil microbiomes use a more diverse set of metals than human host-related ones. For human microbiomes, physiological conditions explain the observed metal preferences. Similarly, subtle changes in ocean sample ion concentration affect the abundance of relevant metal-binding proteins. These results highlight mebipred’s utility in analyzing microbiome metal requirements. Availability and implementation: mebipred is available as a web server at services.bromberglab.org/mebipred and as a standalone package at https://pypi.org/project/mymetal/. Supplementary information: Supplementary data are available at Bioinformatics online.
-
ABSTRACT Accurate prediction of protein–peptide complex structures plays a critical role in structure-based drug design, including antibody design. Most peptide-docking benchmark studies were conducted using crystal structures of protein–peptide complexes; as such, the performance of current peptide docking tools in the practical setting is unknown. Here, the practical setting implies there are no crystal or other experimental structures for the complex, nor for the receptor and peptide. In this work, we have developed a practical docking protocol that incorporates two well-known machine learning models, AlphaFold 2 for structural prediction and ANI-2x for ab initio potential prediction, to achieve a high success rate in modeling protein–peptide complex structures. The docking protocol consists of three major stages. In the first stage, the 3D structure of the receptor is predicted by AlphaFold 2 using the monomer mode, and that of the peptide is predicted by AlphaFold 2 using the multimer mode. We found that it is essential to include the receptor information to generate a high-quality 3D structure of the peptide. In the second stage, rigid protein–peptide docking is performed using ZDOCK software. In the last stage, the top 10 docking poses are relaxed and refined by ANI-2x in conjunction with our in-house geometry optimization algorithm, conjugate gradient with backtracking line search (CG-BS). CG-BS was developed by us to perform geometry optimization more efficiently; it takes the potential and force directly from ANI-2x machine learning models. The docking protocol achieved a very encouraging performance on a set of 62 very challenging protein–peptide systems, with an overall success rate of 34% when only the top 1 docking pose was considered. This success rate increased to 45% when the top 3 docking poses were considered. It is emphasized that this encouraging protein–peptide docking performance was achieved without using any crystal or experimental structures.
-
The controlled formation of nanoparticles with optimum characteristics and functional aspects has proven successful via peptide-mediated nanoparticle synthesis. However, the effects of the peptide sequence and binding motif on surface features and physicochemical properties of nanoparticles are not well-understood. In this study, we investigate in a comparative manner how a specific peptide known as Pd4 and its two known variants may form nanoparticles both in an isolated state and when attached to a green fluorescent protein (GFPuv). More importantly, we introduce a novel computational approach to predict the trend of the size and activity of the peptide-directed nanoparticles by estimating the binding affinity of the peptide to a single ion. We used molecular dynamics (MD) simulations to explore the differential behavior of the isolated and GFP-fused peptides and their mutants. Our computed palladium (Pd) binding free energies match the typical nanoparticle sizes reported from transmission electron microscope pictures. Stille coupling and Suzuki–Miyaura reaction turnover frequencies (TOFs) also correspond with computationally predicted Pd binding affinities. The results show that while using Pd4 and its two known variants (A6 and A11) in isolation produces nanoparticles of varying sizes, fusing these peptides to the GFPuv protein produces nanoparticles of similar sizes and activity. In other words, GFPuv reduces the sensitivity of the nanoparticles to the peptide sequence. This study provides a computational framework for designing free and protein-attached peptides that helps in the synthesis of nanoparticles with well-regulated properties.

