Title: Assessment of software methods for estimating protein-protein relative binding affinities
A growing number of computational tools have been developed to accurately and rapidly predict the impact of amino acid mutations on protein-protein relative binding affinities. Such tools have many applications, for example, designing new drugs and studying evolutionary mechanisms. In the search for accuracy, many of these methods employ expensive yet rigorous molecular dynamics simulations. By contrast, non-rigorous methods use less exhaustive statistical mechanics, allowing for more efficient calculations. However, it is unclear if such methods retain enough accuracy to replace rigorous methods in binding affinity calculations. This trade-off between accuracy and computational expense makes it difficult to determine the best method for a particular system or study. Here, eight non-rigorous computational methods were assessed using eight antibody-antigen and eight non-antibody-antigen complexes for their ability to accurately predict relative binding affinities (ΔΔG) for 654 single mutations. In addition to assessing accuracy, we analyzed the CPU cost and performance for each method using a variety of physico-chemical structural features. This allowed us to posit scenarios in which each method may be best utilized. Most methods performed worse when applied to antibody-antigen complexes compared to non-antibody-antigen complexes. Rosetta-based JayZ and EasyE methods classified mutations as destabilizing (ΔΔG < -0.5 kcal/mol) with high (83–98%) accuracy and a relatively low computational cost for non-antibody-antigen complexes. Some of the most accurate results for antibody-antigen systems came from combining molecular dynamics with FoldX with a correlation coefficient (r) of 0.46, but this was also the most computationally expensive method. Overall, our results suggest these methods can be used to quickly and accurately predict stabilizing versus destabilizing mutations but are less accurate at predicting actual binding affinities. This study highlights the need for continued development of reliable, accessible, and reproducible methods for predicting binding affinities in antibody-antigen proteins and provides a recipe for using current methods.
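The two accuracy measures quoted in the abstract, the correlation coefficient (r) and the classification accuracy for destabilizing mutations, can be illustrated with a minimal Python sketch. The function names and the toy ΔΔG values below are hypothetical and are not taken from the study. The sketch assumes that r is a Pearson correlation and that accuracy is the fraction of mutations for which the predicted and experimental values fall on the same side of the -0.5 kcal/mol cutoff; the abstract does not spell out either definition.

```python
import math

# Cutoff quoted in the abstract: ΔΔG < -0.5 kcal/mol is labeled destabilizing.
DESTABILIZING_CUTOFF = -0.5


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of ΔΔG values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def is_destabilizing(ddg, cutoff=DESTABILIZING_CUTOFF):
    """Apply the sign convention stated in the abstract."""
    return ddg < cutoff


def classification_accuracy(predicted, experimental, cutoff=DESTABILIZING_CUTOFF):
    """Fraction of mutations given the same label by prediction and experiment.

    Assumption: this definition of accuracy is illustrative only.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two equal-length, non-empty lists")
    matches = sum(
        is_destabilizing(p, cutoff) == is_destabilizing(e, cutoff)
        for p, e in zip(predicted, experimental)
    )
    return matches / len(predicted)


# Hypothetical toy values in kcal/mol, not data from the study.
experimental = [-1.2, -0.8, 0.1, 0.3, -2.0]
predicted = [-0.9, -0.2, 0.0, 0.6, -1.5]

if __name__ == "__main__":
    print("r =", round(pearson_r(predicted, experimental), 2))
    print("accuracy =", classification_accuracy(predicted, experimental))
```

A method can score well on the second measure while scoring modestly on the first, which is consistent with the abstract's conclusion that the methods separate stabilizing from destabilizing mutations better than they predict actual binding affinities.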
Journal Name:
PloS one
Sponsoring Org:
National Science Foundation
More Like this
  1. Protein–protein binding is fundamental to most biological processes. It is important to be able to use computation to accurately estimate the change in protein–protein binding free energy due to mutations in order to answer biological questions that would be experimentally challenging, laborious, or time-consuming. Although nonrigorous free-energy methods are faster, rigorous alchemical molecular dynamics-based methods are considerably more accurate and are becoming more feasible with the advancement of computer hardware and molecular simulation software. Even with sufficient computational resources, there are still major challenges to using alchemical free-energy methods for protein–protein complexes, such as generating hybrid structures and topologies, maintaining a neutral net charge of the system when there is a charge-changing mutation, and setting up the simulation. In the current study, we have used the pmx package to generate hybrid structures and topologies, and a double-system/single-box approach to maintain the net charge of the system. To test the approach, we predicted relative binding affinities for two protein–protein complexes using a nonequilibrium alchemical method based on the Crooks fluctuation theorem and compared the results with experimental values. The method correctly identified stabilizing from destabilizing mutations for a small protein–protein complex, and a larger, more challenging antibody complex. Strong correlations were obtained between predicted and experimental relative binding affinities for both protein–protein systems.
  2. Fariselli, Piero (Ed.)
    Predicting mutation-induced changes in protein thermodynamic stability (ΔΔG) is of great interest in protein engineering, variant interpretation, and protein biophysics. We introduce ThermoNet, a deep, 3D-convolutional neural network (3D-CNN) designed for structure-based prediction of ΔΔGs upon point mutation. To leverage the image-processing power inherent in CNNs, we treat protein structures as if they were multi-channel 3D images. In particular, the inputs to ThermoNet are uniformly constructed as multi-channel voxel grids based on biophysical properties derived from raw atom coordinates. We train and evaluate ThermoNet with a curated data set that accounts for protein homology and is balanced with direct and reverse mutations; this provides a framework for addressing biases that have likely influenced many previous ΔΔG prediction methods. ThermoNet demonstrates performance comparable to the best available methods on the widely used Ssym test set. In addition, ThermoNet accurately predicts the effects of both stabilizing and destabilizing mutations, while most other methods exhibit a strong bias towards predicting destabilization. We further show that homology between Ssym and widely used training sets like S2648 and VariBench has likely led to overestimated performance in previous studies. Finally, we demonstrate the practical utility of ThermoNet in predicting the ΔΔGs for two clinically relevant proteins, p53 and myoglobin, and for pathogenic and benign missense variants from ClinVar. Overall, our results suggest that 3D-CNNs can model the complex, non-linear interactions perturbed by mutations, directly from biophysical properties of atoms.
  3. Methods which accurately predict protein–ligand binding strengths are critical for drug discovery. In the last two decades, advances in chemical modelling have enabled steadily accelerating progress in the discovery and optimization of structure-based drug design. Most computational methods currently used in this context are based on molecular mechanics force fields that often have deficiencies in describing the quantum mechanical (QM) aspects of molecular binding. In this study, we show the competitiveness of our QM-based Molecules-in-Molecules (MIM) fragmentation method for characterizing binding energy trends for seven different datasets of protein–ligand complexes. By using molecular fragmentation, the MIM method allows for accelerated QM calculations. We demonstrate that for classes of structurally similar ligands bound to a common receptor, MIM provides excellent correlation to experiment, surpassing the more popular Molecular Mechanics Poisson-Boltzmann Surface Area (MM/PBSA) and Molecular Mechanics Generalized Born Surface Area (MM/GBSA) methods. The MIM method offers a relatively simple, well-defined protocol by which binding trends can be ascertained at the QM level and is suggested as a promising option for lead optimization in structure-based drug design.
  4. Accurate and efficient simulation of the thermodynamics and kinetics of protein–ligand interactions is crucial for computational drug discovery. Multiensemble Markov Model (MEMM) estimators can provide estimates of both binding rates and affinities from collections of short trajectories but have not been systematically explored for situations when a ligand is decoupled through scaling of non-bonded interactions. In this work, we compare the performance of two MEMM approaches for estimating ligand binding affinities and rates: (1) the transition-based reweighting analysis method (TRAM) and (2) a Maximum Caliber (MaxCal) based method. As a test system, we construct a small host–guest system where the ligand is a single uncharged Lennard-Jones (LJ) particle, and the receptor is an 11-particle icosahedral pocket made from the same atom type. To realistically mimic a protein–ligand binding system, the LJ ϵ parameter was tuned, and the system was placed in a periodic box with 860 TIP3P water molecules. A benchmark was performed using over 80 µs of unbiased simulation, and an 18-state Markov state model was used to estimate reference binding affinities and rates. We then tested the performance of TRAM and MaxCal when challenged with limited data. Both TRAM and MaxCal approaches perform better than conventional Markov state models, with TRAM showing better convergence and accuracy. We find that subsampling of trajectories to remove time correlation improves the accuracy of both TRAM and MaxCal and that in most cases, only a single biased ensemble to enhance sampled transitions is required to make accurate estimates.
  5. In the global health emergency caused by coronavirus disease 2019 (COVID-19), efficient and specific therapies are urgently needed. Compared with traditional small-molecular drugs, antibody therapies are relatively easy to develop; they are as specific as vaccines in targeting severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); and they have thus attracted much attention in the past few months. This article reviews seven existing antibodies for neutralizing SARS-CoV-2 with 3D structures deposited in the Protein Data Bank (PDB). Five 3D antibody structures associated with the SARS-CoV spike (S) protein are also evaluated for their potential in neutralizing SARS-CoV-2. The interactions of these antibodies with the S protein receptor-binding domain (RBD) are compared with those between angiotensin-converting enzyme 2 and RBD complexes. Due to the orders of magnitude in the discrepancies of experimental binding affinities, we introduce topological data analysis, a variety of network models, and deep learning to analyze the binding strength and therapeutic potential of the 14 antibody–antigen complexes. The current COVID-19 antibody clinical trials, which are not limited to the S protein target, are also reviewed.