

Title: Fast de novo discovery of low‐energy protein loop conformations
ABSTRACT

In the prediction of protein structure from amino acid sequence, loops are challenging regions for computational methods. Since loops are often located on the protein surface, they can have significant roles in determining protein functions and binding properties. Loop prediction without the aid of a structural template requires extensive conformational sampling and energy minimization, which are computationally difficult. In this article we present a new de novo loop sampling method, the Parallely filtered Energy Targeted All‐atom Loop Sampler (PETALS), to rapidly locate low energy conformations. PETALS explores both backbone and side‐chain positions of the loop region simultaneously according to the energy function selected by the user, and constructs a nonredundant ensemble of low energy loop conformations using filtering criteria. The method is illustrated with the DFIRE potential and DiSGro energy function for loops, and shown to be highly effective at discovering conformations with near‐native (or better) energy. Using the same energy function as the DiSGro algorithm, PETALS samples conformations with both lower RMSDs and lower energies. PETALS is also useful for assessing the accuracy of different energy functions. PETALS runs rapidly, requiring an average time cost of 10 minutes for a length 12 loop on a single 3.2 GHz processor core, comparable to the fastest existing de novo methods for generating an ensemble of conformations. Proteins 2017; 85:1402–1412. © 2017 Wiley Periodicals, Inc.
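The abstract does not spell out the filtering criteria PETALS uses, so the following is only a minimal sketch of one common way to build a nonredundant low-energy ensemble: visit candidates in order of increasing energy and keep a conformation only if it differs from every conformation already kept by at least an RMSD cutoff. The function names, the `rmsd_cutoff` and `max_size` parameters, and the use of RMSD without superposition (on the assumption that all loops share the fixed frame of the rest of the protein) are illustrative assumptions, not the published algorithm.

```python
import numpy as np


def rmsd(a, b):
    """Coordinate RMSD between two (n_atoms, 3) arrays, without superposition.

    Assumption: both loops are expressed in the same fixed protein frame.
    """
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def nonredundant_low_energy(conformations, energies, rmsd_cutoff=1.0, max_size=10):
    """Greedy energy-ordered filter (hypothetical helper, not the PETALS code).

    Returns the indices of the kept conformations, lowest energy first.
    """
    order = np.argsort(energies)  # candidate indices sorted by increasing energy
    kept = []
    for i in order:
        # Keep candidate i only if it is far enough from every kept conformation.
        if all(rmsd(conformations[i], conformations[j]) >= rmsd_cutoff for j in kept):
            kept.append(int(i))
            if len(kept) == max_size:
                break
    return kept
```

For example, given three toy conformations where the first two are nearly identical and the third is far away, the filter keeps the lower-energy member of the near-duplicate pair together with the distant conformation. The energies themselves would come from whatever energy function the user selects (the abstract mentions DFIRE and DiSGro); computing them is outside the scope of this sketch.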

 
NSF-PAR ID:
10028914
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Proteins: Structure, Function, and Bioinformatics
Volume:
85
Issue:
8
ISSN:
0887-3585
Page Range / eLocation ID:
p. 1402-1412
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Disulfide‐rich peptides represent an important protein family with broad pharmacological potential. Recent advances in computational methods have made it possible to design new peptides which adopt a stable conformation de novo. Here, we describe a system to produce disulfide‐rich de novo peptides using Escherichia coli as the expression host. The advantage of this system is that it enables production of uniformly ¹³C‐ and ¹⁵N‐labeled peptides for solution nuclear magnetic resonance (NMR) studies. This expression system was used to isotopically label two previously reported de novo designed peptides, and to determine their solution structures using NMR. The ensemble of NMR structures calculated for both peptides agreed well with the design models, further confirming the accuracy of the design protocol. Collection of NMR data on the peptides under reducing conditions revealed a dependency on disulfide bonds to maintain stability. Furthermore, we performed long‐time molecular dynamics (MD) simulations with tempering to assess the stability of two families of de novo designed peptides. Initial designs which exhibited a stable structure during simulations were more likely to adopt a stable structure in vitro, but attempts to utilize this method to redesign unstable peptides to fold into a stable state were unsuccessful. Further work is therefore needed to assess the utility of MD simulation techniques for de novo protein design.

     
  2. Abstract

    Electron paramagnetic resonance (EPR) has become a powerful probe of conformational heterogeneity and dynamics of biomolecules. In this Review, we discuss different computational modeling techniques that enrich the interpretation of EPR measurements of dynamics or distance restraints. A variety of spin labels are surveyed to provide a background for the discussion of modeling tools. Molecular dynamics (MD) simulations of models containing spin labels provide dynamical properties of biomolecules and their labels. These simulations can be used to predict EPR spectra, sample stable conformations and sample rotameric preferences of label sidechains. For molecular motions longer than milliseconds, enhanced sampling strategies and de novo prediction software incorporating or validated by EPR measurements are able to efficiently refine or predict protein conformations, respectively. To sample large‐amplitude conformational transitions, a coarse‐grained or an atomistic weighted ensemble (WE) strategy can be guided with EPR insights. Looking forward, we anticipate an integrative strategy for efficient sampling of alternate conformations by de novo predictions, followed by validations by systematic EPR measurements and MD simulations. Continuous pathways between alternate states can be further sampled by WE‐MD including all intermediate states.

     
  3. Abstract

    Protein model refinement has been an essential part of successful protein structure prediction. Molecular dynamics simulation‐based refinement methods have shown consistent improvement of protein models. There had been progress in the extent of refinement for a few years since the idea of ensemble averaging of sampled conformations emerged. There was little progress in CASP12 because conformational sampling was not sufficiently diverse due to harmonic restraints. During CASP13, a new refinement method was tested that achieved significant improvements over CASP12. The new method intended to address previous bottlenecks in the refinement problem by introducing new features. Flat‐bottom harmonic restraints replaced harmonic restraints, sampling was performed iteratively, and a new scoring function and selection criteria were used. The new protocol expanded conformational sampling at reduced computational costs. In addition to overall improvements, some models were refined significantly to near‐experimental accuracy.

     
  4. The Class I Major Histocompatibility Complex (MHC) is a central protein in immunology as it binds to intracellular peptides and displays them at the cell surface for recognition by T-cells. The structural analysis of bound peptide-MHC complexes (pMHCs) holds the promise of interpretable and general binding prediction (i.e., testing whether a given peptide binds to a given MHC). However, structural analysis is limited in part by the difficulty in modelling pMHCs given the size and flexibility of the peptides that can be presented by MHCs. This article describes APE-Gen (Anchored Peptide-MHC Ensemble Generator), a fast method for generating ensembles of bound pMHC conformations. APE-Gen generates an ensemble of bound conformations by iterated rounds of (i) anchoring the ends of a given peptide near known pockets in the binding site of the MHC, (ii) sampling peptide backbone conformations with loop modelling, and then (iii) performing energy minimization to fix steric clashes, accumulating conformations at each round. APE-Gen takes only minutes on a standard desktop to generate tens of bound conformations, and we show the ability of APE-Gen to sample conformations found in X-ray crystallography even when only sequence information is used as input. APE-Gen has the potential to be useful for its scalability (i.e., modelling thousands of pMHCs or even non-canonical longer peptides) and for its use as a flexible search tool. We demonstrate an example for studying cross-reactivity. 
  5. Abstract

    The FastDesign protocol in the molecular modeling program Rosetta iterates between sequence optimization and structure refinement to stabilize de novo designed protein structures and complexes. FastDesign has been used previously to design novel protein folds and assemblies with important applications in research and medicine. To promote sampling of alternative conformations and sequences, FastDesign includes stages where the energy landscape is smoothened by reducing repulsive forces. Here, we discover that this process disfavors larger amino acids in the protein core because the protein compresses in the early stages of refinement. By testing alternative ramping strategies for the repulsive weight, we arrive at a scheme that produces lower energy designs with more native‐like sequence composition in the protein core. We further validate the protocol by designing and experimentally characterizing over 4000 proteins and show that the new protocol produces higher stability proteins.

     