

Title: Universal protein misfolding intermediates can bypass the proteostasis network and remain soluble and less functional
Abstract Some misfolded protein conformations can bypass proteostasis machinery and remain soluble in vivo. This is an unexpected observation, as cellular quality control mechanisms should remove misfolded proteins. Three questions, then, are: how do long-lived, soluble, misfolded proteins bypass proteostasis? How widespread are such misfolded states? And how long do they persist? We address these questions using coarse-grain molecular dynamics simulations of the synthesis, termination, and post-translational dynamics of a representative set of cytosolic E. coli proteins. We predict that half of proteins exhibit misfolded subpopulations that bypass molecular chaperones, avoid aggregation, and will not be rapidly degraded, with some misfolded states persisting for months or longer. The surface properties of these misfolded states are native-like, suggesting they will remain soluble, while self-entanglements make them long-lived kinetic traps. In terms of function, we predict that one-third of proteins can misfold into soluble less-functional states. For the heavily entangled protein glycerol-3-phosphate dehydrogenase, limited-proteolysis mass spectrometry experiments interrogating misfolded conformations of the protein are consistent with the structural changes predicted by our simulations. These results therefore provide an explanation for how proteins can misfold into soluble conformations with reduced functionality that can bypass proteostasis, and indicate, unexpectedly, that this may be a widespread phenomenon.
Award ID(s):
2045844
NSF-PAR ID:
10388186
Journal Name:
Nature Communications
Volume:
13
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The ubiquitin-26S proteasome system (UPS) and autophagy are two major protein degradation machineries encoded in all eukaryotic organisms. While the UPS is responsible for the turnover of short-lived and/or soluble misfolded proteins under normal growth conditions, the autophagy-lysosomal/vacuolar protein degradation machinery is activated under stress conditions to remove long-lived proteins in the form of aggregates, either soluble or insoluble, in the cytoplasm and damaged organelles. Recent discoveries suggested an integrative function of these two seemingly independent systems for maintaining proteome homeostasis. One such integration is represented by their reciprocal degradation, in which the small 76-amino acid peptide, ubiquitin, plays an important role as the central signaling hub. In this review, we summarized the current knowledge about the activity control of proteasome and autophagosome at their structural organization, biophysical states, and turnover levels from yeast and mammals to plants. Through comprehensive literature studies, we presented puzzling questions that are waiting to be solved and proposed exciting new research directions that may shed light on the molecular mechanisms underlying the biological function of protein degradation.
  2. Abstract

    Many proteins must interact with molecular chaperones to achieve their native state in the cell. Yet, how chaperone binding‐site characteristics affect the folding process is poorly understood. The ubiquitous Hsp70 chaperone system prevents client‐protein aggregation by holding unfolded conformations and by unfolding misfolded states. Hsp70 binding sites of client proteins comprise a nonpolar core surrounded by positively charged residues. However, a detailed analysis of Hsp70 binding sites on a proteome‐wide scale is still lacking. Further, it is not known whether proteins undergo some degree of folding while chaperone bound. Here, we begin to address the above questions by identifying Hsp70 binding sites in 2258 Escherichia coli (E. coli) proteins. We find that most proteins bear at least one Hsp70 binding site and that the number of Hsp70 binding sites is directly proportional to protein size. Aggregation propensity upon release from the ribosome correlates with number of Hsp70 binding sites only in the case of large proteins. Interestingly, Hsp70 binding sites are more solvent‐exposed than other nonpolar sites, in protein native states. Our findings show that the majority of E. coli proteins are systematically enabled to interact with Hsp70 even if this interaction only takes place during a fraction of the protein lifetime. In addition, our data suggest that some conformational sampling may take place within Hsp70‐bound states, due to the solvent exposure of some chaperone binding sites in native proteins. In all, we propose that Hsp70‐chaperone‐binding traits have evolved to favor Hsp70‐assisted protein folding devoid of aggregation.

     
  3. This data set for the manuscript entitled "Design of Peptides that Fold and Self-Assemble on Graphite" includes all files needed to run and analyze the simulations described in this manuscript in the molecular dynamics software NAMD, as well as the output of the simulations. The files are organized into directories corresponding to the figures of the main text and supporting information. They include molecular model structure files (NAMD psf or Amber prmtop format), force field parameter files (in CHARMM format), initial atomic coordinates (pdb format), NAMD configuration files, Colvars configuration files, NAMD log files, and NAMD output including restart files (in binary NAMD format) and trajectories in dcd format (downsampled to 10 ns per frame). Analysis is controlled by shell scripts (Bash-compatible) that call VMD Tcl scripts or Python scripts. These scripts and their output are also included.

    Version: 2.0

    Changes versus version 1.0 are the addition of the free energy of folding, adsorption, and pairing calculations (Sim_Figure-7) and shifting of the figure numbers to accommodate this addition.


    Conventions Used in These Files
    ===============================

    Structure Files
    ----------------
    - graph_*.psf or sol_*.psf (original NAMD (XPLOR?) format psf file including atom details (type, charge, mass), as well as definitions of bonds, angles, dihedrals, and impropers for each dipeptide.)

    - graph_*.pdb or sol_*.pdb (initial coordinates before equilibration)
    - repart_*.psf (same as the above psf files, but the masses of non-water hydrogen atoms have been repartitioned by VMD script repartitionMass.tcl)
    - freeTop_*.pdb (same as the above pdb files, but the carbons of the lower graphene layer have been placed at a single z value and marked for restraints in NAMD)
    - amber_*.prmtop (combined topology and parameter files for Amber force field simulations)
    - repart_amber_*.prmtop (same as the above prmtop files, but the masses of non-water hydrogen atoms have been repartitioned by ParmEd)

    Force Field Parameters
    ----------------------
    CHARMM format parameter files:
    - par_all36m_prot.prm (CHARMM36m FF for proteins)
    - par_all36_cgenff_no_nbfix.prm (CGenFF v4.4 for graphene) The NBFIX parameters are commented out since they are only needed for aromatic halogens and we use only the CG2R61 type for graphene.
    - toppar_water_ions_prot_cgenff.str (CHARMM water and ions with NBFIX parameters needed for protein and CGenFF included and others commented out)

    Template NAMD Configuration Files
    ---------------------------------
    These contain the most commonly used simulation parameters. They are called by the other NAMD configuration files (which are in the namd/ subdirectory):
    - template_min.namd (minimization)
    - template_eq.namd (NPT equilibration with lower graphene fixed)
    - template_abf.namd (for adaptive biasing force)

    Minimization
    -------------
    - namd/min_*.0.namd

    Equilibration
    -------------
    - namd/eq_*.0.namd

    Adaptive biasing force calculations
    -----------------------------------
    - namd/eabfZRest7_graph_chp1404.0.namd
    - namd/eabfZRest7_graph_chp1404.1.namd (continuation of eabfZRest7_graph_chp1404.0.namd)

    Log Files
    ---------
    For each NAMD configuration file given in the last two sections, there is a log file with the same prefix, which gives the text output of NAMD. For instance, the output of namd/eabfZRest7_graph_chp1404.0.namd is eabfZRest7_graph_chp1404.0.log.

    Simulation Output
    -----------------
    The simulation output files (which match the names of the NAMD configuration files) are in the output/ directory. Files with the extensions .coor, .vel, and .xsc are, respectively, coordinates in NAMD binary format, velocities in NAMD binary format, and extended system information (including cell size) in text format. Files with the extension .dcd give the trajectory of the atomic coordinates over time (and also include system cell information). Due to storage limitations, large DCD files have been omitted or replaced with new DCD files having the prefix stride50_, which include only every 50th frame. The time between frames in these files is 50 * 50000 steps/frame * 4 fs/step = 10 ns. The system cell trajectory for the NPT runs is also included as output/eq_*.xst.
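
    As a quick check of the frame-spacing arithmetic above, the following minimal Python sketch reproduces the 10 ns figure from the three stated values. The function names are hypothetical and not part of the data set, and the mapping from frame index to time assumes that frame 0 corresponds to time zero, which the description does not state.

```python
# Values quoted in the description of the stride50_ DCD files.
STRIDE = 50               # only every 50th frame is kept
STEPS_PER_FRAME = 50_000  # MD steps between frames in the original DCD
TIMESTEP_FS = 4           # integration timestep in femtoseconds
FS_PER_NS = 1_000_000     # femtoseconds per nanosecond


def frame_interval_ns(stride=STRIDE, steps_per_frame=STEPS_PER_FRAME,
                      timestep_fs=TIMESTEP_FS):
    """Return the time between consecutive downsampled frames, in ns."""
    return stride * steps_per_frame * timestep_fs / FS_PER_NS


def frame_time_ns(frame_index):
    """Return the simulation time of a downsampled frame, in ns.

    Assumption: frame 0 is at time zero.
    """
    return frame_index * frame_interval_ns()


if __name__ == "__main__":
    print(frame_interval_ns())  # 10.0
    print(frame_time_ns(3))     # 30.0
```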

    Scripts
    -------
    Files with the .sh extension can be found throughout. These usually provide the highest level control for submission of simulations and analysis. Look to these as a guide to what is happening. If there are scripts with step1_*.sh and step2_*.sh, they are intended to be run in order, with step1_*.sh first.
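
    The run-order convention for step scripts can be sketched in Python as below. This is only an illustration: the helper name and the example file names are hypothetical, and the sketch assumes that the number following "step" fully determines the order.

```python
import re
from pathlib import Path

# Matches names such as step1_something.sh (step number, underscore, .sh).
STEP_PATTERN = re.compile(r"^step(\d+)_.*\.sh$")


def order_step_scripts(names):
    """Return the step scripts from `names`, sorted by their step number.

    Names that do not follow the stepN_*.sh convention are ignored.
    """
    numbered = []
    for name in names:
        match = STEP_PATTERN.match(Path(name).name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the ordering.
    example = ["step2_analyze.sh", "run.sh", "step10_plot.sh", "step1_setup.sh"]
    for script in order_step_scripts(example):
        print(script)
```

    Sorting on the integer step number rather than the raw file name keeps a script such as step10_*.sh after step2_*.sh.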


    CONTENTS
    ========

    The directory contents are as follows. The directories Sim_Figure-1 and Sim_Figure-8 include README.txt files that describe the files and naming conventions used throughout this data set.

    Sim_Figure-1: Simulations of N-acetylated C-amidated amino acids (Ac-X-NHMe) at the graphite–water interface.

    Sim_Figure-2: Simulations of different peptide designs (including acyclic, disulfide cyclized, and N-to-C cyclized) at the graphite–water interface.

    Sim_Figure-3: MM-GBSA calculations of different peptide sequences for a folded conformation and 5 misfolded/unfolded conformations.

    Sim_Figure-4: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-5: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 295 K.

    Sim_Figure-5_replica: Temperature replica exchange molecular dynamics simulations for the peptide cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) with 20 replicas for temperatures from 295 to 454 K.

    Sim_Figure-6: Simulation of the peptide molecule cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) in free solution (no graphite).

    Sim_Figure-7: Free energy calculations for folding, adsorption, and pairing for the peptide CHP1404 (sequence: cyc(GTGSGTG-GPGG-GCGTGTG-SGPG)). For folding, we calculate the PMF as a function of RMSD by replica-exchange umbrella sampling (in the subdirectory Folding_CHP1404_Graphene/). We make the same calculation in solution, which required 3 separate replica-exchange umbrella sampling calculations (in the subdirectory Folding_CHP1404_Solution/). Both PMF of RMSD calculations for the scrambled peptide are in Folding_scram1404/. For adsorption, calculation of the PMF for the orientational restraints and the calculation of the PMF along z (the distance between the graphene sheet and the center of mass of the peptide) are in Adsorption_CHP1404/ and Adsorption_scram1404/. The actual calculation of the free energy is done by a shell script ("doRestraintEnergyError.sh") in the 1_free_energy/ subsubdirectory. Processing of the PMFs must be done first in the 0_pmf/ subsubdirectory. Finally, files for free energy calculations of pair formation for CHP1404 are found in the Pair/ subdirectory.

    Sim_Figure-8: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) where the peptides are far above the graphene–water interface in the initial configuration.

    Sim_Figure-9: Two replicates of a simulation of nine peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-9_scrambled: Two replicates of a simulation of nine peptide molecules with the control sequence cyc(GGTPTTGGGGGGSGGPSGTGGC) at the graphite–water interface at 370 K.

    Sim_Figure-10: Adaptive biasing for calculation of the free energy of the folded peptide as a function of the angle between its long axis and the zigzag directions of the underlying graphene sheet.

     

    This material is based upon work supported by the US National Science Foundation under grant no. DMR-1945589. A majority of the computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by NSF grants CHE-1726332, CNS-1006860, EPS-1006860, and EPS-0919443. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562, through allocation BIO200030. 
  4. The journey by which proteins navigate their energy landscapes to their native structures is complex, involving (and sometimes requiring) many cellular factors and processes operating in partnership with a given polypeptide chain’s intrinsic energy landscape. The cytosolic environment and its complement of chaperones play critical roles in granting many proteins safe passage to their native states; however, it is challenging to interrogate the folding process for large numbers of proteins in a complex background with most biophysical techniques. Hence, most chaperone-assisted protein refolding studies are conducted in defined buffers on single purified clients. Here, we develop a limited proteolysis–mass spectrometry approach paired with an isotope-labeling strategy to globally monitor the structures of refolding Escherichia coli proteins in the cytosolic medium and with the chaperones, GroEL/ES (Hsp60) and DnaK/DnaJ/GrpE (Hsp70/40). GroEL can refold the majority (85%) of the E. coli proteins for which we have data and is particularly important for restoring acidic proteins and proteins with high molecular weight, trends that come to light because our assay measures the structural outcome of the refolding process itself, rather than binding or aggregation. For the most part, DnaK and GroEL refold a similar set of proteins, supporting the view that despite their vastly different structures, these two chaperones unfold misfolded states, as one mechanism in common. Finally, we identify a cohort of proteins that are intransigent to being refolded with either chaperone. We suggest that these proteins may fold most efficiently cotranslationally, and then remain kinetically trapped in their native conformations.
  5. Abstract

    Folding of ribozymes into well-defined tertiary structures usually requires divalent cations. How Mg2+ ions direct the folding kinetics has been a long-standing unsolved problem because experiments cannot detect the positions and dynamics of ions. To address this problem, we used molecular simulations to dissect the folding kinetics of the Azoarcus ribozyme by monitoring the path each molecule takes to reach the folded state. We quantitatively establish that Mg2+ binding to specific sites, coupled with counter-ion release of monovalent cations, stimulates the formation of secondary and tertiary structures, leading to diverse pathways that include direct rapid folding and trapping in misfolded structures. In some molecules, key tertiary structural elements form when Mg2+ ions bind to specific RNA sites at the earliest stages of the folding, leading to specific collapse and rapid folding. In others, the formation of non-native base pairs, whose rearrangement is needed to reach the folded state, is the rate-limiting step. Escape from energetic traps, driven by thermal fluctuations, occurs readily. In contrast, the transition to the native state from long-lived topologically trapped native-like metastable states is extremely slow. Specific collapse and formation of energetically or topologically frustrated states occur early in the assembly process.

     