Title: Intramolecular structural heterogeneity altered by long-range contacts in an intrinsically disordered protein

Short-range interactions and long-range contacts drive the 3D folding of structured proteins. The proteins’ structure has a direct impact on their biological function. However, nearly 40% of the eukaryotic proteome is composed of intrinsically disordered proteins (IDPs) and protein regions that fluctuate between ensembles of numerous conformations. Therefore, to understand their biological function, it is critical to describe how the structural ensemble statistics correlate with the IDPs’ amino acid sequence. Here, using small-angle X-ray scattering and time-resolved Förster resonance energy transfer (trFRET), we study the intramolecular structural heterogeneity of the neurofilament low intrinsically disordered tail domain (NFLt). Using theoretical results of polymer physics, we find that the Flory scaling exponent of NFLt subsegments correlates linearly with their net charge, ranging from statistics of ideal to self-avoiding chains. Surprisingly, measuring the same segments in the context of the whole NFLt protein, we find that regardless of the peptide sequence, the segments’ structural statistics are more expanded than when measured independently. Our findings show that while polymer physics can, to some level, relate the IDP’s sequence to its ensemble conformations, long-range contacts between distant amino acids play a crucial role in determining intramolecular structures. This emphasizes the necessity of advanced polymer theories to fully describe IDP ensembles, in the hope that they will allow us to model their biological function.
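
For readers unfamiliar with the term, the Flory scaling exponent mentioned above can be written out as the standard polymer-physics scaling law below. This is a textbook sketch, not a formula taken from the paper: the symbols R, R_0, N, and ν are introduced here for illustration, and which size measure the authors actually fit (for example, radius of gyration or end-to-end distance) is an assumption.

```latex
% Standard Flory scaling law (textbook form; symbols are illustrative assumptions)
% R   : characteristic size of a chain segment (e.g., radius of gyration)
% R_0 : prefactor with units of length
% N   : number of residues in the segment
% \nu : Flory scaling exponent
\[
  R \simeq R_0 \, N^{\nu},
  \qquad
  \nu = \tfrac{1}{2} \ \text{(ideal chain)},
  \qquad
  \nu \approx 0.588 \ \text{(self-avoiding chain in three dimensions)}.
\]
```

In this picture, the abstract's phrase "ranging from statistics of ideal to self-avoiding chains" corresponds to ν increasing from about 1/2 toward about 0.59 as the segment net charge grows.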

 
Award ID(s):
2113302
PAR ID:
10519975
Publisher / Repository:
PNAS
Journal Name:
Proceedings of the National Academy of Sciences
Volume:
120
Issue:
30
ISSN:
0027-8424
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Intrinsically disordered proteins and protein regions (IDPs) are prevalent in all proteomes and are essential to cellular function. Unlike folded proteins, IDPs exist in an ensemble of dissimilar conformations. Despite this structural plasticity, intramolecular interactions create sequence-specific structural biases that determine an IDP ensemble’s three-dimensional shape. Such structural biases can be key to IDP function and are often measured in vitro, but whether those biases are preserved inside the cell is unclear. Here we show that structural biases in IDP ensembles found in vitro are recapitulated inside human-derived cells. We further reveal that structural biases can change in a sequence-dependent manner due to changes in the intracellular milieu, subcellular localization, and intramolecular interactions with tethered well-folded domains. We propose that the structural sensitivity of IDP ensembles can be leveraged for biological function, can be the underlying cause of IDP-driven pathology or can be used to design disorder-based biosensors and actuators.

     
  2. Abstract

    Proteins are inherently dynamic, and their conformational ensembles are functionally important in biology. Large-scale motions may govern protein structure–function relationships, and numerous transient but stable conformations of intrinsically disordered proteins (IDPs) can play a crucial role in biological function. Investigating conformational ensembles to understand regulations and disease-related aggregations of IDPs is challenging both experimentally and computationally. In this paper we first introduced an unsupervised deep learning-based model, termed Internal Coordinate Net (ICoN), which learns the physical principles of conformational changes from molecular dynamics (MD) simulation data. Second, we selected interpolating data points in the learned latent space that rapidly identify novel synthetic conformations with sophisticated and large-scale sidechain and backbone arrangements. Third, with the highly dynamic amyloid-β1-42 (Aβ42) monomer, our deep learning model provided a comprehensive sampling of Aβ42’s conformational landscape. Analysis of these synthetic conformations revealed conformational clusters that can be used to rationalize experimental findings. Additionally, the method can identify novel conformations with important interactions in atomistic detail that are not included in the training data. New synthetic conformations showed distinct sidechain rearrangements that are probed by our EPR and amino acid substitution studies. This approach is highly transferable and can be used for any available data for training. The work also demonstrated the ability of deep learning to utilize learned natural atomistic motions in protein conformation sampling.

     
  3. Disordered binding regions (DBRs), which are embedded within intrinsically disordered proteins or regions (IDPs or IDRs), enable IDPs or IDRs to mediate multiple protein-protein interactions. DBR-protein complexes were collected from the Protein Data Bank for which two or more DBRs having different amino acid sequences bind to the same (100% sequence identical) globular protein partner, a type of interaction herein called many-to-one binding. Two distinct binding profiles were identified: independent and overlapping. For the overlapping binding profiles, the distinct DBRs interact by means of almost identical binding sites (herein called “similar”), or the binding sites contain both common and divergent interaction residues (herein called “intersecting”). Further analysis of the sequence and structural differences among these three groups indicate how IDP flexibility allows different segments to adjust to similar, intersecting, and independent binding pockets. 
  4. We propose a framework to convert the protein intrinsic disorder content to structural entropy (H) using Shannon’s information theory (IT). The structural capacity (C), which is the sum of H and structural information (I), is equal to the amino acid sequence length of the protein. The structural entropy of the residues spans a continuous spectrum, ranging from 0 (fully ordered) to 1 (fully disordered), consistent with Shannon’s IT, which scores the fully-determined state 0 and the fully-uncertain state 1. The intrinsically disordered proteins (IDPs) in a living cell may participate in maintaining the high-energy-low-entropy state. In addition, under this framework, the biological functions performed by proteins and associated with the order or disorder of their 3D structures could be explained in terms of information-gains or entropy-losses, or the reverse processes.
  5. Much attention is being paid to conformational biases in the ensembles of intrinsically disordered proteins. However, it is currently unknown whether or how conformational biases within the disordered ensembles of foldable proteins affect function in vivo. Recently, we demonstrated that water can be a good solvent for unfolded polypeptide chains, even those with a hydrophobic and charged sequence composition typical of folded proteins. These results run counter to the generally accepted model that protein folding begins with hydrophobicity-driven chain collapse. Here we investigate what other features, beyond amino acid composition, govern chain collapse. We found that local clustering of hydrophobic and/or charged residues leads to significant collapse of the unfolded ensemble of pertactin, a secreted autotransporter virulence protein from Bordetella pertussis, as measured by small angle X-ray scattering (SAXS). Sequence patterns that lead to collapse also correlate with increased intermolecular polypeptide chain association and aggregation. Crucially, sequence patterns that support an expanded conformational ensemble enhance pertactin secretion to the bacterial cell surface. Similar sequence pattern features are enriched across the large and diverse family of autotransporter virulence proteins, suggesting sequence patterns that favor an expanded conformational ensemble are under selection for efficient autotransporter protein secretion, a necessary prerequisite for virulence. More broadly, we found that sequence patterns that lead to more expanded conformational ensembles are enriched across water-soluble proteins in general, suggesting protein sequences are under selection to regulate collapse and minimize protein aggregation, in addition to their roles in stabilizing folded protein structures.

     