

Search for: All records

Award ID contains: 2128068


  1. Abstract Intrinsically disordered regions (IDRs) are ubiquitous across all domains of life and play a range of functional roles. While folded domains are generally well described by a stable three-dimensional structure, IDRs exist in a collection of interconverting states known as an ensemble. This structural heterogeneity means that IDRs are largely absent from the Protein Data Bank, contributing to a lack of computational approaches to predict ensemble conformational properties from sequence. Here we combine rational sequence design, large-scale molecular simulations and deep learning to develop ALBATROSS, a deep-learning model for predicting ensemble dimensions of IDRs, including the radius of gyration, end-to-end distance, polymer-scaling exponent and ensemble asphericity, directly from sequences at a proteome-wide scale. ALBATROSS is lightweight, easy to use and accessible as both a locally installable software package and a point-and-click-style interface via Google Colab notebooks. We first demonstrate the applicability of our predictors by examining the generalizability of sequence–ensemble relationships in IDRs. Then, we leverage the high-throughput nature of ALBATROSS to characterize the sequence-specific biophysical behavior of IDRs within and between proteomes. 
  2. ABSTRACT Intrinsically disordered protein regions (IDRs) are ubiquitous across all kingdoms of life and play a variety of essential cellular roles. IDRs exist in a collection of structurally distinct conformers known as an ensemble. An IDR’s amino acid sequence determines its ensemble, which in turn can play an important role in dictating molecular function. Yet a clear link connecting IDR sequence, its ensemble properties, and its molecular function in living cells has not been directly established. Here, we set out to test this sequence-ensemble-function paradigm using a novel computational method (GOOSE) that enables the rational design of libraries of IDRs by systematically varying specific sequence properties. Using ensemble FRET, we measured the ensemble dimensions of a library of rationally designed IDRs in human-derived cell lines, revealing how IDR sequence influences ensemble dimensions in situ. Furthermore, we show that the interplay between sequence and ensemble can tune an IDR’s ability to sense changes in cell volume - a de novo molecular function for these synthetic sequences. Our results establish biophysical rules for intracellular sequence-ensemble relationships, enable a new route for understanding how IDR sequences map to function in live cells, and set the ground for the design of synthetic IDRs with de novo function.
  3. Abstract To survive extreme drying (anhydrobiosis), many organisms, spanning every kingdom of life, accumulate intrinsically disordered proteins (IDPs). For decades, the ability of anhydrobiosis‐related IDPs to form transient amphipathic helices has been suggested to be important for promoting desiccation tolerance. However, evidence empirically supporting the necessity and/or sufficiency of helicity in mediating anhydrobiosis is lacking. Here, we demonstrate that the linker region of CAHS D, a desiccation‐related IDP from the tardigrade Hypsibius exemplaris, that contains significant helical structure, is the protective portion of this protein. Perturbing the sequence composition and grammar of the linker region of CAHS D, through the insertion of helix‐breaking prolines, modulating the identity of charged residues, or replacement of hydrophobic amino acids with serine or glycine residues results in variants with different degrees of helical structure. Importantly, correlation of protective capacity and helical content in variants generated through different helix perturbing modalities does not show as strong a trend, suggesting that while helicity is important, it is not the only property that makes a protein protective during desiccation. These results provide direct evidence for the decades‐old theory that helicity of desiccation‐related IDPs is linked to their anhydrobiotic capacity.
  4. Abstract Tardigrades are microscopic animals that survive desiccation by inducing biostasis. To survive drying, tardigrades rely on intrinsically disordered CAHS proteins, which also function to prevent perturbations induced by drying in vitro and in heterologous systems. CAHS proteins have been shown to form gels both in vitro and in vivo, which has been speculated to be linked to their protective capacity. However, the sequence features and mechanisms underlying gel formation and the necessity of gelation for protection have not been demonstrated. Here we report a mechanism of fibrillization and gelation for CAHS D similar to that of intermediate filament assembly. We show that in vitro, gelation restricts molecular motion, immobilizing and protecting labile material from the harmful effects of drying. In vivo, we observe that CAHS D forms fibrillar networks during osmotic stress. Fibrillar networking of CAHS D improves survival of osmotically shocked cells. We observe two emergent properties associated with fibrillization: (i) prevention of cell volume change and (ii) reduction of metabolic activity during osmotic shock. We find that there is no significant correlation between maintenance of cell volume and survival, while there is a significant correlation between reduced metabolism and survival. Importantly, CAHS D's fibrillar network formation is reversible and metabolic rates return to control levels after CAHS fibers are resolved. This work provides insights into how tardigrades induce reversible biostasis through the self‐assembly of labile CAHS gels.
  5. Proteins must be hydrated to function. Desiccation, a common event in an increasing number of ecosystems, can drive proteome-wide unfolding and aggregation. For cells to survive, proteins must disaggregate and retain their function upon rehydration. The molecular determinants that underlie protein desiccation resistance remain unknown. Here, we use mass spectrometry to show that some proteins possess an innate ability to survive dehydration and subsequent rehydration. Structural analysis correlates the ability of proteins to resist desiccation with their surface area chemistry. Remarkably, highly resistant proteins are responsible for the production of the cell's building blocks - amino acids, metabolites, and sugars. Conversely, those proteins that are desiccation-sensitive are responsible for ribosome biogenesis. As a result, the rehydrated proteome is preferentially enriched with metabolite and small molecule producers and depleted of ribosomes - the cell's heaviest consumers. We propose this functional bias allows cells to kickstart their metabolism and promote cell survival upon rehydration. 
  6. The conformational ensemble and function of intrinsically disordered proteins (IDPs) are sensitive to their solution environment. The inherent malleability of disordered proteins combined with the exposure of their residues accounts for this sensitivity. One context in which IDPs play important roles that is concomitant with massive changes to the intracellular environment is during desiccation (extreme drying). The ability of organisms to survive desiccation has long been linked to the accumulation of high levels of cosolutes such as trehalose or sucrose as well as the enrichment of IDPs, such as late embryogenesis abundant (LEA) proteins or cytoplasmic abundant heat soluble (CAHS) proteins. Despite knowing that IDPs play important roles and are co-enriched alongside endogenous, species-specific cosolutes during desiccation, little is known mechanistically about how IDP-cosolute interactions influence desiccation tolerance. Here, we test the notion that the protective function of desiccation-related IDPs is enhanced through conformational changes induced by endogenous cosolutes. We find that desiccation-related IDPs derived from four different organisms spanning two LEA protein families and the CAHS protein family synergize best with endogenous cosolutes during drying to promote desiccation protection. Yet the structural parameters of protective IDPs do not correlate with synergy for either CAHS or LEA proteins. We further demonstrate that for CAHS, but not LEA proteins, synergy is related to self-assembly and the formation of a gel. Our results demonstrate that functional synergy between IDPs and endogenous cosolutes is a convergent desiccation protection strategy seen among different IDP families and organisms, yet the mechanisms underlying this synergy differ between IDP families.
  7. Denatured, unfolded, and intrinsically disordered proteins (collectively referred to here as unfolded proteins) can be described using analytical polymer models. These models capture various polymeric properties and can be fit to simulation results or experimental data. However, the model parameters commonly require users’ decisions, making them useful for data interpretation but less clearly applicable as stand-alone reference models. Here we use all-atom simulations of polypeptides in conjunction with polymer scaling theory to parameterize an analytical model of unfolded polypeptides that behave as ideal chains (ν = 0.50). The model, which we call the analytical Flory random coil (AFRC), requires only the amino acid sequence as input and provides direct access to probability distributions of global and local conformational order parameters. The model defines a specific reference state to which experimental and computational results can be compared and normalized. As a proof-of-concept, we use the AFRC to identify sequence-specific intramolecular interactions in simulations of disordered proteins. We also use the AFRC to contextualize a curated set of 145 different radii of gyration obtained from previously published small-angle X-ray scattering experiments of disordered proteins. The AFRC is implemented as a stand-alone software package and is also available via a Google Colab notebook. In summary, the AFRC provides a simple-to-use reference polymer model that can guide intuition and aid in interpreting experimental or simulation results. 
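Several abstracts above refer to ensemble dimensions (radius of gyration, end-to-end distance, asphericity) and to an ideal-chain reference state (ν = 0.50). As a rough illustration only, the sketch below computes these quantities from bead coordinates using textbook polymer-physics definitions and checks the ideal-chain relation between the mean squared end-to-end distance and the mean squared radius of gyration on a freely jointed random walk. It is not the ALBATROSS or AFRC code and uses neither package's API. The function names, the unit bond length, and the asphericity definition (one common form based on the gyration tensor) are all assumptions made for this example.

```python
import math
import random


def _centered(coords):
    """Return coordinates shifted so their centroid is at the origin."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    return [[p[k] - center[k] for k in range(3)] for p in coords]


def radius_of_gyration(coords):
    """Root-mean-square distance of the beads from their centroid."""
    shifted = _centered(coords)
    return math.sqrt(sum(x * x + y * y + z * z for x, y, z in shifted) / len(shifted))


def end_to_end_distance(coords):
    """Distance between the first and last bead."""
    return math.dist(coords[0], coords[-1])


def asphericity(coords):
    """Assumed definition: 1 - 3 * (l1*l2 + l2*l3 + l3*l1) / (l1 + l2 + l3)**2,
    where l1..l3 are the gyration-tensor eigenvalues. Both sums are tensor
    invariants, so no eigen-decomposition is needed. 0 = sphere-like, 1 = rod."""
    shifted = _centered(coords)
    n = len(shifted)
    t = [[sum(p[a] * p[b] for p in shifted) / n for b in range(3)] for a in range(3)]
    trace = t[0][0] + t[1][1] + t[2][2]
    minors = ((t[0][0] * t[1][1] - t[0][1] ** 2)
              + (t[0][0] * t[2][2] - t[0][2] ** 2)
              + (t[1][1] * t[2][2] - t[1][2] ** 2))
    return 1.0 - 3.0 * minors / trace ** 2


def random_walk(n_bonds, rng):
    """Freely jointed chain with unit-length bonds (a hypothetical ideal chain)."""
    coords = [(0.0, 0.0, 0.0)]
    for _ in range(n_bonds):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        last = coords[-1]
        coords.append(tuple(last[k] + v[k] / norm for k in range(3)))
    return coords


def ideal_chain_ratio(n_bonds=100, n_chains=500, seed=1):
    """Estimate <Re^2> / <Rg^2> over an ensemble of random walks."""
    rng = random.Random(seed)
    re2 = rg2 = 0.0
    for _ in range(n_chains):
        chain = random_walk(n_bonds, rng)
        re2 += end_to_end_distance(chain) ** 2
        rg2 += radius_of_gyration(chain) ** 2
    return re2 / rg2


if __name__ == "__main__":
    print("Re^2/Rg^2 for random walks:", ideal_chain_ratio())
```

For long ideal chains the ratio approaches 6, so the printed estimate should land near that value, with some sampling noise and a small finite-length offset. Ensembles that deviate from it would, under these assumptions, indicate behavior that departs from the ideal-chain reference.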