Title: Protein codes promote selective subcellular compartmentalization
Cells have evolved mechanisms to distribute ~10 billion protein molecules to subcellular compartments where diverse proteins involved in shared functions must assemble. In this study, we demonstrate that proteins with shared functions share amino acid sequence codes that guide them to compartment destinations. We developed a protein language model, ProtGPS, that predicts with high performance the compartment localization of human proteins excluded from the training set. ProtGPS successfully guided generation of novel protein sequences that selectively assemble in the nucleolus. ProtGPS identified pathological mutations that change this code and lead to altered subcellular localization of proteins. Our results indicate that protein sequences contain not only a folding code but also a previously unrecognized code governing their distribution to diverse subcellular compartments.
Award ID(s):
2044895
PAR ID:
10667665
Author(s) / Creator(s):
Publisher / Repository:
Science
Date Published:
Journal Name:
Science
Volume:
387
Issue:
6738
ISSN:
0036-8075
Page Range / eLocation ID:
1095 to 1101
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Mammalian cells process information through coordinated spatiotemporal regulation of proteins. Engineering cellular networks thus relies on efficient tools for regulating protein levels in specific subcellular compartments. To address the need to manipulate the extent and dynamics of protein localization, we developed a platform technology for the target-specific control of protein destination. This platform is based on bifunctional molecules comprising a target-specific nanobody and universal sequences determining target subcellular localization or degradation rate. We demonstrate that nanobody-mediated localization depends on the expression level of the target and the nanobody, and the extent of target subcellular localization can be regulated by combining multiple target-specific nanobodies with distinct localization or degradation sequences. We also show that this platform for nanobody-mediated target localization and degradation can be regulated transcriptionally and integrated within orthogonal genetic circuits to achieve the desired temporal control over spatial regulation of target proteins. The platform reported in this study provides an innovative tool to control protein subcellular localization, which will be useful to investigate protein function and regulate large synthetic gene circuits.
  2. Abstract Subcellular compartmentalization is a universal feature of all cells. Spatially distinct compartments, be they lipid‐ or protein‐based, enable cells to optimize local reaction environments, store nutrients, and sequester toxic processes. Prokaryotes generally lack intracellular membrane systems and usually rely on protein‐based compartments and organelles to regulate and optimize their metabolism. Encapsulins are one of the most diverse and widespread classes of prokaryotic protein compartments. They self‐assemble into icosahedral protein shells and are able to specifically internalize dedicated cargo enzymes. This review discusses the structural diversity of encapsulin protein shells, focusing on shell assembly, symmetry, and dynamics. The properties and functions of pores found within encapsulin shells will also be discussed. In addition, fusion and insertion domains embedded within encapsulin shell protomers will be highlighted. Finally, future research directions for basic encapsulin biology, with a focus on the structural understanding of encapsulins, are briefly outlined.
  3. The 3′ untranslated regions (3′ UTRs) of mRNAs serve as hubs for post-transcriptional control as the targets of microRNAs (miRNAs) and RNA-binding proteins (RBPs). Sequences in 3′ UTRs confer alterations in mRNA stability, direct mRNA localization to subcellular regions, and impart translational control. Thousands of mRNAs are localized to subcellular compartments in neurons—including axons, dendrites, and synapses—where they are thought to undergo local translation. Despite an established role for 3′ UTR sequences in imparting mRNA localization in neurons, the specific RNA sequences and structural features at play remain poorly understood. The nervous system selectively expresses longer 3′ UTR isoforms via alternative polyadenylation (APA). The regulation of APA in neurons and the neuronal functions of longer 3′ UTR mRNA isoforms are starting to be uncovered. Surprising roles for 3′ UTRs are emerging beyond the regulation of protein synthesis and include roles as RBP delivery scaffolds and regulators of alternative splicing. Evidence is also emerging that 3′ UTRs can be cleaved, leading to stable, isolated 3′ UTR fragments which are of unknown function. Mutations in 3′ UTRs are implicated in several neurological disorders—more studies are needed to uncover how these mutations impact gene regulation and what is their relationship to disease severity. 
  4. Background: Long non-coding Ribonucleic Acids (lncRNAs) can be localized to different cellular compartments, such as the nuclear and the cytoplasmic regions. Their biological functions are influenced by the region of the cell where they are located. Compared to the vast number of lncRNAs, only a relatively small proportion have annotations regarding their subcellular localization. It would be helpful if those few annotated lncRNAs could be leveraged to develop predictive models for localization of other lncRNAs. Methods: Conventional computational methods use q-mer profiles from lncRNA sequences and train machine learning models such as support vector machines and logistic regression with the profiles. These methods focus on the exact q-mer. Given possible sequence mutations and other uncertainties in genomic sequences and their role in biological function, a consideration of these variabilities might improve our ability to model lncRNAs and their localization. Thus, we build on inexact q-mers and use machine learning/deep learning techniques to study three specific problems in lncRNA subcellular localization, namely, prediction of lncRNA localization using inexact q-mers, the issue of whether lncRNA localization is cell-type-specific, and the notion of switching (lncRNA) genes. Results: We performed our analysis using data on lncRNA localization across 15 cell lines. Our results showed that using inexact q-mers (with q = 6) can improve the lncRNA localization prediction performance compared to using exact q-mers. Further, we showed that lncRNA localization, in general, is not cell-line-specific. We also identified a category of lncRNAs which switch cellular compartments between different cell lines (we call them switching lncRNAs). These switching lncRNAs complicate the problem of predicting lncRNA localization using machine learning models, showing that lncRNA localization is still a major challenge.
  5. The phosphoregulation of proteins with multiple phosphorylation sites is governed by biochemical reaction networks that can exhibit multistable behavior. However, the behavior of such networks is typically studied in a single reaction volume, while cells are spatially organized into compartments that can exchange proteins. In this work, we use stochastic simulations to study the impact of compartmentalization on a two-site phosphorylation network. We characterize steady states and fluctuation-driven transitions between them as a function of the rate of protein exchange between two compartments. Surprisingly, the average time spent in a state before stochastically switching to another depends nonmonotonically on the protein exchange rate, with the most frequent switching occurring at intermediate exchange rates. At sufficiently small exchange rates, the state of the system and mean switching time are controlled largely by fluctuations in the balance of enzymes in each compartment. This leads to negatively correlated states in the compartments. For large exchange rates, the two compartments behave as a single effective compartment. However, when the compartmental volumes are unequal, the behavior differs from a single compartment with the same total volume. These results demonstrate that exchange of proteins between distinct compartments can regulate the emergent behavior of a common signaling motif. 