Title: Non-covalent Lasso Entanglements in Folded Proteins: Prevalence, Functional Implications, and Evolutionary Significance
One-third of protein domains in the CATH database contain a recently discovered tertiary topological motif: non-covalent lasso entanglements, in which a segment of the protein backbone forms a loop closed by non-covalent interactions between residues and is threaded one or more times by the N- or C-terminal backbone segment. How frequently this structural motif appears across the proteomes of organisms is unknown, and the correlation of these motifs with various classes of protein function and biological processes has not been quantified. Here, using a combination of protein crystal structures, AlphaFold2 predictions, and Gene Ontology terms, we show that in E. coli, S. cerevisiae and H. sapiens, 71%, 52% and 49% of globular proteins, respectively, contain one or more non-covalent lasso entanglements in their native fold, and that some of these are highly complex with multiple threading events. Further, proteins containing these tertiary motifs are consistently enriched in certain functions and biological processes across these organisms and depleted in others, strongly indicating an influence of evolutionary selection pressures acting positively and negatively on the distribution of these motifs. Together, these results demonstrate that non-covalent lasso entanglements are widespread and indicate they may be extensively utilized for protein function and subcellular processes, thus impacting phenotype.
Award ID(s):
2031584
PAR ID:
10616339
Author(s) / Creator(s):
Publisher / Repository:
Elsevier
Date Published:
Journal Name:
Journal of Molecular Biology
Volume:
436
Issue:
6
ISSN:
0022-2836
Page Range / eLocation ID:
168459
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Motivation: Given a protein of unknown function, fast identification of similar protein structures from the Protein Data Bank (PDB) is a critical step for inferring its biological function. Such structural neighbors can provide evolutionary insights into protein conformation, interfaces and binding sites that are not detectable from sequence similarity. However, the computational cost of performing pairwise structural alignment against all structures in PDB is prohibitively expensive. Alignment-free approaches have been introduced to enable fast but coarse comparisons by representing each protein as a vector of structure features or fingerprints and only computing similarity between vectors. As a notable example, FragBag represents each protein by a ‘bag of fragments’, which is a vector of frequencies of contiguous short backbone fragments from a predetermined library. Despite being efficient, the accuracy of FragBag is unsatisfactory because its backbone fragment library may not be optimally constructed and long-range interacting patterns are omitted. Results: Here we present a new approach to learning effective structural motif presentations using deep learning. We develop DeepFold, a deep convolutional neural network model to extract structural motif features of a protein structure. We demonstrate that DeepFold substantially outperforms FragBag on protein structural search on a non-redundant protein structure database and a set of newly released structures. Remarkably, DeepFold not only extracts meaningful backbone segments but also finds important long-range interacting motifs for structural comparison. We expect that DeepFold will provide new insights into the evolution and hierarchical organization of protein structural motifs. Availability and implementation: https://github.com/largelymfs/DeepFold
  2. Hydrogels comprise a class of soft materials which are extremely useful in a number of contexts, for example as matrix-mimetic biomaterials for applications in regenerative medicine and drug delivery. One particular subclass of hydrogels consists of materials prepared through non-covalent physical crosslinking afforded by supramolecular recognition motifs. The dynamic, reversible, and equilibrium-governed features of these molecular-scale motifs often transcend length-scales to endow the resulting hydrogels with these same properties on the bulk scale. In efforts to engineer hydrogels of all types with more precise or application-specific uses, inclusion of stimuli-responsive sol–gel transformations has been broadly explored. In the context of biomedical uses, temperature is an interesting stimulus which has been the focus of numerous hydrogel designs, supramolecular or otherwise. Most supramolecular motifs are inherently temperature-sensitive, with elevated temperatures commonly disfavoring motif formation and/or accelerating its dissociation. In addition, supramolecular motifs have also been incorporated for physical crosslinking in conjunction with polymeric or macromeric building blocks which themselves exhibit temperature-responsive changes to their properties. Through molecular-scale engineering of supramolecular recognition, and selection of a particular motif or polymeric/macromeric backbone, it is thus possible to devise a number of supramolecular hydrogel materials to empower a variety of future biomedical applications.
  3. Organisms from all kingdoms of life depend on Late Embryogenesis Abundant (LEA) proteins to survive desiccation. LEA proteins are divided into broad families distinguished by the presence of family‐specific motif sequences. The LEA_4 family, characterized by 11‐residue motifs, plays a crucial role in the desiccation tolerance of numerous species. However, the role of these motifs in the function of LEA_4 proteins is unclear, with some studies finding that they recapitulate the function of full‐length LEA_4 proteins in vivo, and other studies finding the opposite result. In this study, we characterize the ability of LEA_4 motifs to protect a desiccation‐sensitive enzyme, citrate synthase (CS), from loss of function during desiccation. We show here that LEA_4 motifs not only prevent the loss of function of CS during desiccation but also can do so more robustly via synergistic interactions with cosolutes. Our analysis further suggests that cosolutes induce synergy with LEA_4 motifs in a manner that correlates with transfer free energy. This research advances our understanding of LEA_4 proteins by demonstrating that during desiccation their motifs can protect specific clients to varying degrees and that their protective capacity is modulated by their chemical environment. Our findings extend beyond the realm of desiccation tolerance, offering insights into the interplay between IDPs and cosolutes. By investigating the function of LEA_4 motifs, we highlight broader strategies for understanding protein stability and function.
  4. Musier-Forsyth, Karin (Ed.)
    RNA-binding proteins play crucial roles in various cellular functions, and contain abundant disordered protein regions. The disordered regions in RNA-binding proteins are rich in repetitive sequences, such as poly-K/R, poly-N/Q, poly-A, and poly-G residues. Our bioinformatic analysis identified a largely neglected repetitive sequence family we define as electronegative clusters (ENCs) that contain acidic residues and/or phosphorylation sites. The abundance and length of ENCs exceed other known repetitive sequences. Despite their abundance, the functions of ENCs in RNA-binding proteins are still elusive. To investigate the impacts of ENCs on protein stability, RNA-binding affinity, and specificity, we selected one RNA-binding protein, the ribosomal biogenesis factor 15 (Nop15) as a model. We found that the Nop15 ENC increases protein stability and inhibits nonspecific RNA binding, but minimally interferes with specific RNA binding. To investigate the effect of ENCs on sequence specificity of RNA binding, we grafted an ENC to another RNA-binding protein, Ser/Arg-rich splicing factor 3 (SRSF3). Using RNA Bind-n-Seq, we found that the engineered ENC inhibits disparate RNA motifs differently, instead of weakening all RNA motifs to the same extent. The motif site directly involved in electrostatic interaction is more susceptible to the ENC inhibition. These results suggest that one of the functions of ENCs is to regulate RNA binding via electrostatic interaction. This is consistent with our finding that ENCs are also overrepresented in DNA-binding proteins, while underrepresented in halophiles, in which nonspecific nucleic acid binding is inhibited by high concentrations of salts.
  5. Clathrin-mediated endocytosis depends on polymerization of a branched actin network to provide force for membrane invagination. A key regulator in branched actin network formation is actin capping protein (CP), which binds to the barbed end of actin filaments to prevent the addition or loss of actin subunits. CP was thought to stochastically bind actin filaments, but recent evidence shows CP is regulated by a group of proteins containing CP-interacting (CPI) motifs. Importantly, how CPI motif proteins function together to regulate CP is poorly understood. Here, we show Aim21 and Bsp1 work synergistically to recruit CP to the endocytic actin network in budding yeast through their CPI motifs, which also allosterically modulate capping strength. In contrast, twinfilin works downstream of CP recruitment, regulating the turnover of CP through its CPI motif and a non-allosteric mechanism. Collectively, our findings reveal how three CPI motif proteins work together to regulate CP in a stepwise fashion during endocytosis. 