
Title: Many-to-one binding by intrinsically disordered protein regions
Disordered binding regions (DBRs), which are embedded within intrinsically disordered proteins or regions (IDPs or IDRs), enable IDPs or IDRs to mediate multiple protein-protein interactions. DBR-protein complexes were collected from the Protein Data Bank in which two or more DBRs with different amino acid sequences bind to the same (100% sequence identical) globular protein partner, a type of interaction herein called many-to-one binding. Two distinct binding profiles were identified: independent and overlapping. For the overlapping binding profiles, the distinct DBRs either interact by means of almost identical binding sites (herein called “similar”), or their binding sites contain both common and divergent interaction residues (herein called “intersecting”). Further analysis of the sequence and structural differences among these three groups indicates how IDP flexibility allows different segments to adjust to similar, intersecting, and independent binding pockets.
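The three groups named in the abstract (independent, similar, intersecting) can be illustrated with a minimal sketch that compares the partner-protein residues contacted by two DBRs. This is not the authors' procedure: the function name `classify_binding`, the use of a Jaccard index, and the 0.8 cutoff for "almost identical" are all assumptions made only for illustration.

```python
def classify_binding(site_a, site_b, similar_threshold=0.8):
    """Label a pair of binding sites on the same globular partner.

    site_a, site_b: iterables of partner residue numbers contacted by each DBR.
    similar_threshold: hypothetical Jaccard cutoff for "almost identical" sites
    (an assumption, not a value taken from the paper).
    """
    a, b = set(site_a), set(site_b)
    if not a or not b:
        raise ValueError("each binding site needs at least one residue")

    shared = a & b
    if not shared:
        # No common interaction residues: the DBRs use separate pockets.
        return "independent"

    # Fraction of all contacted residues that both DBRs share.
    jaccard = len(shared) / len(a | b)
    return "similar" if jaccard >= similar_threshold else "intersecting"


if __name__ == "__main__":
    # Made-up residue numbers, for demonstration only.
    print(classify_binding({10, 11, 12}, {40, 41}))
    print(classify_binding({10, 11, 12, 13}, {12, 13, 20, 21}))
```

A set-based overlap measure is used here only because it maps directly onto the wording "common and divergent interaction residues"; any real analysis would need a contact definition (for example a distance cutoff) that the abstract does not specify.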
Journal Name:
Pacific symposium on biocomputing
Sponsoring Org:
National Science Foundation
More Like this
  1. Cells adapt and respond to changes by regulating the activity of their genes. To turn genes on or off, they use a family of proteins called transcription factors. Transcription factors influence specific but overlapping groups of genes, so that each gene is controlled by several transcription factors that act together like a dimmer switch to regulate gene activity. The presence of transcription factors attracts proteins such as the Mediator complex, which activates genes by gathering the protein machines that read the genes. The more transcription factors are found near a specific gene, the more strongly they attract Mediator and the more active the gene is. A specific region on the transcription factor called the activation domain is necessary for this process. The biochemical sequences of these domains vary greatly between species, yet activation domains from, for example, yeast and human proteins are often interchangeable. To understand why this is the case, Sanborn et al. analyzed the genome of baker’s yeast and identified 150 activation domains, each very different in sequence. Three-quarters of them bound to a subunit of the Mediator complex called Med15. Sanborn et al. then developed a machine learning algorithm to predict activation domains in both yeast and humans. This algorithm also showed that negatively charged and greasy regions on the activation domains were essential for activation by the Mediator complex. Further analyses revealed that activation domains used different poses to bind multiple sites on Med15, a behavior known as ‘fuzzy’ binding. This creates a high overall affinity even though the binding strength at each individual site is low, enabling the protein complexes to remain dynamic. These weak interactions together permit fine control over the activity of several genes, allowing cells to respond quickly and precisely to many changes.
The computer algorithm used here provides a new way to identify activation domains across species and could improve our understanding of how living things grow, adapt and evolve. It could also give new insights into mechanisms of disease, particularly cancer, where transcription factors are often faulty.
  2. Dutch, Rebecca Ellis. (Ed.)
    ABSTRACT Opium poppy mosaic virus (OPMV) is a recently discovered umbravirus in the family Tombusviridae. OPMV has a plus-sense genomic RNA (gRNA) of 4,241 nucleotides (nt) from which replication protein p35 and p35 extension product p98, the RNA-dependent RNA polymerase (RdRp), are expressed. Movement proteins p27 (long distance) and p28 (cell to cell) are expressed from a 1,440-nt subgenomic RNA (sgRNA2). A highly conserved structure was identified just upstream from the sgRNA2 transcription start site in all umbraviruses, which includes a carmovirus consensus sequence, denoting generation by an RdRp-mediated mechanism. OPMV also has a second sgRNA of 1,554 nt (sgRNA1) that starts just downstream of a canonical exoribonuclease-resistant sequence (xrRNA D). sgRNA1 codes for a 30-kDa protein in vitro that is in frame with p28 and cannot be synthesized in other umbraviruses. Eliminating sgRNA1 or truncating the p30 open reading frame (ORF) without affecting p28 substantially reduced accumulation of OPMV gRNA, suggesting a functional role for the protein. The 652-nt 3′ untranslated region of OPMV contains two 3′ cap-independent translation enhancers (3′ CITEs), a T-shaped structure (TSS) near its 3′ end, and a Barley yellow dwarf virus-like translation element (BTE) in the central region. Only the BTE is functional in luciferase reporter constructs containing gRNA or sgRNA2 5′ sequences in vivo, which differs from how umbravirus 3′ CITEs were used in a previous study. Similarly to most 3′ CITEs, the OPMV BTE links to the 5′ end via a long-distance RNA-RNA interaction. Analysis of 14 BTEs revealed additional conserved sequences and structural features beyond the previously identified 17-nt conserved sequence. IMPORTANCE Opium poppy mosaic virus (OPMV) is an umbravirus in the family Tombusviridae.
We determined that OPMV accumulates two similarly sized subgenomic RNAs (sgRNAs), with the smaller known to code for proteins expressed from overlapping open reading frames. The slightly larger sgRNA1 has a 5′ end just upstream from a previously predicted xrRNA D site, identifying this sgRNA as an unusually long product produced by exoribonuclease trimming. Although four umbraviruses have similar predicted xrRNA D sites, only sgRNA1 of OPMV can code for a protein that is an extension product of umbravirus ORF4. Inability to generate the sgRNA or translate this protein was associated with reduced gRNA accumulation in vivo. We also characterized the OPMV BTE structure, a 3′ cap-independent translation enhancer (3′ CITE). Comparisons of 13 BTEs with the OPMV BTE revealed additional stretches of sequence similarity beyond the 17-nt signature sequence, as well as conserved structural features not previously recognized in these 3′ CITEs.
  3. Musier-Forsyth, Karin (Ed.)
    RNA-binding proteins play crucial roles in various cellular functions, and contain abundant disordered protein regions. The disordered regions in RNA-binding proteins are rich in repetitive sequences, such as poly-K/R, poly-N/Q, poly-A, and poly-G residues. Our bioinformatic analysis identified a largely neglected repetitive sequence family we define as electronegative clusters (ENCs) that contain acidic residues and/or phosphorylation sites. The abundance and length of ENCs exceed other known repetitive sequences. Despite their abundance, the functions of ENCs in RNA-binding proteins are still elusive. To investigate the impacts of ENCs on protein stability, RNA-binding affinity, and specificity, we selected one RNA-binding protein, the ribosomal biogenesis factor 15 (Nop15), as a model. We found that the Nop15 ENC increases protein stability and inhibits nonspecific RNA binding, but minimally interferes with specific RNA binding. To investigate the effect of ENCs on sequence specificity of RNA binding, we grafted an ENC to another RNA-binding protein, Ser/Arg-rich splicing factor 3 (SRSF3). Using RNA Bind-n-Seq, we found that the engineered ENC inhibits disparate RNA motifs differently, instead of weakening all RNA motifs to the same extent. The motif site directly involved in electrostatic interaction is more susceptible to the ENC inhibition. These results suggest that one of the functions of ENCs is to regulate RNA binding via electrostatic interaction. This is consistent with our finding that ENCs are also overrepresented in DNA-binding proteins, while underrepresented in halophiles, in which nonspecific nucleic acid binding is inhibited by high concentrations of salts.
  4. Phase separation of intrinsically disordered proteins (IDPs) commonly underlies the formation of membraneless organelles, which compartmentalize molecules intracellularly in the absence of a lipid membrane. Identifying the protein sequence features responsible for IDP phase separation is critical for understanding physiological roles and pathological consequences of biomolecular condensation, as well as for harnessing phase separation for applications in bioinspired materials design. To expand our knowledge of sequence determinants of IDP phase separation, we characterized variants of the intrinsically disordered RGG domain from LAF-1, a model protein involved in phase separation and a key component of P granules. Based on a predictive coarse-grained IDP model, we identified a region of the RGG domain that has high contact probability and is highly conserved between species; deletion of this region significantly disrupts phase separation in vitro and in vivo. We determined the effects of charge patterning on phase behavior through sequence shuffling. We designed sequences with significantly increased phase separation propensity by shuffling the wild-type sequence, which contains well-mixed charged residues, to increase charge segregation. This result indicates the natural sequence is under negative selection to moderate this mode of interaction. We measured the contributions of tyrosine and arginine residues to phase separation experimentally through mutagenesis studies and computationally through direct interrogation of different modes of interaction using all-atom simulations. Finally, we show that despite these sequence perturbations, the RGG-derived condensates remain liquid-like. Together, these studies advance our fundamental understanding of key biophysical principles and sequence features important to phase separation.
  5. Intrinsically disordered regions (IDRs) in proteins are often targets of combinatorial posttranslational modifications, which serve to regulate protein structure and function. Emerging evidence suggests that the N-terminal tails of G protein γ subunits, which are essential components of heterotrimeric G proteins, are intrinsically disordered, phosphorylation-dependent determinants of G protein signaling. Here, we found that the yeast Gγ subunit Ste18 underwent combinatorial, multisite phosphorylation events within its N-terminal IDR. G protein–coupled receptor (GPCR) activation and osmotic stress induced phosphorylation at Ser7, whereas glucose and acid stress induced phosphorylation at Ser3, which was a quantitative indicator of intracellular pH. Each site was phosphorylated by a distinct set of kinases, and phosphorylation of one site affected phosphorylation of the other, as determined through exposure to serial stimuli and through phosphosite mutagenesis. Last, we showed that phosphorylation resulted in changes in IDR structure and that different combinations of phosphorylation events modulated the activation rate and amplitude of the downstream mitogen-activated protein kinase Fus3. These data place Gγ subunits among intrinsically disordered proteins that undergo combinatorial posttranslational modifications that govern signaling pathway output.