Title: An in silico proteomics screen to predict and prioritize protein–protein interactions dependent on post-translationally modified motifs
Abstract
Motivation: The development of proteomic methods for the characterization of domain/motif interactions has greatly expanded our understanding of signal transduction. However, proteomics-based binding screens have limitations, including that the queried tissue or cell type may not harbor all potential interacting partners or the post-translational modifications (PTMs) required for the interaction. Therefore, we sought a generalizable, complementary in silico approach to identify and prioritize potentially novel motif- and PTM-dependent binding partners.
Results: We used as an initial example the interaction between the Src homology 2 (SH2) domains of the adaptor proteins CT10 regulator of kinase (CRK) and CRK-like (CRKL) and phosphorylated YXXP motifs. Employing well-curated, publicly available resources, we scored and prioritized potential CRK/CRKL–SH2 interactors possessing signature characteristics of known interacting partners. Our approach gave high priority scores to 102 of the >9000 YXXP motif-containing proteins. Among these 102 were 21 of the 25 curated CRK/CRKL–SH2-binding partners, a more than 80-fold enrichment. Several predicted interactors were validated biochemically. To demonstrate generalized applicability, we used our workflow to predict protein–protein interactions dependent upon motif-specific arginine methylation. Our data demonstrate the applicability of our approach to, conceivably, any modular binding domain that recognizes a specific post-translationally modified motif.
Supplementary information: Supplementary data are available at Bioinformatics online.
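The first step of such a screen, locating candidate YXXP motifs in a protein sequence, and the enrichment arithmetic quoted in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `find_yxxp` and `fold_enrichment` are hypothetical, the toy sequence is invented, and the scan ignores phosphorylation status and every other scoring criterion used in the paper.

```python
import re

# YXXP: tyrosine, any two residues, proline. The lookahead lets
# overlapping motifs (e.g. "YYAPP") all be reported.
YXXP = re.compile(r"(?=(Y[A-Z]{2}P))")


def find_yxxp(sequence):
    """Return (1-based tyrosine position, motif) for each YXXP match."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in YXXP.finditer(seq)]


def fold_enrichment(hits_in_subset, subset_size, hits_in_background, background_size):
    """Hit rate in the prioritized subset divided by hit rate in the background."""
    return (hits_in_subset / subset_size) / (hits_in_background / background_size)


# Hypothetical toy sequence, not a real protein.
toy = "MAYDAPSSYYAPPK"
print(find_yxxp(toy))  # [(3, 'YDAP'), (9, 'YYAP'), (10, 'YAPP')]

# Counts from the abstract: 21 of 25 curated partners fall within the
# 102 high-priority proteins. The abstract gives the background only as
# ">9000", so 9000 is an assumed lower bound, not the reported total.
print(round(fold_enrichment(21, 102, 25, 9000), 1))  # 74.1
```

With the assumed background of exactly 9000 proteins the ratio is about 74; because the abstract states only that the background exceeds 9000, this is a lower bound, and the reported figure of more than 80-fold is consistent with a somewhat larger background.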
Award ID(s):
1656510
PAR ID:
10393404
Publisher / Repository:
Oxford University Press
Journal Name:
Bioinformatics
Volume:
34
Issue:
22
ISSN:
1367-4803
Page Range / eLocation ID:
p. 3898-3906
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation: Metal-binding proteins have a central role in maintaining life processes. Nearly one-third of known protein structures contain metal ions that are used for a variety of needs, such as catalysis, DNA/RNA binding and protein structure stability. Identifying metal-binding proteins is thus crucial for understanding the mechanisms of cellular activity. However, experimental annotation of protein metal-binding potential is severely lacking, while computational techniques are often imprecise and of limited applicability. Results: We developed a novel machine learning-based method, mebipred, for identifying metal-binding proteins from sequence-derived features. This method is over 80% accurate in recognizing proteins that bind metal ion-containing ligands; the specific identity of 11 ubiquitously present metal ions can also be annotated. mebipred is reference-free, i.e. no sequence alignments are involved, and is thus faster than alignment-based methods; it is also more accurate than other sequence-based prediction methods. Additionally, mebipred can identify protein metal-binding capabilities from short sequence stretches, e.g. translated sequencing reads, and thus may be useful for the annotation of metal requirements of metagenomic samples. We performed an analysis of available microbiome data and found that ocean, hot spring sediment and soil microbiomes use a more diverse set of metals than human host-related ones. For human microbiomes, physiological conditions explain the observed metal preferences. Similarly, subtle changes in ocean sample ion concentration affect the abundance of relevant metal-binding proteins. These results highlight mebipred’s utility in analyzing microbiome metal requirements. Availability and implementation: mebipred is available as a web server at services.bromberglab.org/mebipred and as a standalone package at https://pypi.org/project/mymetal/. Supplementary information: Supplementary data are available at Bioinformatics online.
  2. Abstract Visualizing and measuring molecular-scale interactions in living cells represents a major challenge, but recent advances in single-molecule super-resolution microscopy are bringing us closer to achieving this goal. Single-molecule super-resolution microscopy enables high-resolution and sensitive imaging of the positions and movement of molecules in living cells. HP1 proteins are important regulators of gene expression because they selectively recognize and bind H3K9 methylated (H3K9me) histones to form heterochromatin-associated protein complexes that silence gene expression, but several important mechanistic details of this process remain unexplored. Here, we extended live-cell single-molecule tracking studies in fission yeast to determine how HP1 proteins interact with their binding partners in the nucleus. We measured how genetic perturbations that affect H3K9me alter the diffusive properties of HP1 proteins and their binding partners, and we inferred their most likely interaction sites. Our results demonstrate that H3K9 methylation spatially restricts HP1 proteins and their interactors, thereby promoting ternary complex formation on chromatin while simultaneously suppressing off-chromatin binding. Rather than treating H3K9me as an inert platform that directs HP1 binding, our studies propose a novel function for H3K9me in promoting ternary complex formation by enhancing the specificity and stimulating the assembly of HP1–protein complexes in living cells.
  3. Abstract Monobodies are synthetic, customizable, non-immunoglobulin protein binders that are invaluable to basic and applied research and have considerable potential as future therapeutics and diagnostic tools. The ability to reversibly control their binding to their targets on demand would significantly expand their applications in biotechnology, medicine, and research. Here we present, as proof-of-principle, the development of a light-controlled monobody (OptoMB) that works in vitro and in cells and whose affinity for its SH2-domain target shifts 330-fold upon illumination. We demonstrate that our αSH2-OptoMB can be used to purify SH2-tagged proteins directly from crude E. coli extract, achieving 99.8% purity and over 40% yield in a single purification step. Because they can be designed to bind any protein of interest, OptoMBs have the potential to find powerful new applications as light-switchable binders of untagged proteins with the temporal and spatial precision afforded by light.
  4. Abstract Motivation: MHC Class I proteins play an important role in immunotherapy by presenting immunogenic peptides to anti-tumor immune cells. The repertoires of peptides for various MHC Class I proteins are distinct, which is reflected in their diverse binding motifs. To characterize binding motifs for MHC Class I proteins, in vitro experiments have been conducted to screen peptides with high binding affinities to hundreds of given MHC Class I proteins. However, with tens of thousands of known MHC Class I proteins, conducting in vitro experiments for all of them is infeasible, and thus a more efficient and scalable way to characterize binding motifs is needed. Results: We presented a de novo generation framework, coined PepPPO, to characterize the binding motif of any given MHC Class I protein by generating the repertoire of peptides it presents. PepPPO leverages a reinforcement learning agent with a mutation policy to mutate random input peptides into positive presented ones. Using PepPPO, we characterized binding motifs for around 10 000 known human MHC Class I proteins with and without experimental data. These computed motifs demonstrated high similarities with those derived from experimental data. In addition, we found that the motifs could be used for the rapid screening of neoantigens at a much lower time cost than previous deep-learning methods. Availability and implementation: The software can be found at https://github.com/minrq/pMHC. Supplementary information: Supplementary data are available at Bioinformatics online.
  5. Abstract Motivation: Given a protein of unknown function, fast identification of similar protein structures from the Protein Data Bank (PDB) is a critical step for inferring its biological function. Such structural neighbors can provide evolutionary insights into protein conformation, interfaces and binding sites that are not detectable from sequence similarity. However, the computational cost of performing pairwise structural alignment against all structures in PDB is prohibitive. Alignment-free approaches have been introduced to enable fast but coarse comparisons by representing each protein as a vector of structure features or fingerprints and only computing similarity between vectors. As a notable example, FragBag represents each protein by a ‘bag of fragments’, which is a vector of frequencies of contiguous short backbone fragments from a predetermined library. Despite being efficient, the accuracy of FragBag is unsatisfactory because its backbone fragment library may not be optimally constructed and long-range interacting patterns are omitted. Results: Here we present a new approach to learning effective structural motif presentations using deep learning. We develop DeepFold, a deep convolutional neural network model to extract structural motif features of a protein structure. We demonstrate that DeepFold substantially outperforms FragBag on protein structural search on a non-redundant protein structure database and a set of newly released structures. Remarkably, DeepFold not only extracts meaningful backbone segments but also finds important long-range interacting motifs for structural comparison. We expect that DeepFold will provide new insights into the evolution and hierarchical organization of protein structural motifs. Availability and implementation: https://github.com/largelymfs/DeepFold