The ability to accurately predict protein–protein interactions is critically important for understanding major cellular processes. However, current experimental and computational approaches for identifying them are technically challenging and have had limited success. We propose a new computational method for predicting protein–protein interactions using only primary sequence information. It uses the concept of physicochemical similarity to determine which interactions are most likely to occur. In our approach, the physicochemical features of proteins from different organisms are extracted using bioinformatics tools. These features are then used in a machine-learning method to identify successful protein–protein interactions via correlation analysis. We found that the property that correlates most strongly with protein–protein interactions for all studied organisms is dipeptide amino acid composition (the frequency of specific amino acid pairs in a protein sequence). While current approaches often overlook the specificity of protein–protein interactions in different organisms, our method yields context-specific features that determine protein–protein interactions. The analysis is applied to the bacterial two-component system that includes histidine kinase and transcriptional response regulators, as well as to the barnase–barstar complex, demonstrating the method’s versatility across different biological systems. Our approach can be applied to predict protein–protein interactions in any biological system, providing an important tool for investigating the mechanisms of complex biological processes.
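To make the dipeptide amino acid composition feature concrete, the following is a minimal sketch of how it could be computed from a primary sequence. It is not the authors' pipeline: the function name `dipeptide_composition`, the restriction to the 20 standard amino acids, and the choice to skip pairs containing any other character are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(sequence):
    """Return the relative frequency of each of the 400 possible dipeptides.

    Hypothetical helper: overlapping residue pairs are counted along the
    sequence, and pairs containing a non-standard character are skipped.
    """
    seq = sequence.upper()
    # Overlapping pairs: residues (0,1), (1,2), (2,3), ...
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    # Fixed-length feature vector keyed by dipeptide, zeros for absent pairs.
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    # Print only the dipeptides that actually occur in the toy sequence.
    print({k: v for k, v in features.items() if v > 0})
```

Each protein maps to a fixed-length vector of 400 frequencies, which is the kind of sequence-only feature that could then be passed to a correlation analysis or a machine-learning model.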
Quantitative approaches for decoding the specificity of the human T cell repertoire
T cell receptor (TCR)-peptide-major histocompatibility complex (pMHC) interactions play a vital role in initiating immune responses against pathogens, and the specificity of TCR-pMHC interactions is crucial for developing optimized therapeutic strategies. The advent of high-throughput immunological and structural evaluation of TCR and pMHC has provided an abundance of data for computational approaches that aim to predict favorable TCR-pMHC interactions. Current models are constructed using information on protein sequences, structures, or a combination of both, and use a variety of statistical learning-based approaches to identify the rules governing specificity. This review examines the current theoretical, computational, and deep learning approaches for identifying TCR-pMHC recognition pairs, placing emphasis on each method’s mathematical approach, predictive performance, and limitations.
- Award ID(s):
- 2019745
- PAR ID:
- 10512292
- Editor(s):
- Antunes, Dinler Amaral
- Publisher / Repository:
- Frontiers in Immunology
- Date Published:
- Journal Name:
- Frontiers in Immunology
- Volume:
- 14
- ISSN:
- 1664-3224
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- Reliable prediction of T cell specificity against antigenic signatures is a formidable task, complicated by the immense diversity of T cell receptor and antigen sequence space and the resulting limited availability of training sets for inferential models. Recent modeling efforts have demonstrated the advantage of incorporating structural information to overcome the need for extensive training sequence data, yet disentangling the heterogeneous TCR-antigen interface to accurately predict MHC-allele-restricted TCR-peptide interactions has remained challenging. Here, we present RACER-m, a coarse-grained structural model leveraging key biophysical information from the diversity of publicly available TCR-antigen crystal structures. Explicit inclusion of structural content substantially reduces the required number of training examples and maintains reliable predictions of TCR-recognition specificity and sensitivity across diverse biological contexts. Our model capably identifies biophysically meaningful point-mutant peptides that affect binding affinity, distinguishing its ability to predict TCR specificity for point mutants from that of alternative sequence-based methods. The model is broadly applicable to studies involving both closely related and structurally diverse TCR-peptide pairs.
- The diverse T cell receptor (TCR) repertoire confers the ability to recognize an almost unlimited array of antigens. Characterization of antigen specificity of tumor-infiltrating lymphocytes (TILs) is key for understanding antitumor immunity and for guiding the development of effective immunotherapies. Here, we report a large-scale comprehensive examination of the TCR landscape of TILs across the spectrum of pediatric brain tumors, the leading cause of cancer-related mortality in children. We show that a T cell clonality index can inform patient prognosis, where higher clonality is associated with more favorable outcomes. Moreover, assessment of TCR similarity groups revealed patient clusters with defined human leukocyte antigen associations. Computational analysis of these clusters identified putative tumor antigens and peptides as targets for antitumor T cell immunity, which were functionally validated by T cell stimulation assays in vitro. Together, this study presents a framework for tumor antigen prediction based on in situ and in silico TIL TCR analyses. We propose that TCR-based investigations should inform tumor classification and precision immunotherapy development.
- T cell receptor (TCR) studies have grown substantially with advances in T cell receptor repertoire sequencing (TCR-Seq). Analyzing TCR-Seq data requires computational skills to run TCR repertoire analysis tools. However, biomedical researchers with limited computational backgrounds face numerous obstacles to properly and efficiently using bioinformatics tools for analyzing TCR-Seq data. Here we report pyTCR, a computational notebook-based solution for comprehensive and scalable TCR-Seq data analysis. Computational notebooks, which combine code, calculations, and visualization, provide users with a high level of flexibility and transparency for the analysis. Additionally, computational notebooks have been shown to be user-friendly and suitable for researchers with limited computational skills. Our tool has a rich set of functionalities including various TCR metrics, statistical analysis, and customizable visualizations. Applying pyTCR to large and diverse TCR-Seq datasets will enable effective and flexible analysis of large-scale TCR-Seq data and eventually facilitate new discoveries.
- Tropical cyclone rainfall (TCR) extensively affects coastal communities, primarily through inland flooding. The impact of global climate change on TCR is complex and debated. This study uses an XGBoost machine learning model with 19 years of meteorological data and hourly satellite precipitation observations to predict TCR for individual storms. The model identifies dust optical depth (DOD) as a key predictor that clearly improves performance. The model also uncovers a nonlinear, boomerang-shaped relationship between Saharan dust and TCR, with a TCR peak at 0.06 DOD and a sharp decrease thereafter. This indicates a shift from microphysical enhancement to radiative suppression at high dust concentrations. The model also highlights meaningful correlations between TCR and meteorological factors such as sea surface temperature and equivalent potential temperature near storm cores. These findings illustrate the effectiveness of machine learning in predicting TCR and understanding its driving factors and physical mechanisms.