- Award ID(s): 1803786
- PAR ID: 10165817
- Date Published:
- Journal Name: The Journal of Physical Chemistry Letters
- Volume: 10
- Issue: 18
- ISSN: 1948-7185
- Page Range / eLocation ID: 5667 to 5673
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Structures of proteins and protein–protein complexes are determined by the same physical principles and thus share a number of similarities. At the same time, there could be differences because, in order to function, proteins interact with other molecules, undergo conformational changes, and so forth, which might impose different restraints on the tertiary versus quaternary structures. This study focuses on structural properties of protein–protein interfaces in comparison with the protein core, based on the wealth of currently available structural data and new structure‐based approaches. The results showed that physicochemical characteristics, such as amino acid composition, residue–residue contact preferences, and hydrophilicity/hydrophobicity distributions, are similar in the protein core and protein–protein interfaces. On the other hand, characteristics that reflect evolutionary pressure, such as structural composition and packing, are largely different. The results provide important insight into fundamental properties of protein structure and function. At the same time, the results contribute to a better understanding of the ways to dock proteins. Recent progress in predicting structures of individual proteins follows the advancement of deep learning techniques and new approaches to residue coevolution data. The protein core could potentially provide large amounts of data for the application of deep learning to docking. However, our results showed that the core motifs are significantly different from those at protein–protein interfaces, and thus may not be directly useful for docking. At the same time, this difference may help to overcome a major obstacle in the application of coevolutionary data to docking: discrimination of the intramolecular information not directly relevant to docking.
-
Abstract Biochemical methods can reveal stable protein‐protein interactions occurring within cells, but the ability to observe transient events and to visualize the subcellular localization of protein‐protein interactions in cells and tissues in situ provides important additional information. The Proximity Ligation Assay® (PLA) offers the opportunity to visualize the subcellular location of such interactions at endogenous protein levels, provided that the probes that recognize the target proteins are within 40 nm. This sensitive technique not only elucidates protein‐protein interactions, but also can reveal post‐translational protein modifications. The technique is useful even in cases where material is limited, such as when paraffin‐embedded clinical specimens are the only available material, as well as after experimental intervention in 2D and 3D model systems. Here we describe the basic protocol for using the commercially available Proximity Ligation Assay™ materials (Sigma‐Aldrich, St. Louis, MO), and incorporate details to aid the researcher in successfully performing the experiments. © 2020 Wiley Periodicals LLC.
Basic Protocol 1: Proximity ligation assay
Support Protocol 1: Antigen retrieval method for formalin‐fixed, paraffin‐embedded tissues
Support Protocol 2: Creation of custom PLA probes using the Duolink™ In Situ Probemaker Kit when commercially available probes are not suitable
Basic Protocol 2: Imaging, quantification, and analysis of PLA signals
-
Abstract Motivation: Protein language models based on the transformer architecture are increasingly improving performance on protein prediction tasks, including secondary structure, subcellular localization, and more. Despite being trained only on protein sequences, protein language models appear to implicitly learn protein structure. This paper investigates whether sequence representations learned by protein language models encode structural information and to what extent.
Results: We address this by evaluating protein language models on remote homology prediction, where identifying remote homologs from sequence information alone requires structural knowledge, especially in the “twilight zone” of very low sequence identity. Through rigorous testing at progressively lower sequence identities, we profile the performance of protein language models ranging from millions to billions of parameters in a zero-shot setting. Our findings indicate that while transformer-based protein language models outperform traditional sequence alignment methods, they still struggle in the twilight zone. This suggests that current protein language models have not sufficiently learned protein structure to address remote homology prediction when sequence signals are weak.
Availability and implementation: We believe this opens the way for further research both on remote homology prediction and on the broader goal of learning sequence- and structure-rich representations of protein molecules. All code, data, and models are made publicly available.