EquiPNAS: improved protein–nucleic acid binding site prediction using protein-language-model-informed equivariant deep graph neural networks

Roche, Rahmatullah; Moussad, Bernard; Shuvo, Md Hossain; Tarafder, Sumit; Bhattacharya, Debswapna (ORCID:0000000296300141)

doi:10.1093/nar/gkae039

Citation Details

Abstract

Protein language models (pLMs) trained on a large corpus of protein sequences have shown unprecedented scalability and broad generalizability in a wide range of predictive modeling tasks, but their power has not yet been harnessed for predicting protein–nucleic acid binding sites, critical for characterizing the interactions between proteins and nucleic acids. Here, we present EquiPNAS, a new pLM-informed E(3) equivariant deep graph neural network framework for improved protein–nucleic acid binding site prediction. By combining the strengths of pLM and symmetry-aware deep graph learning, EquiPNAS consistently outperforms the state-of-the-art methods for both protein–DNA and protein–RNA binding site prediction on multiple datasets across a diverse set of predictive modeling scenarios ranging from using experimental input to AlphaFold2 predictions. Our ablation study reveals that the pLM embeddings used in EquiPNAS are sufficiently powerful to dramatically reduce the dependence on the availability of evolutionary information without compromising on accuracy, and that the symmetry-aware nature of the E(3) equivariant graph-based neural architecture offers remarkable robustness and performance resilience. EquiPNAS is freely available at https://github.com/Bhattacharya-Lab/EquiPNAS.

NSF-PAR ID: 10487953

Author(s) / Creator(s): Roche, Rahmatullah; Moussad, Bernard; Shuvo, Md Hossain; Tarafder, Sumit; Bhattacharya, Debswapna

Publisher / Repository: Oxford University Press

Date Published: 2024-01-28

Journal Name: Nucleic Acids Research

Volume: 52

Issue: 5

ISSN: 0305-1048

Format(s): Medium: X; Size: p. e27

Size(s): p. e27

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1093/nar/gkae039
