NSF PAR Search | NSF Public Access Repository

MULocDeep web service for protein localization prediction and visualization at subcellular and suborganellar levels

https://doi.org/10.1093/nar/gkad374

Jiang, Yuexu; Jiang, Lei; Akhil, Chopparapu Sai; Wang, Duolin; Zhang, Ziyang; Zhang, Weinan; Xu, Dong (May 2023, Nucleic Acids Research)

Predicting protein localization and understanding its mechanisms are critical in biology and pathology. In this context, we propose a new web application of MULocDeep with improved performance, result interpretation, and visualization. By transferring the original model into species-specific models, MULocDeep achieved competitive prediction performance at the subcellular level against other state-of-the-art methods. It uniquely provides a comprehensive localization prediction at the suborganellar level. Besides prediction, our web service quantifies the contribution of single amino acids to localization for individual proteins; for a group of proteins, common motifs or potential targeting-related regions can be derived. Furthermore, the visualizations of targeting mechanism analyses can be downloaded for publication-ready figures. The MULocDeep web service is available at https://www.mu-loc.org/.
Parameter-efficient fine-tuning on large protein language models improves signal peptide prediction

https://doi.org/10.1101/gr.279132.124

Zeng, Shuai; Wang, Duolin; Jiang, Lei; Xu, Dong (September 2024, Genome Research)

Signal peptides (SPs) play a crucial role in protein translocation in cells. The development of large protein language models (PLMs) and prompt-based learning provides a new opportunity for SP prediction, especially for categories with limited annotated data. We present a parameter-efficient fine-tuning (PEFT) framework for SP prediction, PEFT-SP, to effectively utilize pretrained PLMs. We integrated low-rank adaptation (LoRA) into ESM-2 models to better leverage the protein sequence evolutionary knowledge of PLMs. Experiments show that PEFT-SP using LoRA enhances state-of-the-art results, leading to a maximum Matthews correlation coefficient (MCC) gain of 87.3% for SPs with small training samples and an overall MCC gain of 6.1%. We also employed two other PEFT methods, prompt tuning and adapter tuning, in ESM-2 for SP prediction. More elaborate experiments show that PEFT-SP using adapter tuning can also improve the state-of-the-art results, with an MCC gain of up to 28.1% for SPs with small training samples and an overall MCC gain of 3.8%. LoRA requires fewer computing resources and less memory than adapter tuning during the training stage, making it possible to adapt larger and more powerful protein models for SP prediction.
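For readers unfamiliar with low-rank adaptation, the general idea can be sketched as follows. This is a minimal illustration of the standard LoRA update, W + (alpha / r) * B A, and not the PEFT-SP implementation; the matrix sizes, rank, scaling factor, and function names are all assumptions chosen for the example.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank."""
    r = len(a)  # A has shape (r, d_in); B has shape (d_out, r)
    scale = alpha / r
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Illustrative sizes only (assumptions, not values from the paper).
d_out, d_in, rank, alpha = 6, 8, 2, 4.0
rng = random.Random(0)

# Frozen pretrained weight: never updated during fine-tuning.
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable low-rank factors: A starts small and random, B starts at zero,
# so the adapted layer initially behaves exactly like the pretrained one.
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]

full_params = d_out * d_in            # parameters updated by full fine-tuning
lora_params = rank * (d_in + d_out)   # parameters updated by LoRA
w_eff = lora_effective_weight(w, a, b, alpha)

print(f"full fine-tuning: {full_params} trainable parameters")
print(f"LoRA (rank {rank}): {lora_params} trainable parameters")
```

Only the two small factors are trained while the pretrained weight stays frozen. The saving grows with layer size, because the trainable count scales with r * (d_in + d_out) rather than d_in * d_out.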
Full Text Available
Prompt-Based Learning on Large Protein Language Models Improves Signal Peptide Prediction

Zeng, Shuai; Wang, Duolin; Jiang, Lei; Xu, Dong (May 2024, Springer Nature Switzerland)

Full Text Available
