Stability Oracle: a structure-based graph-transformer framework for identifying stabilizing mutations

Diaz, Daniel J; Gong, Chengyue; Ouyang-Zhang, Jeffrey; Loy, James M; Wells, Jordan; Yang, David; Ellington, Andrew D; Dimakis, Alexandros G; Klivans, Adam R

doi:10.1038/s41467-024-49780-2

Citation Details

This content will become publicly available on December 1, 2025

Stability Oracle: a structure-based graph-transformer framework for identifying stabilizing mutations

Abstract Engineering stabilized proteins is a fundamental challenge in the development of industrial and pharmaceutical biotechnologies. We present Stability Oracle: a structure-based graph-transformer framework that achieves SOTA performance on accurately identifying thermodynamically stabilizing mutations. Our framework introduces several innovations to overcome well-known challenges in data scarcity and bias, generalization, and computation time, such as: Thermodynamic Permutations for data augmentation, structural amino acid embeddings to model a mutation with a single structure, a protein structure-specific attention-bias mechanism that makes transformers a viable alternative to graph neural networks. We provide training/test splits that mitigate data leakage and ensure proper model evaluation. Furthermore, to examine our data engineering contributions, we fine-tune ESM2 representations (Prostata-IFML) and achieve SOTA for sequence-based models. Notably, Stability Oracle outperforms Prostata-IFML even though it was pretrained on 2000X less proteins and has 548X less parameters. Our framework establishes a path for fine-tuning structure-based transformers to virtually any phenotype, a necessary task for accelerating the development of protein-based biotechnologies. more »

Award ID(s):: 2505865 2019844

PAR ID:: 10631108

Author(s) / Creator(s):: Diaz, Daniel J; Gong, Chengyue; Ouyang-Zhang, Jeffrey; Loy, James M; Wells, Jordan; Yang, David; Ellington, Andrew D; Dimakis, Alexandros G; Klivans, Adam R

Publisher / Repository:: Nature Communications

Date Published:: 2024-12-01

Journal Name:: Nature Communications

Volume:: 15

Issue:: 1

ISSN:: 2041-1723

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
This content will become publicly available on December 1, 2025
Journal Article:
https://doi.org/10.1038/s41467-024-49780-2

More Like this