FastConformation: A Standalone ML-Based Toolkit for Modeling and Analyzing Protein Conformational Ensembles at Scale

Galeazzi, Flavia Maria; Monteiro_da_Silva, Gabriel; Arantes, Pablo; Varghese, Iz; Shukla, Ananya; Rubenstein, Brenda M

doi:10.1101/2025.05.09.653048

Abstract Deep learning approaches like AlphaFold 2 (AF2) have revolutionized structural biology by accurately predicting the ground state structures of proteins. Recently, clustering and subsampling techniques that manipulate multiple sequence alignment (MSA) inputs into AlphaFold to generate conformational ensembles of proteins have also been proposed. Although many of these techniques have been made open source, they often require integrating multiple packages and can be challenging for researchers who have a limited programming background to employ. This is especially true when researchers are interested in subsampling to produce predictions of protein conformational ensembles, which require multiple computational steps. This manuscript introduces FastConformation, a Python-based application that integrates MSA generation, structure prediction via AF2, and interactive analysis of protein conformations and their distributions, all in one place. FastConformation is accessible through a user-friendly GUI suitable for non-programmers, allowing users to iteratively refine subsampling parameters based on their analyses to achieve diverse conformational ensembles. Starting from an amino acid sequence, users can make protein conformation predictions and analyze results in just a few hours on their local machines, which is significantly faster than traditional molecular dynamics (MD) simulations. Uniquely, by leveraging the subsampling of MSAs, our tool enables the generation of alternative protein conformations. We demonstrate the utility of FastConformation on proteins including the Abl1 kinase, LAT1 transporter, and CCR5 receptor, showcasing its ability to predict and analyze the protein conformational ensembles and effects of mutations on a variety of proteins. This tool enables a wide range of high-throughput applications in protein biochemistry, drug discovery, and protein engineering.

More Like this