NSF PAR Search | NSF Public Access Repository

FastConformation: A Standalone ML-Based Toolkit for Modeling and Analyzing Protein Conformational Ensembles at Scale

https://doi.org/10.1101/2025.05.09.653048

Galeazzi, Flavia Maria; Monteiro da Silva, Gabriel; Arantes, Pablo; Varghese, Iz; Shukla, Ananya; Rubenstein, Brenda M (May 2025, bioRxiv)

Deep learning approaches like AlphaFold 2 (AF2) have revolutionized structural biology by accurately predicting the ground state structures of proteins. Recently, clustering and subsampling techniques that manipulate multiple sequence alignment (MSA) inputs into AlphaFold to generate conformational ensembles of proteins have also been proposed. Although many of these techniques have been made open source, they often require integrating multiple packages and can be challenging for researchers who have a limited programming background to employ. This is especially true when researchers are interested in subsampling to produce predictions of protein conformational ensembles, which require multiple computational steps. This manuscript introduces FastConformation, a Python-based application that integrates MSA generation, structure prediction via AF2, and interactive analysis of protein conformations and their distributions, all in one place. FastConformation is accessible through a user-friendly GUI suitable for non-programmers, allowing users to iteratively refine subsampling parameters based on their analyses to achieve diverse conformational ensembles. Starting from an amino acid sequence, users can make protein conformation predictions and analyze results in just a few hours on their local machines, which is significantly faster than traditional molecular dynamics (MD) simulations. Uniquely, by leveraging the subsampling of MSAs, our tool enables the generation of alternative protein conformations. We demonstrate the utility of FastConformation on proteins including the Abl1 kinase, LAT1 transporter, and CCR5 receptor, showcasing its ability to predict and analyze the protein conformational ensembles and effects of mutations on a variety of proteins. This tool enables a wide range of high-throughput applications in protein biochemistry, drug discovery, and protein engineering.
Free, publicly-accessible full text available May 14, 2026
Compound Mutations in the Abl1 Kinase Cause Inhibitor Resistance by Shifting DFG Flip Mechanisms and Relative State Populations

https://doi.org/10.1101/2024.05.23.595569

Monteiro da Silva, Gabriel; Lam, Kyle; Dalgarno, David C; Rubenstein, Brenda M (May 2024, bioRxiv)

The intrinsic dynamics of most proteins are central to their function. Protein tyrosine kinases such as Abl1 undergo significant conformational changes that modulate their activity in response to different stimuli. These conformational changes constitute a conserved mechanism for self-regulation that dramatically impacts kinases’ affinities for inhibitors. Few studies have attempted to extensively sample the pathways and elucidate the mechanisms that underlie kinase inactivation. In large part, this is a consequence of the steep energy barriers associated with many kinase conformational changes, which present a significant obstacle for computational studies using traditional simulation methods. Seeking to bridge this knowledge gap, we present a thorough analysis of the “DFG flip” inactivation pathway in Abl1 kinase. By leveraging the power of the Weighted Ensemble methodology, which accelerates sampling without the use of biasing forces, we have comprehensively simulated DFG flip events in Abl1 and its inhibitor-resistant variants, revealing a rugged landscape punctuated by potentially druggable intermediate states. Through our strategy, we successfully simulated dozens of uncorrelated DFG flip events distributed along two principal pathways, identified the molecular mechanisms that govern them, and measured their relative probabilities. Further, we show that the compound Glu255Lys/Val Thr315Ile Abl1 variants owe their inhibitor resistance phenotype to an increase in the free energy barrier associated with completing the DFG flip. This barrier stabilizes Abl1 variants in conformations that can lead to loss of binding for Type-II inhibitors such as Imatinib or Ponatinib. Finally, we contrast our Abl1 observations with the relative state distributions and propensity for undergoing a DFG flip of evolutionarily-related protein tyrosine kinases with diverging Type-II inhibitor binding affinities.
Altogether, we expect that our work will be of significant importance for protein tyrosine kinase inhibitor discovery, while also furthering our understanding of how enzymes self-regulate through highly-conserved molecular switches.
Full Text Available
Atomistic descriptor optimization using complementary Euclidean and geodesic distance information

https://doi.org/10.1080/00268976.2024.2381617

Iyer, Gopal R; Rubenstein, Brenda M (February 2025, Molecular Physics)

Descriptors are physically-inspired, symmetry-preserving schemes for representing atomistic systems that play a central role in the construction of models of potential energy surfaces. Although physical intuition can be flexibly encoded into descriptor schemes, they are generally ultimately guided only by the spatial or topological arrangement of atoms in the system. However, since interatomic potential models aim to capture the variation of the potential energy with respect to atomic configurations, it is conceivable that they would benefit from descriptor schemes that implicitly encode both structural and energetic information rather than structural information alone. Therefore, we propose a novel approach for the optimisation of descriptors based on encoding information about geodesic distances along potential energy manifolds into the hyperparameters of commonly used descriptor schemes. To accomplish this, we combine two ideas: (1) a differential-geometric approach for the fast estimation of approximate geodesic distances [Zhu et al., J. Chem. Phys. 150, 164103 (2019)]; and (2) an information-theoretic evaluation metric – information imbalance – for measuring the shared information between two distance measures [Glielmo et al. PNAS Nexus, 1, 1 (2022)]. Using three example molecules – ethanol, malonaldehyde, and aspirin – from the MD22 dataset, we first show that Euclidean (in Cartesian coordinates) and geodesic distances are inequivalent distance measures, indicating the need for updated ground-truth distance measures that go beyond the Euclidean (or, more broadly, spatial) distance. We then utilize a Bayesian optimisation framework to show that descriptors (in this case, atom-centred symmetry functions) can be optimized to maximally express a certain type of distance information, such as Euclidean or geodesic information. 
We also show that modifying the Bayesian optimisation algorithm to minimise a combined objective function – the sum of the descriptor↔Euclidean and descriptor↔geodesic information imbalances – can yield descriptors that not only optimally express both Euclidean and geodesic distance information simultaneously, but in fact resolve substantial disagreements between descriptors optimized to encode only one type of distance measure. We discuss the relevance of our approach to the design of more physically rich and informative descriptors that can encode useful, alternative information about molecular systems.
Free, publicly-accessible full text available February 16, 2026
Gaussian processes for finite size extrapolation of many-body simulations

https://doi.org/10.1039/D4FD00051J

Landinez Borda, Edgar Josué; Berard, Kenneth O; Lopez, Annette; Rubenstein, Brenda (November 2024, Faraday Discussions)

We employ Gaussian processes to more accurately and efficiently extrapolate many-body simulations to their thermodynamic limit.
Full Text Available
Disentangling the physics of the attractive Hubbard model as a fully interacting model of fermions via the accessible and symmetry-resolved entanglement entropies

https://doi.org/10.1103/PhysRevB.109.195119

Shen, Tong; Barghathi, Hatem; Del Maestro, Adrian; Rubenstein, Brenda M (May 2024, Physical Review B)
VMC optimization of ultra-compact, explicitly-correlated wave functions of the Li isoelectronic sequence in its lowest 1s2s2p quartet state

https://doi.org/10.1016/j.cplett.2024.141091

Nader, DJ; Rubenstein, BM (March 2024, Chemical Physics Letters)

Full Text Available
Stable recursive auxiliary field quantum Monte Carlo algorithm in the canonical ensemble: Applications to thermometry and the Hubbard model

https://doi.org/10.1103/PhysRevE.107.055302

Shen, Tong; Barghathi, Hatem; Yu, Jiangyong; Del Maestro, Adrian; Rubenstein, Brenda M. (May 2023, Physical Review E)

Full Text Available
Finite temperature auxiliary field quantum Monte Carlo in the canonical ensemble

https://doi.org/10.1063/5.0026606

Shen, Tong; Liu, Yuan; Yu, Yang; Rubenstein, Brenda M. (November 2020, The Journal of Chemical Physics)

Full Text Available
Unveiling the Finite Temperature Physics of Hydrogen Chains via Auxiliary Field Quantum Monte Carlo

https://doi.org/10.1021/acs.jctc.0c00288

Liu, Yuan; Shen, Tong; Zhang, Hang; Rubenstein, Brenda (July 2020, Journal of Chemical Theory and Computation)

Full Text Available
