-
Abstract Deep learning approaches like AlphaFold 2 (AF2) have revolutionized structural biology by accurately predicting the ground state structures of proteins. Recently, clustering and subsampling techniques that manipulate the multiple sequence alignment (MSA) inputs to AlphaFold to generate conformational ensembles of proteins have also been proposed. Although many of these techniques have been made open source, they often require integrating multiple packages and can be challenging for researchers with a limited programming background to employ. This is especially true when researchers are interested in subsampling to produce predictions of protein conformational ensembles, which requires multiple computational steps. This manuscript introduces FastConformation, a Python-based application that integrates MSA generation, structure prediction via AF2, and interactive analysis of protein conformations and their distributions, all in one place. FastConformation is accessible through a user-friendly GUI suitable for non-programmers, allowing users to iteratively refine subsampling parameters based on their analyses to achieve diverse conformational ensembles. Starting from an amino acid sequence, users can make protein conformation predictions and analyze results in just a few hours on their local machines, which is significantly faster than traditional molecular dynamics (MD) simulations. Uniquely, by leveraging the subsampling of MSAs, our tool enables the generation of alternative protein conformations. We demonstrate the utility of FastConformation on proteins including the Abl1 kinase, LAT1 transporter, and CCR5 receptor, showcasing its ability to predict and analyze conformational ensembles and the effects of mutations across a variety of proteins. This tool enables a wide range of high-throughput applications in protein biochemistry, drug discovery, and protein engineering.
Free, publicly-accessible full text available May 14, 2026
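The MSA subsampling idea described in this abstract can be illustrated with a minimal sketch. This is not FastConformation's code or API: the function name `subsample_msa`, its parameters, and the toy sequences are all hypothetical. The sketch only shows the general idea of keeping the query sequence and drawing a random, shallower subset of the remaining alignment rows for each prediction run.

```python
import random


def subsample_msa(msa, max_seq, seed):
    """Return a shallower MSA: the query row plus a random subset of the rest.

    msa     -- list of aligned sequences; msa[0] is assumed to be the query
    max_seq -- maximum number of rows to keep, including the query
    seed    -- seed for the random draw, so each run is reproducible
    """
    if max_seq < 1:
        raise ValueError("max_seq must be at least 1")
    query, others = msa[0], msa[1:]
    rng = random.Random(seed)
    n_keep = min(max_seq - 1, len(others))
    # Sample without replacement so no homolog appears twice.
    return [query] + rng.sample(others, n_keep)


# Toy alignment (hypothetical sequences, for illustration only).
toy_msa = ["MKTAYIAK", "MKTAYLAK", "MRTAYIAK", "MKSAYIAK", "MKTAFIAK", "MKTAYIAR"]

# One shallow MSA per seed; each would feed a separate prediction run.
subsamples = [subsample_msa(toy_msa, max_seq=3, seed=s) for s in range(4)]
for rows in subsamples:
    print(rows)
```

Each shallow alignment would then be passed to a separate structure prediction, and the resulting models pooled into an ensemble for analysis.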
-
Abstract The intrinsic dynamics of most proteins are central to their function. Protein tyrosine kinases such as Abl1 undergo significant conformational changes that modulate their activity in response to different stimuli. These conformational changes constitute a conserved mechanism for self-regulation that dramatically impacts kinases’ affinities for inhibitors. Few studies have attempted to extensively sample the pathways and elucidate the mechanisms that underlie kinase inactivation. In large part, this is a consequence of the steep energy barriers associated with many kinase conformational changes, which present a significant obstacle for computational studies using traditional simulation methods. Seeking to bridge this knowledge gap, we present a thorough analysis of the “DFG flip” inactivation pathway in Abl1 kinase. By leveraging the power of the Weighted Ensemble methodology, which accelerates sampling without the use of biasing forces, we have comprehensively simulated DFG flip events in Abl1 and its inhibitor-resistant variants, revealing a rugged landscape punctuated by potentially druggable intermediate states. Through our strategy, we successfully simulated dozens of uncorrelated DFG flip events distributed along two principal pathways, identified the molecular mechanisms that govern them, and measured their relative probabilities. Further, we show that the compound Glu255Lys/Val Thr315Ile Abl1 variants owe their inhibitor resistance phenotype to an increase in the free energy barrier associated with completing the DFG flip. This barrier stabilizes Abl1 variants in conformations that can lead to loss of binding for Type-II inhibitors such as Imatinib or Ponatinib. Finally, we contrast our Abl1 observations with the relative state distributions and propensity for undergoing a DFG flip of evolutionarily-related protein tyrosine kinases with diverging Type-II inhibitor binding affinities. 
Altogether, we expect that our work will be of significant importance for protein tyrosine kinase inhibitor discovery, while also furthering our understanding of how enzymes self-regulate through highly-conserved molecular switches.
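The abstract notes that Weighted Ensemble accelerates sampling without biasing forces. A toy sketch of the split/merge resampling step commonly used in weighted ensemble schemes is given here, under stated assumptions: it is not the authors' simulation setup, the function `resample_bin` and the string state labels are hypothetical, and walker weights are assumed to be strictly positive. It shows how the number of walkers in one bin can be adjusted while the total probability stays unchanged.

```python
import random


def resample_bin(walkers, target, rng):
    """Split or merge walkers in one bin until exactly `target` remain.

    walkers -- list of (weight, state) pairs with positive weights
    target  -- desired number of walkers in the bin (at least 1)
    rng     -- random.Random instance used to pick merge survivors
    """
    if not walkers or target < 1:
        raise ValueError("need at least one walker and target >= 1")
    walkers = list(walkers)
    # Too few walkers: split the heaviest into two copies of half weight.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[0])
        weight, state = walkers.pop()
        walkers += [(weight / 2.0, state), (weight / 2.0, state)]
    # Too many walkers: merge the two lightest; the survivor is chosen
    # with probability proportional to weight and carries the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[0])
        (wa, sa), (wb, sb) = walkers[0], walkers[1]
        survivor = sa if rng.random() < wa / (wa + wb) else sb
        walkers = walkers[2:] + [(wa + wb, survivor)]
    return walkers


rng = random.Random(0)
bin_walkers = [(0.5, "a"), (0.3, "b"), (0.2, "c")]
grown = resample_bin(bin_walkers, target=4, rng=rng)
shrunk = resample_bin(bin_walkers, target=2, rng=rng)
print(grown)
print(shrunk)
```

Because splitting halves a weight and merging sums two weights, the probability carried by the bin is the same before and after resampling; only the allocation of computing effort changes.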
-
Descriptors are physically inspired, symmetry-preserving schemes for representing atomistic systems that play a central role in the construction of models of potential energy surfaces. Although physical intuition can be flexibly encoded into descriptor schemes, they are ultimately guided only by the spatial or topological arrangement of atoms in the system. However, since interatomic potential models aim to capture the variation of the potential energy with respect to atomic configurations, it is conceivable that they would benefit from descriptor schemes that implicitly encode both structural and energetic information rather than structural information alone. Therefore, we propose a novel approach for the optimisation of descriptors based on encoding information about geodesic distances along potential energy manifolds into the hyperparameters of commonly used descriptor schemes. To accomplish this, we combine two ideas: (1) a differential-geometric approach for the fast estimation of approximate geodesic distances [Zhu et al., J. Chem. Phys. 150, 164103 (2019)]; and (2) an information-theoretic evaluation metric – information imbalance – for measuring the shared information between two distance measures [Glielmo et al., PNAS Nexus, 1, 1 (2022)]. Using three example molecules – ethanol, malonaldehyde, and aspirin – from the MD22 dataset, we first show that Euclidean (in Cartesian coordinates) and geodesic distances are inequivalent distance measures, indicating the need for updated ground-truth distance measures that go beyond the Euclidean (or, more broadly, spatial) distance. We then utilise a Bayesian optimisation framework to show that descriptors (in this case, atom-centred symmetry functions) can be optimised to maximally express a certain type of distance information, such as Euclidean or geodesic information.
We also show that modifying the Bayesian optimisation algorithm to minimise a combined objective function – the sum of the descriptor↔Euclidean and descriptor↔geodesic information imbalances – can yield descriptors that not only optimally express both Euclidean and geodesic distance information simultaneously, but in fact resolve substantial disagreements between descriptors optimised to encode only one type of distance measure. We discuss the relevance of our approach to the design of more physically rich and informative descriptors that can encode useful, alternative information about molecular systems.
Free, publicly-accessible full text available February 16, 2026
