skip to main content

Title: qFit 3: Protein and ligand multiconformer modeling for X‐ray crystallographic and single‐particle cryo‐EM density maps

New X‐ray crystallography and cryo‐electron microscopy (cryo‐EM) approaches yield vast amounts of structural data from dynamic proteins and their complexes. Modeling the full conformational ensemble can provide important biological insights, but identifying and modeling an internally consistent set of alternate conformations remains a formidable challenge. qFit efficiently automates this process by generating a parsimonious multiconformer model. We refactored qFit from a distributed application into software that runs efficiently on a small server, desktop, or laptop. We describe the new qFit 3 software and provide some examples. qFit 3 is open‐source under the MIT license, and is available at

more » « less
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Protein Science
Page Range / eLocation ID:
p. 270-285
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Substantial progresses in protein structure prediction have been made by utilizing deep‐learning and residue‐residue distance prediction since CASP13. Inspired by the advances, we improve our CASP14 MULTICOM protein structure prediction system by incorporating three new components: (a) a new deep learning‐based protein inter‐residue distance predictor to improve template‐free (ab initio) tertiary structure prediction, (b) an enhanced template‐based tertiary structure prediction method, and (c) distance‐based model quality assessment methods empowered by deep learning. In the 2020 CASP14 experiment, MULTICOM predictor was ranked seventh out of 146 predictors in tertiary structure prediction and ranked third out of 136 predictors in inter‐domain structure prediction. The results demonstrate that the template‐free modeling based on deep learning and residue‐residue distance prediction can predict the correct topology for almost all template‐based modeling targets and a majority of hard targets (template‐free targets or targets whose templates cannot be recognized), which is a significant improvement over the CASP13 MULTICOM predictor. Moreover, the template‐free modeling performs better than the template‐based modeling on not only hard targets but also the targets that have homologous templates. The performance of the template‐free modeling largely depends on the accuracy of distance prediction closely related to the quality of multiple sequence alignments. The structural model quality assessment works well on targets for which enough good models can be predicted, but it may perform poorly when only a few good models are predicted for a hard target and the distribution of model quality scores is highly skewed. MULTICOM is available at

    more » « less
  2. Abstract

    We present a critical analysis of physics-informed neural operators (PINOs) to solve partial differential equations (PDEs) that are ubiquitous in the study and modeling of physics phenomena using carefully curated datasets. Further, we provide a benchmarking suite which can be used to evaluate PINOs in solving such problems. We first demonstrate that our methods reproduce the accuracy and performance of other neural operators published elsewhere in the literature to learn the 1D wave equation and the 1D Burgers equation. Thereafter, we apply our PINOs to learn new types of equations, including the 2D Burgers equation in the scalar, inviscid and vector types. Finally, we show that our approach is also applicable to learn the physics of the 2D linear and nonlinear shallow water equations, which involve three coupled PDEs. We release our artificial intelligence surrogates and scientific software to produce initial data and boundary conditions to study a broad range of physically motivated scenarios. We provide thesource code, an interactivewebsiteto visualize the predictions of our PINOs, and a tutorial for their use at theData and Learning Hub for Science.

    more » « less
  3. Mathematical models based on systems of ordinary differential equations (ODEs) are frequently applied in various scientific fields to assess hypotheses, estimate key model parameters, and generate predictions about the system's state. To support their application, we present a comprehensive, easy‐to‐use, and flexible MATLAB toolbox,QuantDiffForecast, and associated tutorial to estimate parameters and generate short‐term forecasts with quantified uncertainty from dynamical models based on systems of ODEs. We provide software ( and detailed guidance on estimating parameters and forecasting time‐series trajectories that are characterized using ODEs with quantified uncertainty through a parametric bootstrapping approach. It includes functions that allow the user to infer model parameters and assess forecasting performance for different ODE models specified by the user, using different estimation methods and error structures in the data. The tutorial is intended for a diverse audience, including students training in dynamic systems, and will be broadly applicable to estimate parameters and generate forecasts from models based on ODEs. The functions included in the toolbox are illustrated using epidemic models with varying levels of complexity applied to data from the 1918 influenza pandemic in San Francisco. A tutorial video that demonstrates the functionality of the toolbox is included.

    more » « less
  4. Abstract

    The Molecular Sciences Software Institute's (MolSSI) Quantum Chemistry Archive (QCArchive) project is an umbrella name that covers both a central server hosted by MolSSI for community data and the Python‐based software infrastructure that powers automated computation and storage of quantum chemistry (QC) results. The MolSSI‐hosted central server provides the computational molecular sciences community a location to freely access tens of millions of QC computations for machine learning, methodology assessment, force‐field fitting, and more through a Python interface. Facile, user‐friendly mining of the centrally archived quantum chemical data also can be achieved through web applications found at The software infrastructure can be used as a standalone platform to compute, structure, and distribute hundreds of millions of QC computations for individuals or groups of researchers at any scale. The QCArchiveInfrastructureis open‐source (BSD‐3C), code repositories can be found at, and releases can be downloaded via PyPI and Conda.

    This article is categorized under:

    Electronic Structure Theory > Ab Initio Electronic Structure Methods

    Software > Quantum Chemistry

    Data Science > Computer Algorithms and Programming

    more » « less
  5. Guillot, Gilles (Ed.)

    Diagnostic and prognostic models are increasingly important in medicine and inform many clinical decisions. Recently, machine learning approaches have shown improvement over conventional modeling techniques by better capturing complex interactions between patient covariates in a data-driven manner. However, the use of machine learning introduces technical and practical challenges that have thus far restricted widespread adoption of such techniques in clinical settings. To address these challenges and empower healthcare professionals, we present an open-source machine learning framework, AutoPrognosis 2.0, to facilitate the development of diagnostic and prognostic models. AutoPrognosis leverages state-of-the-art advances in automated machine learning to develop optimized machine learning pipelines, incorporates model explainability tools, and enables deployment of clinical demonstrators,withoutrequiring significant technical expertise. To demonstrate AutoPrognosis 2.0, we provide an illustrative application where we construct a prognostic risk score for diabetes using the UK Biobank, a prospective study of 502,467 individuals. The models produced by our automated framework achieve greater discrimination for diabetes than expert clinical risk scores. We have implemented our risk score as a web-based decision support tool, which can be publicly accessed by patients and clinicians. By open-sourcing our framework as a tool for the community, we aim to provide clinicians and other medical practitioners with an accessible resource to develop new risk scores, personalized diagnostics, and prognostics using machine learning techniques.


    more » « less