

Search for: All records

Award ID contains: 2137630


  1. The recent cryoEM resolution revolution has had a tremendous impact on our ability to investigate biomolecular structure and function. However, outstanding questions about the reliability of using a cryoEM-derived molecular model for interpreting experiments and building further hypotheses limit its full impact. Significant amounts of research have been focused on developing metrics to assess cryoEM model quality, yet no consensus exists. This is in part because the meaning of cryoEM model quality is not well defined. In this work, we formalize cryoEM model quality in terms of whether a cryoEM map is better described by a model with localized atomic coordinates or by a lower-resolution model that lacks atomic-level information. This approach emerges from a novel, quantitative definition of image resolution based upon the hierarchical structure of biomolecules, which enables computational selection of the length scale to which a biomolecule is resolved based upon the available evidence embedded in the experimental data. In the context of cryoEM, we develop a machine learning-based implementation of this framework, called hierarchical atomic resolution perception (HARP), for assessing local atomic resolution in a cryoEM map and thus evaluating cryoEM model quality in a theoretically and statistically well-defined manner. Finally, using HARP, we perform a meta-analysis of the cryoEM-derived structures in the Protein Data Bank (PDB) to assess the state of atomic resolution in the field and quantify factors that affect it. 
  2. This repository contains the results of a hierarchical atomic resolution perception (HARP) calculation on each of the cryoEM structures deposited in the Protein Data Bank (PDB) prior to January 1, 2023. Top-level group names are the PDB IDs of the structures. The HDF5 attributes of each group hold metadata extracted from the mmCIF file associated with that entry. The HDF5 datasets within each group are indexed relative to each other (i.e., they have the same length, so a given index refers to the same element in every dataset).
  3. A critical step in data analysis for many different types of experiments is the identification of features with theoretically defined shapes in N-dimensional datasets; examples of this process include finding peaks in multi-dimensional molecular spectra or emitters in fluorescence microscopy images. Identifying such features involves determining if the overall shape of the data is consistent with an expected shape; however, it is generally unclear how to quantitatively make this determination. In practice, many analysis methods employ subjective, heuristic approaches, which complicates the validation of any ensuing results, especially as the amount and dimensionality of the data increase. Here, we present a probabilistic solution to this problem by using Bayes’ rule to calculate the probability that the data have any one of several potential shapes. This probabilistic approach may be used to objectively compare how well different theories describe a dataset, identify changes between datasets and detect features within data using a corollary method called Bayesian Inference-based Template Search; several proof-of-principle examples are provided. Altogether, this mathematical framework serves as an automated ‘engine’ capable of computationally executing analysis decisions currently made by visual inspection across the sciences.
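The idea in item 3, using Bayes' rule to calculate the probability that the data have any one of several potential shapes, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how noise, scale, or offset are treated, so the sketch assumes fixed templates, independent Gaussian noise with a known standard deviation `sigma`, and equal prior probabilities by default. The function names `log_likelihood` and `shape_posteriors` are hypothetical.

```python
import math


def log_likelihood(data, template, sigma):
    """Gaussian log-likelihood of `data` given a fixed template.

    Assumption: independent noise with known standard deviation `sigma`.
    """
    n = len(data)
    sq = sum((d - t) ** 2 for d, t in zip(data, template))
    return -0.5 * sq / sigma**2 - n * math.log(sigma * math.sqrt(2 * math.pi))


def shape_posteriors(data, templates, sigma, priors=None):
    """Posterior probability of each candidate shape via Bayes' rule.

    `templates` maps a shape name to its expected values at each data point.
    `priors` defaults to equal probability for every shape (an assumption).
    """
    names = list(templates)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    # log(prior * likelihood) for each shape
    log_joint = {
        name: math.log(priors[name]) + log_likelihood(data, templates[name], sigma)
        for name in names
    }
    # Subtract the maximum before exponentiating to avoid numerical underflow
    top = max(log_joint.values())
    weights = {name: math.exp(v - top) for name, v in log_joint.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Toy data: a Gaussian peak plus a small deterministic perturbation
x = [i / 10 for i in range(-20, 21)]
peak = [math.exp(-0.5 * xi**2 / 0.25) for xi in x]
flat = [0.0] * len(x)
data = [p + 0.05 * math.sin(7 * i) for i, p in enumerate(peak)]

posteriors = shape_posteriors(data, {"peak": peak, "flat": flat}, sigma=0.1)
print(posteriors)
```

With these toy inputs the "peak" shape receives nearly all of the posterior probability. A fuller treatment would also account for unknown amplitude, offset, and noise level instead of fixing them.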