Title: Archiving and disseminating integrative structure models
Limitations in the applicability, accuracy, and precision of individual structure characterization methods can sometimes be overcome via an integrative modeling approach that relies on information from all available sources, including all available experimental data and prior models. The open-source Integrative Modeling Platform (IMP) is one piece of software that implements all computational aspects of integrative modeling. To maximize the impact of integrative structures, the coordinates should be made publicly available, as is already the case for structures based on X-ray crystallography, NMR spectroscopy, and electron microscopy. Moreover, the associated experimental data and modeling protocols should also be archived, such that the original results can easily be reproduced. Finally, it is essential that the integrative structures are validated as part of their publication and deposition. A number of research groups have already developed software to implement integrative modeling and have generated a number of structures, prompting the formation of an Integrative/Hybrid Methods Task Force. Following the recommendations of this task force, the existing PDBx/mmCIF data representation used for atomic PDB structures has been extended to address the requirements for archiving integrative structural models. This IHM-dictionary adds a flexible model representation, including coarse graining, models in multiple states and/or related by time or other order, and multiple input experimental information sources. A prototype archiving system called PDB-Dev (https://pdb-dev.wwpdb.org) has also been created to archive integrative structural models, together with a Python library to facilitate handling of integrative models in PDBx/mmCIF format.
Award ID(s):
1756248
PAR ID:
10108385
Journal Name:
Journal of Biomolecular NMR
ISSN:
0925-2738
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Structures of many complex biological assemblies are increasingly determined using integrative approaches, in which data from multiple experimental methods are combined. A standalone system, called PDB-Dev, has been developed for archiving integrative structures and making them publicly available. Here, the data standards and software tools that support PDB-Dev are described along with the new and updated components of the PDB-Dev data-collection, processing and archiving infrastructure. Following the FAIR (Findable, Accessible, Interoperable and Reusable) principles, PDB-Dev ensures that the results of integrative structure determinations are freely accessible to everyone. 
  2. FRET experiments can provide state-specific structural information of complex dynamic biomolecular assemblies. However, to overcome the sparsity of FRET experiments, they need to be combined with computer simulations. We introduce a program suite with (i) an automated design tool for FRET experiments, which determines how many and which FRET pairs should be used to minimize the uncertainty and maximize the accuracy of an integrative structure, (ii) an efficient approach for FRET-assisted coarse-grained structural modeling, and all-atom molecular dynamics simulations-based refinement, and (iii) a quantitative quality estimate for judging the accuracy of FRET-derived structures as opposed to precision. We benchmark our tools against simulated and experimental data of proteins with multiple conformational states and demonstrate an accuracy of ~3 Å RMSD Cα against X-ray structures for sets of 15 to 23 FRET pairs. Free and open-source software for the introduced workflow is available at https://github.com/Fluorescence-Tools. A web server for FRET-assisted structural modeling of proteins is available at http://nmsim.de.
  3. Single-molecule FRET (smFRET) has become a mainstream technique for studying biomolecular structural dynamics. The rapid and wide adoption of smFRET experiments by an ever-increasing number of groups has generated significant progress in sample preparation, measurement procedures, data analysis, algorithms and documentation. Several labs that employ smFRET approaches have joined forces to inform the smFRET community about streamlining how to perform experiments and analyze results for obtaining quantitative information on biomolecular structure and dynamics. The recent efforts include blind tests to assess the accuracy and the precision of smFRET experiments among different labs using various procedures. These multi-lab studies have led to the development of smFRET procedures and documentation, which are important when submitting entries into the archiving system for integrative structure models, PDB-Dev. This position paper describes the current ‘state of the art’ from different perspectives, points to unresolved methodological issues for quantitative structural studies, provides a set of ‘soft recommendations’ about which an emerging consensus exists, and lists openly available resources for newcomers and seasoned practitioners. To make further progress, we strongly encourage ‘open science’ practices.
  4. Since 1971, the Protein Data Bank (PDB) has served as the single global archive for experimentally determined 3D structures of biological macromolecules made freely available to the global community according to the FAIR principles of Findability–Accessibility–Interoperability–Reusability. During the first 50 years of continuous PDB operations, standards for data representation have evolved to better represent rich and complex biological phenomena. Carbohydrate molecules present in more than 14,000 PDB structures have recently been reviewed and remediated to conform to a new standardized format. This machine-readable data representation for carbohydrates occurring in the PDB structures and the corresponding reference data improves the findability, accessibility, interoperability and reusability of structural information pertaining to these molecules. The PDB Exchange MacroMolecular Crystallographic Information File data dictionary now supports (i) standardized atom nomenclature that conforms to International Union of Pure and Applied Chemistry-International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) recommendations for carbohydrates, (ii) uniform representation of branched entities for oligosaccharides, (iii) commonly used linear descriptors of carbohydrates developed by the glycoscience community and (iv) annotation of glycosylation sites in proteins. For the first time, carbohydrates in PDB structures are consistently represented as collections of standardized monosaccharides, which precisely describe oligosaccharide structures and enable improved carbohydrate visualization, structure validation, robust quantitative and qualitative analyses, search for dendritic structures and classification. The uniform representation of carbohydrate molecules in the PDB described herein will facilitate broader usage of the resource by the glycoscience community and researchers studying glycoproteins.
  5. Recent advances in Artificial Intelligence and Machine Learning (e.g., AlphaFold, RosettaFold, and ESMFold) enable prediction of three-dimensional (3D) protein structures from amino acid sequences alone at accuracies comparable to lower-resolution experimental methods. These tools have been employed to predict structures across entire proteomes and the results of large-scale metagenomic sequence studies, yielding an exponential increase in available biomolecular 3D structural information. Given the enormous volume of this newly computed biostructure data, there is an urgent need for robust tools to manage, search, cluster, and visualize large collections of structures. Equally important is the capability to efficiently summarize and visualize metadata, biological/biochemical annotations, and structural features, particularly when working with vast numbers of protein structures of both experimental origin from the Protein Data Bank (PDB) and computationally-predicted models. Moreover, researchers require advanced visualization techniques that support interactive exploration of multiple sequences and structural alignments. This paper introduces a suite of tools provided on the RCSB PDB research-focused web portal RCSB.org, tailor-made for efficient management, search, organization, and visualization of this burgeoning corpus of 3D macromolecular structure data.