-
ABSTRACT Circulating antibodies in patients with cancer can facilitate the identification of accessible epitopes on autoantigens expressed by tumors. To identify previously unrecognized protein targets in ovarian cancer, we computationally assessed a heptapeptide consensus motif (VPELGHE, flanked by two cysteine residues yielding a cyclic nonapeptide under oxidizing conditions) previously discovered via phage display‐based epitope mapping of autoantibodies in patients. Eight proteins associated with ovarian cancer encompass amino acid sequences similar to the consensus motif and were, therefore, considered as candidate native autoantigens. Among these candidate targets, however, matrix metalloproteinase 14 (MMP14) demonstrates gene expression that is both high and negatively correlated with survival in ovarian cancer patient cohorts. MMP14 protein levels are also stable in tumor versus non‐tumor tissues. Moreover, the corresponding heptapeptide mimic in MMP14 occurs within an α‐helical secondary structural element observed in its catalytic domain. These findings demonstrate that a subset of patient‐derived autoantibodies may interact with a previously unknown antigenic epitope found in MMP14 and other MMPs, thereby providing opportunities for the development of new targeted agents.
-
Abstract Data visualization is a pivotal component of a structural biologist’s arsenal. The Mol* Viewer makes molecular visualizations available to broader audiences via most web browsers. While Mol* provides a wide range of functionality, it has a steep learning curve and is only available via a JavaScript interface. To enhance the accessibility and usability of web-based molecular visualization, we introduce MolViewSpec (molstar.org/mol-view-spec), a standardized approach for defining molecular visualizations that decouples the definition of complex molecular scenes from their rendering. Scene definition can include references to commonly used structural, volumetric, and annotation data formats together with a description of how the data should be visualized and paired with optional annotations specifying colors, labels, measurements, and custom 3D geometries. Developed as an open standard, this solution paves the way for broader interoperability and support across different programming languages and molecular viewers, enabling more streamlined, standardized, and reproducible visual molecular analyses. MolViewSpec is freely available as a Mol* extension and a standalone Python package.
-
Abstract BindingDB (bindingdb.org) is a public, web-accessible database of experimentally measured binding affinities between small molecules and proteins, which supports diverse applications including medicinal chemistry, biochemical pathway annotation, training of artificial intelligence models and computational chemistry methods development. This update reports significant growth and enhancements since our last review in 2016. Of note, the database now contains 2.9 million binding measurements spanning 1.3 million compounds and thousands of protein targets. This growth is largely attributable to our unique focus on curating data from US patents, which has yielded a substantial influx of novel binding data. Recent improvements include a remake of the website following responsive web design principles, enhanced search and filtering capabilities, new data download options and webservices and establishment of a long-term data archive replicated across dispersed sites. We also discuss BindingDB’s positioning relative to related resources, its open data sharing policies, insights gleaned from the dataset and plans for future growth and development.
-
Abstract This review article describes the co-evolution of structural biology as a discipline and the Protein Data Bank (PDB), established in 1971 as the first open-access data resource in biology by like-minded structural scientists. As the PDB archive grew in size and scope to encompass macromolecular crystallography, NMR spectroscopy, and cryo-electron microscopy, new technologies were developed to ingest, validate, curate, store, and distribute the information. Community engagement ensured that the needs of structural biologists (data depositors) and data consumers were met. Today, the archive houses more than 230,000 experimentally determined structures of proteins, nucleic acids, and macromolecular machines and their complexes with one another and small-molecule ligands. Aggregate costs of PDB data preservation are ~1% of the cost of structure determination. The enormous impact of PDB data on basic and applied research and education across the natural and medical sciences is presented and highlighted with illustrative examples. Enablement of de novo protein structure prediction (AlphaFold2, RoseTTAFold, OpenFold, etc.) is the most widely appreciated benefit of having a corpus of rigorously validated, expertly curated 3D biostructure data.
-
Abstract The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, RCSB.org), the US Worldwide Protein Data Bank (wwPDB, wwPDB.org) data center for the global PDB archive, provides access to the PDB data via its RCSB.org research-focused web portal. We report substantial additions to the tools and visualization features available at RCSB.org, which now delivers more than 227000 experimentally determined atomic-level three-dimensional (3D) biostructures stored in the global PDB archive alongside more than 1 million Computed Structure Models (CSMs) of proteins (including models for human, model organisms, select human pathogens, crop plants and organisms important for addressing climate change). In addition to providing support for 3D structure motif searches with user-provided coordinates, new features highlighted herein include query results organized by redundancy-reduced Groups and summary pages that facilitate exploration of groups of similar proteins. Newly released programmatic tools are also described, as are enhanced training opportunities.
-
Abstract The easiest and often most useful way to work with experimentally determined or computationally predicted structures of biomolecules is by viewing their three‐dimensional (3D) shapes using a molecular visualization tool. Mol* was collaboratively developed by RCSB Protein Data Bank (RCSB PDB, RCSB.org) and Protein Data Bank in Europe (PDBe, PDBe.org) as an open‐source, web‐based, 3D visualization software suite for examination and analyses of biostructures. It is capable of displaying atomic coordinates and related experimental data of biomolecular structures together with a variety of annotations, facilitating basic and applied research, training, education, and information dissemination. Across RCSB.org, the RCSB PDB research‐focused web portal, Mol* has been implemented to support single‐mouse‐click atomic‐level visualization of biomolecules (e.g., proteins, nucleic acids, carbohydrates) with bound cofactors, small‐molecule ligands, ions, water molecules, or other macromolecules. RCSB.org Mol* can seamlessly display 3D structures from various sources, allowing structure interrogation, superimposition, and comparison. Using influenza A H5N1 virus as a topical case study of an important pathogen, we exemplify how Mol* has been embedded within various RCSB.org tools, allowing users to view polymer sequence and structure‐based annotations integrated from trusted bioinformatics data resources, assess patterns and trends in groups of structures, and view structures of any size and compositional complexity. In addition to being linked to every experimentally determined biostructure and Computed Structure Model made available at RCSB.org, Standalone Mol* is freely available for visualizing any atomic‐level or multi‐scale biostructure at rcsb.org/3d-view.
-
Structures of many large biomolecular assemblies are now being determined using integrative approaches. In these approaches, information derived from multiple experimental and computational methods is combined to compute three-dimensional structures of multi-protein complexes and other macromolecular machines. A standalone prototype data resource for integrative structures called PDB-Dev was built, based on recommendations of the Integrative and Hybrid Methods (IHM) Task Force of the Worldwide Protein Data Bank (wwPDB). This effort included developing data standards and software tools for collecting, curating, validating, visualizing, archiving, and disseminating integrative structures that span diverse spatiotemporal scales and conformational states. Mechanisms have been created to validate integrative structures based on the experimental data underpinning them. Building upon this foundational framework, PDB-Dev has been further expanded to handle large dynamic macromolecular systems and integrative structures that combine, for example, experimental restraints with atomic coordinates computed by machine learning algorithms. Data standards and supporting tools have also been extended to capture information about biomolecular dynamics, such as conformational transitions and related kinetic data derived from biophysical methods. Recently, PDB-Dev was unified with the PDB archive and rebranded as PDB-IHM (pdb-ihm.org), further promoting FAIR (Findable, Accessible, Interoperable, and Reusable) principles of data stewardship for integrative structural biology.
-
Atomic coordinate models are important for the interpretation of 3D maps produced with cryoEM and cryoET (3D electron microscopy; 3DEM). In addition to visual inspection of such maps and models, quantitative metrics can inform about the reliability of the atomic coordinates, in particular how well the model is supported by the experimentally determined 3DEM map. A recently introduced metric, Q-score, was shown to correlate well with the reported resolution of the map for well fitted models. Here, we present new statistical analyses of Q-score based on its application to ∼10 000 maps and models archived in the EMDB (Electron Microscopy Data Bank) and PDB (Protein Data Bank). Further, we introduce two new metrics based on Q-score to represent each map and model relative to all entries in the EMDB and those with similar resolution. We explore through illustrative examples of proteins, nucleic acids and small molecules how Q-scores can indicate whether the atomic coordinates are well fitted to 3DEM maps and also whether some parts of a map may be poorly resolved due to factors such as molecular flexibility, radiation damage and/or conformational heterogeneity. These examples and statistical analyses provide a basis for how Q-scores can be interpreted effectively in order to evaluate 3DEM maps and atomic coordinate models prior to publication and archiving.
-
The 2024 Nobel Prize in Chemistry was awarded in part for de novo protein structure prediction using AlphaFold2, an artificial intelligence/machine learning (AI/ML) model trained on vast amounts of sequence and three-dimensional structure data. AlphaFold2 and related models, including RoseTTAFold and ESMFold, employ specialized neural network architectures driven by attention mechanisms to infer relationships between sequence and structure. At a fundamental level, these AI/ML models operate on the long-standing hypothesis that the structure of a protein is determined by its amino acid sequence. More recently, AlphaFold2 has been adapted for the prediction of multiple protein conformations by subsampling multiple sequence alignments. Herein, we provide an overview of the deterministic relationship between sequence and structure, which was hypothesized over half a century ago with profound implications for the biological sciences ever since. We postulate that protein conformational dynamics are also determined, at least in part, by amino acid sequence and that this relationship may be leveraged for construction of AI/ML models dedicated to predicting protein conformational ensembles. Accordingly, we describe a conceptual model architecture, which may be trained on sequence data in combination with conformationally sensitive structural information, coming primarily from nuclear magnetic resonance (NMR) spectroscopy. Notwithstanding certain limitations in this context, NMR offers abundant structural heterogeneity conducive to conformational ensemble prediction. As NMR and other data continue to accumulate, sequence-informed prediction of protein structural dynamics with AI/ML has the potential to emerge as a transformative capability across the biological sciences.
