Title: Reconstructing Biological Molecules with Help from Video Gamers
Abstract Foldit is a citizen science video game in which players tackle a variety of complex biochemistry puzzles. Here, we describe a new series of puzzles in which Foldit players improve the accuracy of the public repository of experimental protein structure models, the Protein Data Bank (PDB). Analyzing the results of these puzzles showed that the Foldit players were able to considerably improve the deposited structures and thus, in most cases, improved the output of the automated PDB-REDO refinement pipeline. These improved structures are now being hosted at PDB-REDO. These efforts highlight the continued need for the engagement of the lay population in science.
Award ID(s):
2051305
PAR ID:
10610913
Author(s) / Creator(s):
Publisher / Repository:
bioRxiv
Date Published:
Format(s):
Medium: X
Institution:
bioRxiv
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The Protein Data Bank (PDB) archive is a rich source of information in the form of atomic‐level three‐dimensional (3D) structures of biomolecules experimentally determined using macromolecular crystallography, nuclear magnetic resonance (NMR) spectroscopy, and electron microscopy (3DEM). Originally established in 1971 as a resource for protein crystallographers to freely exchange data, today PDB data drive research and education across scientific disciplines. In 2011, the online portal PDB‐101 was launched to support teachers, students, and the general public in PDB archive exploration (pdb101.rcsb.org). Maintained by the Research Collaboratory for Structural Bioinformatics PDB, PDB‐101 aims to help train the next generation of PDB users and to promote the overall importance of structural biology and protein science to nonexperts. Regularly published features include the highly popular Molecule of the Month series, 3D model activities, molecular animation videos, and educational curricula. Materials are organized into various categories (Health and Disease, Molecules of Life, Biotech and Nanotech, and Structures and Structure Determination) and searchable by keyword. A biennial health focus frames new resource creation and provides topics for annual video challenges for high school students. Web analytics document that PDB‐101 materials relating to fundamental topics (e.g., hemoglobin, catalase) are highly accessed year‐on‐year. In addition, PDB‐101 materials created in response to topical health matters (e.g., Zika, measles, coronavirus) are well received. PDB‐101 shows how learning about the diverse shapes and functions of PDB structures promotes understanding of all aspects of biology, from the central dogma of biology to health and disease to biological energy.
  2. Abstract The Protein Data Bank (PDB) archives 3D structures of macromolecules determined experimentally using various methods. It is jointly managed by the Worldwide Protein Data Bank (wwPDB) consortium. Research Collaboratory for Structural Bioinformatics (RCSB) PDB, the US data center for the PDB, provides streamlined access to >240 000 structures through a variety of research-focused tools on RCSB.org. In addition, RCSB.org makes available over 1 million computed structure models (CSMs) predicted using deep learning methods and archived in the AlphaFold Database and ModelArchive. The PDB-IHM system was developed as a wwPDB project based on community recommendations to archive structures determined using integrative/hybrid methods (IHM). These structures are computed by combining information from multiple experimental and computational techniques to overcome the limitations of traditional single methods (e.g. macromolecular crystallography, 3D electron microscopy, nuclear magnetic resonance spectroscopy). In 2024, PDB-IHM was unified with the PDB to archive integrative structures alongside single-method experimental structures. These integrative structures have been made accessible via the RCSB.org website, facilitating efficient delivery of IHM data to a broad community of PDB users. Herein, we describe the expanded capabilities of RCSB.org that support discovery, analysis, and visualization of integrative structures together with single-method experimental structures and CSMs. 
  3. The goal of this paper is to predict the conformational distributions of ligand binding sites using the AlphaFold2 (AF2) protein structure prediction program with stochastic subsampling of the multiple sequence alignment (MSA). We explored the opening of cryptic ligand binding sites in 16 proteins, where the closed and open conformations define the expected extreme points of the conformational variation. Due to the many structures of these proteins in the Protein Data Bank (PDB), we were able to study whether the distribution of X-ray structures affects the distribution of AF2 models. We have found that AF2 generates both a cluster of open and a cluster of closed models for proteins that have comparable numbers of open and closed structures in the PDB and not too many other conformations. This was observed even with default MSA parameters, thus without further subsampling. In contrast, with the exception of a single protein, AF2 did not yield multiple clusters of conformations for proteins that had imbalanced numbers of open and closed structures in the PDB, or had substantial numbers of other structures. Subsampling improved the results only for a single protein, but very shallow MSAs led to incorrect structures. The ability to generate both open and closed conformations for six out of the 16 proteins agrees with the success rates of similar studies reported in the literature. However, we showed that this partial success is due to AF2 “remembering” the conformational distributions in the PDB and that the approach fails to predict rarely seen conformations.
  4. Abstract For 20 years, Molecule of the Month articles have highlighted the functional stories of 3D structures found in the Protein Data Bank (PDB). The PDB is the primary archive of atomic structures of biological molecules, currently providing open access to more than 150,000 structures studied by researchers around the world. The wealth of knowledge embodied in this resource is remarkable, with structures that allow exploration of nearly any biomolecular topic, including the basic science of genetic mechanisms, mechanisms of photosynthesis and bioenergetics, and central biomedical topics like cancer therapy and the fight against infectious disease. The central motivation behind the Molecule of the Month is to provide a user‐friendly introduction to this rich body of data, charting a path for users to get started with finding and exploring the many available structures. The Molecule of the Month and related materials are updated regularly at the education portal PDB‐101 (http://pdb101.rcsb.org/), offering an ongoing resource for molecular biology educators and students around the world. 
  5. Abstract The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the US National Science Foundation, National Institutes of Health, and Department of Energy, has served structural biologists and Protein Data Bank (PDB) data consumers worldwide since 1999. RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, is the US data center for the global PDB archive housing biomolecular structure data. RCSB PDB is also responsible for the security of PDB data, as the wwPDB‐designated Archive Keeper. Annually, RCSB PDB serves tens of thousands of three‐dimensional (3D) macromolecular structure data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro‐electron diffraction) from all inhabited continents. RCSB PDB makes PDB data available from its research‐focused RCSB.org web portal at no charge and without usage restrictions to millions of PDB data consumers working in every nation and territory worldwide. In addition, RCSB PDB operates an outreach and education PDB101.RCSB.org web portal that was used by more than 800,000 educators, students, and members of the public during calendar year 2020. This invited Tools Issue contribution describes (i) how the archive is growing and evolving as new experimental methods generate ever larger and more complex biomolecular structures; (ii) the importance of data standards and data remediation in effective management of the archive and facile integration with more than 50 external data resources; and (iii) new tools and features for 3D structure analysis and visualization made available during the past year via the RCSB.org web portal.