-
Abstract: The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ∼200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a ‘living data resource.’ Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and the CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.
-
The Protein Data Bank (PDB) holds an extensive amount of information and can be a vital tool when performing background research for biochemical work. To make the information in the PDB more accessible, the RCSB Search API was employed within Jupyter Notebooks to create more customizable and user-friendly tools with Python code. Areas of focus include searches targeting ligands with specific characteristics, searches for FDA-approved drugs, and sequence searches that find entries based on different sequence characteristics. This code has been built into Jupyter Notebook templates that include examples of these searches as well as annotated code that users can customize to run advanced searches on the PDB more efficiently and to download the structure and small-molecule files returned by a search. These notebooks also walk users through different ways to organize or use the results of advanced searches. Future plans include increasing the amount and type of information available from a search, making it easier to visualize and download search results, and expanding the scope of the notebooks to cover more types of searches. This research was supported by NSF-IUSE award number 2142033.
Free, publicly-accessible full text available April 14, 2026
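As a rough illustration of the ligand and sequence searches this abstract describes, the sketch below builds JSON payloads in the general shape the RCSB Search API accepts. It is not the notebooks' own code. The endpoint URL, the attribute name `rcsb_nonpolymer_instance_annotation.comp_id`, the parameter names, and the helper functions `ligand_query`, `sequence_query`, and `run_query` are assumptions made for illustration and should be checked against the current RCSB Search API documentation before use.

```python
import json
import urllib.request

# Assumed endpoint of the RCSB Search API (v2); verify against the official docs.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def ligand_query(comp_id):
    """Build a text-search payload for entries containing a given ligand ID.

    The attribute name is an assumption, not taken from the abstract.
    """
    return {
        "query": {
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": "rcsb_nonpolymer_instance_annotation.comp_id",
                "operator": "exact_match",
                "value": comp_id,
            },
        },
        "return_type": "entry",
    }


def sequence_query(sequence, identity_cutoff=0.9, evalue_cutoff=1.0):
    """Build a sequence-search payload for a protein sequence (assumed schema)."""
    return {
        "query": {
            "type": "terminal",
            "service": "sequence",
            "parameters": {
                "evalue_cutoff": evalue_cutoff,
                "identity_cutoff": identity_cutoff,
                "sequence_type": "protein",
                "value": sequence,
            },
        },
        "return_type": "polymer_entity",
    }


def run_query(payload, url=SEARCH_URL):
    """POST a payload and return the parsed JSON response (needs network access)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the payload only; call run_query(...) to actually contact the service.
    print(json.dumps(ligand_query("ATP"), indent=2))
```

Keeping payload construction separate from the network call lets a notebook user inspect or edit a query before sending it, which suits the customizable templates the abstract describes.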
-
Molecular docking is a computational technique used to predict ligand binding potential, conformation, and location for a given receptor, and is regarded as an attractive method to use in drug design due to its relatively low computational and monetary cost. However, molecular docking programs tend not to be accessible to novice users. Most docking programs require at least a basic knowledge of the command line and computer programming to install and configure the program. Additionally, tutorials for the most commonly used programs tend to be inflexible, requiring a specific molecule or set of molecules to be bound to a specific receptor, and require the installation and use of other programs or websites to download and prepare structures. To increase general access to molecular docking, basil_dock utilizes a series of easy-to-use Jupyter notebooks that do not assume user familiarity with molecular docking procedures and concepts, requiring little command line usage and software installation. The series includes four notebooks that were created to reflect the different steps in the molecular docking process: (1) the preparation of ligand and protein files prior to docking, (2) the docking of ligands to a protein receptor, (3) analyzing the resulting data and determining how different functional groups in the ligand can affect protein-ligand binding, and (4) identifying essential locations for binding within the ligand and protein. The notebooks give novice users flexibility and customization in exploring docking procedures and systems, and teach users the basis behind molecular docking without having to leave the environment to obtain information and materials from other applications. The first version of basil_dock allows users to choose from receptors uploaded to the Protein Data Bank and to add additional ligands as desired. Users can then select between the Vina and Smina docking engines and change ligand functional groups to see how the substitution of atom groups affects binding affinity and ligand conformation. The data can then be analyzed to determine residues in the receptor and atom groups in the ligand that are likely to be integral to forming the ligand-protein complex and to discern which ligands are likely to be orally bioactive based on Lipinski’s Rule of Five. From this work, a package of Python scripts has been created to streamline the generating, splitting, and writing of ligand files, greatly reducing the number of errors arising from attempting to split a comprehensive ligand file manually. Libraries used in basil_dock include Vina, Smina, RDKit, openbabel, and MDAnalysis. While the package has been designed based on the needs of basil_dock, it has been created to be extensible. Support for this project was provided by NSF 2142033.
Free, publicly-accessible full text available April 13, 2026
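The Lipinski's Rule of Five screen mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not basil_dock's implementation: it assumes the four descriptors have already been computed (for example with a cheminformatics toolkit), and the names `Descriptors`, `lipinski_violations`, and `passes_rule_of_five` are hypothetical. Allowing at most one violation is a common convention and is treated here as an adjustable assumption.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed ligand descriptors (hypothetical container for illustration)."""
    molecular_weight: float  # in daltons
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d):
    """Return the names of the Rule of Five criteria that the ligand violates."""
    checks = {
        "molecular weight > 500 Da": d.molecular_weight > 500,
        "logP > 5": d.log_p > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


def passes_rule_of_five(d, max_violations=1):
    """Treat a ligand as passing if it has at most `max_violations` violations."""
    return len(lipinski_violations(d)) <= max_violations


if __name__ == "__main__":
    # Illustrative descriptor values only; they are not taken from the source.
    small_ligand = Descriptors(180.2, 1.2, 1, 4)
    large_ligand = Descriptors(900.0, 6.5, 7, 12)
    for ligand in (small_ligand, large_ligand):
        print(passes_rule_of_five(ligand), lipinski_violations(ligand))
```

Returning the list of violated criteria, rather than only a yes/no answer, makes it easier to see which property of a modified ligand caused it to fail the screen.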
-
This year, the National Science Foundation (NSF) is celebrating its 75th anniversary. NSF support was essential in the original development of BASIL (Biochemistry Authentic Scientific Inquiry Lab). Ongoing NSF support over the past ten years has enabled the BASIL community to grow in numbers and in collaboration with other teacher/scholar teams who are seeking to change undergraduate biochemistry education. At the same time, NSF support has also provided support for our most critical online resource, the RCSB Protein Data Bank, which has always provided us with the structures that we study and, increasingly, is providing us with the tools that our students use to explore these structures and predict their function.
Free, publicly-accessible full text available March 31, 2026
-
The Protein Data Bank (PDB) holds an extensive amount of information and can be a vital tool when performing background research for biochemical work. To make the information in the PDB more accessible, the RCSB Search API was employed within Jupyter Notebooks to create more customizable and user-friendly tools with simple Python code. Areas of focus include structure motif searches used to predict the function of proteins based on the three-dimensional shape of their active sites, searches for FDA-approved drugs, and searches targeting ligands with specific characteristics. This code has been built into Jupyter Notebook templates that include examples of these searches as well as annotated code that users can customize to run advanced searches on the PDB more efficiently and to download the structure and small-molecule files returned by a search. Future plans include increasing the amount and type of information available from a search, as well as expanding the scope of the notebooks to cover more types of searches.
Free, publicly-accessible full text available March 24, 2026
-
Molecular docking is a computational technique used to predict ligand binding potential, conformation, and location for a given receptor, and is regarded as an attractive method to use in drug design due to its relatively low computational and monetary cost. However, molecular docking programs tend not to be accessible to novice users. To increase general access to molecular docking, basil_dock utilizes a series of easy-to-use Jupyter notebooks that do not assume familiarity with molecular docking procedures and concepts, requiring little command-line usage and software installation. The notebooks, divided based on the different steps in the molecular docking process, focus on user customization and flexibility as well as teaching users the basis behind molecular docking. The first version of basil_dock allows users to choose from receptors uploaded to the Protein Data Bank and to add additional ligands as desired. Users can then select between the Vina and Smina docking engines and change ligand functional groups to see how the substitution of atom groups affects binding affinity and ligand conformation. Machine learning algorithms can then be utilized to determine residues in the receptor and atom groups in the ligand that are likely to be integral to forming the ligand-protein complex and to discern which ligands are likely to be orally bioactive based on Lipinski’s Rule of Five.
Free, publicly-accessible full text available March 23, 2026
-
In the Biochemistry Authentic Scientific Inquiry Lab (BASIL) course-based undergraduate research experience, students use a series of computational (sequence and structure comparison, docking) and wet lab (protein expression, purification, and concentration; sodium dodecyl sulfate-polyacrylamide gel electrophoresis [SDS-PAGE]; enzyme activity and kinetics) modules to predict and test the function of protein structures of unknown function found in the Protein Data Bank and UniProt. BASIL was established in 2015 with a core of 10 faculty members on six campuses, with the support of an educational researcher and doctoral student on a seventh campus. Since that time, the number of participating faculty members and campuses has grown, and we have adapted our curriculum to improve access for all who are interested. We have also expanded our curriculum to include new developments that are appearing in computational approaches to life science research. In this article, we provide a history of BASIL, explain our current approach, describe how we have addressed challenges that have appeared, and describe our curriculum development pipeline and our plans for moving forward in a sustainable and equitable fashion.
Free, publicly-accessible full text available January 31, 2026
-
Incorporating Coding into the Classroom: An Important Component of Modern Bioinformatics Instruction
Advancements in computation and machine learning have revolutionized science, enabling researchers to address once insurmountable challenges. Bioinformatics, a field that heavily relies on computer-driven analysis of biological data, has greatly benefited from these developments. However, traditional bioinformatics instruction frequently does not teach the necessary coding skills. This article explores the transformation of a bioinformatics course in which feedback from students revealed limitations in traditional web application interfaces and the absence of coding automated pipelines for real-world applications. To address these shortcomings, the authors redesigned the project to incorporate computer programming using Google Colaboratory, where students access databases and websites by coding. The curriculum outlined the integration of modern programming skills with essential bioinformatics concepts. This article evaluates the effectiveness of this redesign by analyzing a self-response survey completed by course participants. Results show a positive impact on students’ perception of science and scientific research. Bayesian statistical analysis reveals that the programming component significantly predicts students’ career clarity in science and their pursuit of graduate education. Integrating coding exercises in bioinformatics education enhances students’ preparedness for real-world applications. The freely available GitHub repository will facilitate adoption. By embracing computational tools, students can become adept researchers capable of tackling complex biological questions.
Free, publicly-accessible full text available January 2, 2026
-
Teaching students how to think like scientists is a critical but challenging goal in biochemistry education. The Biochemistry Authentic Scientific Inquiry Lab (BASIL) initiative was conceived by Dr Paul Craig from the Rochester Institute of Technology and is led by colleagues across multiple institutions. They have developed an innovative curriculum that transforms traditional cookbook-style laboratory courses into authentic research experiences, also known as a Course-based Undergraduate Research Experience (CURE). By investigating real proteins with unknown functions, students learn essential scientific skills while expanding our knowledge of protein biochemistry.
Free, publicly-accessible full text available January 1, 2026
