-
Abstract: New drug production, from target identification to marketing approval, takes over 12 years and can cost around $2.6 billion. Furthermore, the COVID-19 pandemic has revealed the urgent need for more powerful computational methods for drug discovery. Here, we review the computational approaches to predicting protein–ligand interactions in the context of drug discovery, focusing on methods using artificial intelligence (AI). We begin with a brief introduction to proteins (targets), ligands (e.g., drugs) and their interactions for nonexperts. Next, we review databases that are commonly used in the domain of protein–ligand interactions. Finally, we survey and analyze the machine learning (ML) approaches implemented to predict protein–ligand binding sites, ligand-binding affinity and binding pose (conformation), including both classical ML algorithms and recent deep learning methods. After exploring the correlation between these three aspects of protein–ligand interaction, we propose that they be studied in unison. We anticipate that our review will aid exploration and development of more accurate ML-based prediction strategies for studying protein–ligand interactions.
-
Two amino acid variants in soybean serine hydroxymethyltransferase 8 (SHMT8) are associated with resistance to the soybean cyst nematode (SCN), a devastating agricultural pathogen with worldwide economic impacts on soybean production. SHMT8 is a cytoplasmic enzyme that catalyzes the pyridoxal 5′‐phosphate‐dependent conversion of serine and tetrahydrofolate (THF) to glycine and 5,10‐methylenetetrahydrofolate. A previous study of the P130R/N358Y double variant of SHMT8, identified in the SCN‐resistant soybean cultivar (cv.) Forrest, showed profound impairment of folate binding affinity and reduced THF‐dependent enzyme activity, relative to the highly active SHMT8 in cv. Essex, which is susceptible to SCN. Given the importance of SCN resistance in soybean agriculture, we report here the biochemical and structural characterization of the P130R and N358Y single variants to elucidate their individual effects on soybean SHMT8. We find that both single variants have reduced THF‐dependent catalytic activity relative to Essex SHMT8 (10‐ to 50‐fold decrease in kcat/Km) but are significantly more active than the P130R/N358Y double variant. The kinetic data also show that the single variants lack the THF‐substrate inhibition found in Essex SHMT8, an observation with implications for regulation of the folate cycle. Five crystal structures of the P130R and N358Y variants in complex with various ligands (resolutions from 1.49 to 2.30 Å) reveal distinct structural impacts of the mutations and provide new insights into allosterism. Our results support the notion that the P130R/N358Y double variant in Forrest SHMT8 produces unique and unexpected effects on the enzyme, which cannot be easily predicted from the behavior of the individual variants.
-
Abstract
Background: Cryo-electron microscopy (cryo-EM) is widely used to determine the three-dimensional (3D) structures of macromolecules. Particle picking from 2D micrographs remains a challenging early step in the cryo-EM pipeline due to the diversity of particle shapes and the extremely low signal-to-noise ratio of micrographs. Because of these issues, significant human intervention is often required to generate a high-quality set of particles for input to the downstream structure determination steps.
Results: Here we propose a fully automated approach (DeepCryoPicker) for single particle picking based on deep learning. It first uses automated unsupervised learning to generate particle training datasets. Then it trains a deep neural network to classify particles automatically. Results indicate that DeepCryoPicker compares favorably with semi-automated methods such as DeepEM, DeepPicker, and RELION, with the significant advantage of not requiring human intervention.
Conclusions: Our framework, combining supervised deep learning classification with automated unsupervised clustering for generating training data, provides an effective approach to pick particles in cryo-EM images automatically and accurately.
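The two-stage idea described above (unsupervised clustering produces training labels, then a supervised classifier is trained on them) can be illustrated with a minimal sketch. This is not the DeepCryoPicker code: the synthetic 8x8 patches, the `centre_contrast` feature, the 1-D two-means clustering, and the logistic regression that stands in for the deep neural network are all assumptions made for illustration.

```python
import math
import random

def make_patch(has_particle, rng, size=8):
    """Synthetic noisy patch (toy data); 'particle' patches get a brighter 4x4 centre."""
    patch = [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
    if has_particle:
        for r in range(2, 6):
            for c in range(2, 6):
                patch[r][c] += 2.0
    return patch

def centre_contrast(patch):
    """Hypothetical scalar feature: mean of the centre 4x4 block minus mean of the rest."""
    centre = [patch[r][c] for r in range(2, 6) for c in range(2, 6)]
    total = sum(sum(row) for row in patch)
    rest_mean = (total - sum(centre)) / (len(patch) ** 2 - len(centre))
    return sum(centre) / len(centre) - rest_mean

def two_means_1d(values, iters=20):
    """Unsupervised step: 1-D k-means with k=2; label 1 is the higher-valued cluster."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        group0 = [v for v, l in zip(values, labels) if l == 0]
        group1 = [v for v, l in zip(values, labels) if l == 1]
        if not group0 or not group1:
            break
        lo, hi = sum(group0) / len(group0), sum(group1) / len(group1)
    return labels

def train_logistic(xs, ys, epochs=30, lr=0.01):
    """Supervised step: logistic regression by SGD (stand-in for a deep network)."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            g = 1.0 / (1.0 + math.exp(-z)) - y
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(model, x):
    w, b = model
    return 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

def flatten(patch):
    return [v for row in patch for v in row]

def run_demo(seed=0, n_train=200, n_test=100):
    rng = random.Random(seed)
    truth = [rng.random() < 0.5 for _ in range(n_train)]
    patches = [make_patch(t, rng) for t in truth]
    # Step 1: pseudo-labels from clustering; the true labels are never used for training.
    pseudo = two_means_1d([centre_contrast(p) for p in patches])
    agreement = sum(int(t) == l for t, l in zip(truth, pseudo)) / n_train
    # Step 2: train the classifier on the pseudo-labels only.
    model = train_logistic([flatten(p) for p in patches], pseudo)
    # Evaluate on fresh patches against the true labels.
    test_truth = [rng.random() < 0.5 for _ in range(n_test)]
    test_x = [flatten(make_patch(t, rng)) for t in test_truth]
    accuracy = sum(predict(model, x) == int(t) for x, t in zip(test_x, test_truth)) / n_test
    return agreement, accuracy

if __name__ == "__main__":
    print(run_demo())
```

`run_demo()` returns the fraction of pseudo-labels that match the hidden ground truth and the classifier's accuracy on held-out patches. On real micrographs the signal-to-noise ratio is far lower than in this toy data, which is why the abstract describes a deep network instead of a linear model.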
-
Recent advances in mass spectrometry (MS)-based proteomics have enabled tremendous progress in the understanding of cellular mechanisms, disease progression, and the relationship between genotype and phenotype. Though many popular bioinformatics methods in proteomics are derived from other omics studies, novel analysis strategies are required to deal with the unique characteristics of proteomics data. In this review, we discuss the current developments in the bioinformatics methods used in proteomics and how they facilitate the mechanistic understanding of biological processes. We first introduce bioinformatics software and tools designed for mass spectrometry-based protein identification and quantification, and then we review the different statistical and machine learning methods that have been developed to perform comprehensive analysis in proteomics studies. We conclude with a discussion of how quantitative protein data can be used to reconstruct protein interactions and signaling networks.
-
Abstract: Many proteins are composed of several domains that pack together into a complex tertiary structure. Multidomain proteins can be challenging for protein structure modeling, particularly those for which templates can be found for individual domains but not for the entire sequence. In such cases, homology modeling can generate high‐quality models of the domains but not of the orientations between them. Small‐angle X‐ray scattering (SAXS) reports the structural properties of entire proteins and has the potential to guide homology modeling of multidomain proteins. In this article, we describe a novel multidomain protein assembly modeling method, SAXSDom, that integrates experimental knowledge from SAXS with a probabilistic Input‐Output Hidden Markov Model to assemble the structures of individual domains. Four SAXS‐based scoring functions were developed and tested, and the method was evaluated on multidomain proteins from two public datasets. Incorporation of SAXS information improved the accuracy of domain assembly for 40 out of 46 Critical Assessment of protein Structure Prediction (CASP) multidomain protein targets and 45 out of 73 multidomain protein targets from the ab initio domain assembly dataset. The results demonstrate that SAXS data can provide useful information to improve the accuracy of domain‐domain assembly. The source code and tool packages are available at https://github.com/jianlin-cheng/SAXSDom.
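To make the notion of a SAXS-based score concrete, the sketch below computes a reduced chi-square discrepancy between an experimental and a model scattering profile, a measure widely used to compare SAXS curves. The abstract does not specify the four SAXSDom scoring functions, so this should be read as a generic illustration and not as one of them; the function names and the Guinier-type toy profiles are assumptions.

```python
import math

def saxs_chi2(i_exp, i_calc, sigma):
    """Reduced chi-square between experimental and model intensities.

    The model profile is first rescaled by the least-squares optimal
    factor, since absolute SAXS intensities are usually arbitrary.
    """
    if not (len(i_exp) == len(i_calc) == len(sigma)) or len(i_exp) < 2:
        raise ValueError("profiles must have equal length >= 2")
    num = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum(c * c / s ** 2 for c, s in zip(i_calc, sigma))
    scale = num / den
    resid = sum(((e - scale * c) / s) ** 2
                for e, c, s in zip(i_exp, i_calc, sigma))
    return resid / (len(i_exp) - 1)

def guinier(q_values, rg, i0=1.0):
    """Toy profile from the Guinier approximation I(q) = I0 * exp(-(q*Rg)^2 / 3)."""
    return [i0 * math.exp(-(q * rg) ** 2 / 3.0) for q in q_values]

if __name__ == "__main__":
    q = [0.005 * k for k in range(1, 11)]        # momentum transfer, 1/Angstrom
    experimental = guinier(q, rg=20.0, i0=100.0)  # stand-in for measured data
    errors = [0.02 * v for v in experimental]     # assumed 2% uncertainties
    for rg in (20.0, 30.0):                       # two hypothetical domain arrangements
        print(rg, saxs_chi2(experimental, guinier(q, rg), errors))
```

A candidate domain arrangement whose computed profile has the same shape as the measured curve scores near zero regardless of overall intensity scale, while a more extended or more compact arrangement scores higher. A discrepancy of this kind is how scattering data can rank alternative assemblies.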