Structure-based virtual screening is a key tool in early drug discovery, with growing interest in the screening of multi-billion chemical compound libraries. However, the success of virtual screening crucially depends on the accuracy of the binding pose and binding affinity predicted by computational docking. Here we develop a highly accurate structure-based virtual screen method, RosettaVS, for predicting docking poses and binding affinities. Our approach outperforms other state-of-the-art methods on a wide range of benchmarks, partially due to our ability to model receptor flexibility. We incorporate this into a new open-source artificial intelligence accelerated virtual screening platform for drug discovery. Using this platform, we screen multi-billion compound libraries against two unrelated targets, a ubiquitin ligase target KLHDC2 and the human voltage-gated sodium channel NaV1.7. For both targets, we discover hit compounds, including seven hits (14% hit rate) to KLHDC2 and four hits (44% hit rate) to NaV1.7, all with single digit micromolar binding affinities. Screening in both cases is completed in less than seven days. Finally, a high resolution X-ray crystallographic structure validates the predicted docking pose for the KLHDC2 ligand complex, demonstrating the effectiveness of our method in lead discovery.
- PAR ID:
- 10540087
- Publisher / Repository:
- Nature Publishing Group
- Date Published:
- Journal Name:
- Nature Communications
- Volume:
- 15
- Issue:
- 1
- ISSN:
- 2041-1723
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract In this study, we developed a novel algorithm to improve the screening performance of an arbitrary docking scoring function by recalibrating the docking score of a query compound based on its structural similarity to a set of training compounds, at negligible extra computational cost. Two popular docking methods, Glide and AutoDock Vina, were adopted as the original scoring functions to be processed with our new algorithm, and similar improvements were achieved. Predicted binding affinities were compared against experimental data from the ChEMBL and DUD-E databases. Eleven representative drug receptors from diverse drug target categories were used to evaluate the hybrid scoring function. The effects of four different fingerprints (FP2, FP3, FP4, and MACCS) and four different compound similarity effect (CSE) functions were explored. Encouragingly, the screening performance was significantly improved for all 11 drug targets, especially when CSE = S^4 (S is the Tanimoto structural similarity) and the FP2 fingerprint were applied. The average predictive index (PI) values increased from 0.34 to 0.66 and 0.39 to 0.71 for the Glide and AutoDock Vina scoring functions, respectively. To evaluate the performance of the calibration algorithm in drug lead identification, we also imposed an upper limit on the structural similarity to mimic the real scenario of screening diverse libraries, in which query ligands are general-purpose screening compounds that are not necessarily structurally similar to reference ligands. Encouragingly, we found our hybrid scoring function still outperformed the original docking scoring function. The hybrid scoring function was further evaluated using external datasets for two systems, and we found the PI values increased from 0.24 to 0.46 and 0.14 to 0.42 for the A2AR and CFX systems, respectively.
In conclusion, our calibration algorithm can significantly improve virtual screening performance in both the drug lead optimization and identification phases at negligible computational cost.
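The abstract above does not give the recalibration formula, so the following is only a minimal sketch of the general idea: a docking score is shifted by the residuals of training compounds, each weighted by a compound similarity effect CSE = S^4, where S is the Tanimoto similarity. The function names, the set-of-bits fingerprint representation, and the specific blending rule are all assumptions made for illustration, not the authors' method.

```python
from typing import FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # assumed representation: indices of the "on" bits


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recalibrated_score(
    docking_score: float,
    query_fp: Fingerprint,
    training: List[Tuple[Fingerprint, float, float]],
) -> float:
    """Hypothetical recalibration (an assumption, not the published formula).

    `training` holds (fingerprint, docking score, experimental score) triples.
    Each training residual (experimental - docking) is weighted by S**4, and
    the weighted mean residual is scaled by the largest weight, so a query
    that resembles no training compound keeps its original docking score.
    """
    weighted_residual = 0.0
    weight_sum = 0.0
    max_weight = 0.0
    for fp, dock, expt in training:
        w = tanimoto(query_fp, fp) ** 4  # CSE = S^4
        weighted_residual += w * (expt - dock)
        weight_sum += w
        max_weight = max(max_weight, w)
    if weight_sum == 0.0:
        return docking_score  # no similar training compound: leave unchanged
    return docking_score + max_weight * (weighted_residual / weight_sum)


# Toy data (made up for illustration only).
training_set = [
    (frozenset({1, 2, 3, 4}), -7.0, -9.0),
    (frozenset({10, 11}), -6.0, -5.0),
]
similar_query = frozenset({1, 2, 3, 4})
dissimilar_query = frozenset({99})
print(recalibrated_score(-7.5, similar_query, training_set))
print(recalibrated_score(-7.5, dissimilar_query, training_set))
```

Raising S to the fourth power makes the weight fall off steeply, so under this sketch only close structural analogs in the training set contribute appreciably to the correction.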
-
As fragment-based drug discovery has become mainstream, there has been an increase in various screening methodologies. Protein-observed ¹⁹F (PrOF) NMR and ¹H CPMG NMR are two fragment screening assays that have complementary advantages. Here, we sought to combine these two NMR-based assays into a new screening workflow. This combination of protein- and ligand-observed experiments allows for a time- and resource-efficient multiplexed screen of mixtures of fragments and proteins. PrOF NMR is first used to screen mixtures against two proteins. Hit mixtures for each protein are identified and then deconvoluted using ¹H CPMG NMR. We demonstrate the benefit of this fragment screening method by conducting the first reported fragment screens against the bromodomains of BPTF and Plasmodium falciparum (Pf) GCN5 using 467 3D-enriched fragments. The hit rates were 6%, 5% and 4% for fragments binding BPTF, PfGCN5, and fragments binding both proteins, respectively. Select hits were characterized, revealing a broad range of affinities from low µM to mM dissociation constants. Follow-up experiments supported a low-affinity second binding site on PfGCN5. This approach can be used to bias fragment screens towards more selective hits at the onset of inhibitor development in a resource- and time-efficient manner.
-
The CACHE challenges are a series of prospective benchmarking exercises to evaluate progress in the field of computational hit-finding. Here we report the results of the inaugural CACHE challenge in which 23 computational teams each selected up to 100 commercially available compounds that they predicted would bind to the WDR domain of the Parkinson’s disease target LRRK2, a domain with no known ligand and only an apo structure in the PDB. The lack of known binding data and presumably low druggability of the target is a challenge to computational hit finding methods. Of the 1955 molecules predicted by participants in Round 1 of the challenge, 73 were found to bind to LRRK2 in an SPR assay with a KD lower than 150 μM. These 73 molecules were advanced to the Round 2 hit expansion phase, where computational teams each selected up to 50 analogs. Binding was observed in two orthogonal assays for seven chemically diverse series, with affinities ranging from 18 to 140 μM. The seven successful computational workflows varied in their screening strategies and techniques. Three used molecular dynamics to produce a conformational ensemble of the targeted site, three included a fragment docking step, three implemented a generative design strategy and five used one or more deep learning steps. CACHE #1 reflects a highly exploratory phase in computational drug design where participants adopted strikingly diverging screening strategies. Machine learning-accelerated methods achieved similar results to brute force (e.g., exhaustive) docking. First-in-class, experimentally confirmed compounds were rare and weakly potent, indicating that recent advances are not sufficient to effectively address challenging targets.
-
Abstract The continuous rise of multi-drug resistant pathogenic bacteria has become a significant challenge for the health care system. In particular, novel drugs to treat infections of methicillin-resistant Staphylococcus aureus strains (MRSA) are needed, but traditional drug discovery campaigns have largely failed to deliver clinically suitable antibiotics. More than simply new drugs, new drug discovery approaches are needed to combat bacterial resistance. The recently described phenomenon of copper-dependent inhibitors has galvanized research exploring the use of metal-coordinating molecules to harness copper’s natural antibacterial properties for therapeutic purposes. Here, we describe the results of the first concerted screening effort to identify copper-dependent inhibitors of Staphylococcus aureus. A standard library of 10 000 compounds was assayed for anti-staphylococcal activity, with hits defined as those compounds with a strict copper-dependent inhibitory activity. A total of 53 copper-dependent hit molecules were uncovered, similar to the copper independent hit rate of a traditionally executed campaign conducted in parallel on the same library. Most prominent was a hit family with an extended thiourea core structure, termed the NNSN motif. This motif resulted in copper-dependent and copper-specific S. aureus inhibition, while simultaneously being well tolerated by eukaryotic cells. Importantly, we could demonstrate that copper binding by the NNSN motif is highly unusual and likely responsible for the promising biological qualities of these compounds. A subsequent chemoinformatic meta-analysis of the ChEMBL chemical database confirmed the NNSNs as an unrecognized staphylococcal inhibitor, despite the family’s presence in many chemical screening libraries. Thus, our copper-biased screen has proven able to discover inhibitors within previously screened libraries, offering a mechanism to reinvigorate exhausted molecular collections.
-
Virtual screening is a cost- and time-effective alternative to traditional high-throughput screening in the drug discovery process. Both virtual screening approaches, structure-based molecular docking and ligand-based cheminformatics, suffer from computational cost, low accuracy, and/or reliance on prior knowledge of a ligand that binds to a given target. Here, we propose a neural network framework, NeuralDock, which accelerates the process of high-quality computational docking by a factor of 10^6, and does not require prior knowledge of a ligand that binds to a given target. By approximating both protein-small molecule conformational sampling and energy-based scoring, NeuralDock accurately predicts the binding energy and affinity of a protein-small molecule pair, based on protein pocket 3D structure and small molecule topology. We use NeuralDock and 25 GPUs to dock 937 million molecules from the ZINC database against superoxide dismutase-1 in 21 h, which we validate with physical docking using MedusaDock. Due to its speed and accuracy, NeuralDock may be useful in brute-force virtual screening of massive chemical libraries and training of generative drug models.
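To put the reported scale in perspective, the figures in the abstract above (937 million molecules, 25 GPUs, 21 h) can be turned into an approximate throughput. This is a back-of-the-envelope sketch only: it assumes the 21 h is wall-clock time and that work was spread evenly across the GPUs, neither of which the abstract states, and the function name is made up for illustration.

```python
def docking_throughput(n_molecules: int, hours: float, n_gpus: int):
    """Return (molecules per second overall, molecules per second per GPU).

    Assumes `hours` is wall-clock time and an even split across GPUs.
    """
    seconds = hours * 3600
    total_rate = n_molecules / seconds
    return total_rate, total_rate / n_gpus


# Figures taken from the abstract: 937 million molecules, 21 h, 25 GPUs.
total_rate, per_gpu_rate = docking_throughput(937_000_000, 21, 25)
print(f"{total_rate:,.0f} molecules/s overall")
print(f"{per_gpu_rate:,.0f} molecules/s per GPU")
```

Under these assumptions the screen corresponds to roughly 12,400 molecules per second in aggregate, or about 500 per second per GPU.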