Search for: All records

Creators/Authors contains: "Karanicolas, John"

  1. With the recent explosion in the size of libraries available for screening, virtual screening is positioned to assume a more prominent role in early drug discovery's search for active chemical matter. In typical virtual screens, however, only about 12% of the top-scoring compounds actually show activity when tested in biochemical assays. We argue that most scoring functions used for this task have been developed with insufficient attention to the datasets on which they are trained and tested, leading to overly simplistic models and/or overtraining. These problems are compounded in the literature because studies reporting new scoring methods have not validated their models prospectively within the same study. Here, we report a strategy for building a training dataset (D-COID) that aims to generate highly compelling decoy complexes that are individually matched to available active complexes. Using this dataset, we train a general-purpose classifier for virtual screening (vScreenML) that is built on the XGBoost framework. In retrospective benchmarks, our classifier shows outstanding performance relative to other scoring functions. In a prospective context, nearly all candidate inhibitors from a screen against acetylcholinesterase show detectable activity; beyond this, 10 of 23 compounds have IC50 values better than 50 μM. Without any medicinal chemistry optimization, the most potent hit has an IC50 of 280 nM, corresponding to a Ki of 173 nM. These results support using the D-COID strategy for training classifiers in other computational biology tasks, and for vScreenML in virtual screening campaigns against other protein targets. Both D-COID and vScreenML are freely distributed to facilitate such efforts.
  2. Water engages in two important types of interactions near biomolecules: it forms ordered “cages” around exposed hydrophobic regions, and it participates in hydrogen bonds with surface polar groups. Both types of interaction are critical to biomolecular structure and function, but explicitly including an appropriate number of solvent molecules makes many applications computationally intractable. A number of implicit solvent models have been developed to address this problem, many of which treat these two solvation effects separately. Here, we describe a new model to capture polar solvation effects, called SHO (“solvent hydrogen‐bond occlusion”); our model aims to directly evaluate the energetic penalty associated with displacing discrete first‐shell water molecules near each solute polar group. We have incorporated SHO into the Rosetta energy function, and find that scoring protein structures with SHO provides superior performance in loop modeling, virtual screening, and protein structure prediction benchmarks. These improvements stem from the fact that SHO accurately identifies and penalizes polar groups that do not participate in hydrogen bonds, either with solvent or with other solute atoms (“unsatisfied” polar groups). We expect that in the future, SHO will enable higher‐resolution predictions for a variety of molecular modeling applications. © 2017 Wiley Periodicals, Inc.
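
As a rough check on the figures quoted in item 1, the sketch below works through two pieces of arithmetic. The first is the prospective hit rate, 10 of 23 compounds with IC50 better than 50 μM, which can be set against the roughly 12% quoted for typical virtual screens. The second is the substrate-to-Km ratio that the reported IC50 (280 nM) and Ki (173 nM) would imply if the conversion used the Cheng-Prusoff relation for competitive inhibition, Ki = IC50 / (1 + [S]/Km). That conversion is an assumption here, since the abstract does not say how Ki was obtained, and the function names are illustrative only.

```python
def hit_rate(hits: int, tested: int) -> float:
    """Fraction of tested compounds that count as hits."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return hits / tested


def implied_substrate_ratio(ic50_nm: float, ki_nm: float) -> float:
    """Return the [S]/Km ratio implied by an IC50/Ki pair.

    Assumption (not stated in the abstract): competitive inhibition,
    so the Cheng-Prusoff relation Ki = IC50 / (1 + [S]/Km) applies.
    Rearranged: [S]/Km = IC50 / Ki - 1.
    """
    if ki_nm <= 0 or ic50_nm < ki_nm:
        raise ValueError("expected 0 < Ki <= IC50")
    return ic50_nm / ki_nm - 1.0


# Values quoted in the abstract of item 1.
rate = hit_rate(10, 23)
ratio = implied_substrate_ratio(280.0, 173.0)

print(f"prospective hit rate: {rate:.1%}")
print(f"implied [S]/Km under Cheng-Prusoff: {ratio:.2f}")
```

Under these assumptions the hit rate comes to about 43%, and the implied [S]/Km is about 0.62, which would mean the assay ran with substrate below its Km. The actual assay conditions would need to be taken from the paper itself.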