skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Ramsey, Stephen A."

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. We describe a novel computational approach, CERENKOV (Computational Elucidation of the REgulatory NonKOd- ing Variome), for discriminating regulatory single nucleotide polymorphisms (rSNPs) from non-regulatory SNPs within noncoding genetic loci. CERENKOV is specifically designed for recognizing rSNPs in the context of a post-analysis of a genome-wide association study (GWAS); it includes a novel accuracy scoring metric (which we call average rank, or AV- GRANK) and a novel cross-validation strategy (locus-based sampling) that both correctly account for the “sparse positive bag” nature of the GWAS post-analysis rSNP recognition problem. We trained and validated CERENKOV using a reference set of 15,331 SNPs (the OSU17 SNP set) whose composition is based on selection criteria (linkage disequi- librium and minor allele frequency) that we designed to ensure relevance to GWAS post-analysis. CERENKOV is based on a machine-learning algorithm (gradient boosted decision trees) incorporating 246 SNP annotation features that we extracted from genomic, epigenomic, phylogenetic, and chromatin datasets. CERENKOV includes novel features based on replication timing and DNA shape. We found that tuning a classifier for AUPVR performance does not guaran- tee optimality for AVGRANK. We compared the validation performance of CERENKOV to nine other methods for rSNP recognition (including GWAVA, RSVP, DeltaSVM, DeepSEA, Eigen, and DANQ), and found that CERENKOV’s validation performance is the strongest out of all of the classifiers that we tested, by both traditional global rank-based measures (⟨AUPVR⟩ = 0.506; ⟨AUROC⟩ = 0.855) and AVGRANK (⟨AVGRANK⟩ = 3.877). The source code for CERENKOV is available on GitHub and the SNP feature data files are available for download via the CERENKOV website. 
    more » « less