ProSampler: an ultrafast and accurate motif finder in large ChIP-seq datasets for combinatory motif discovery

Li, Yang; Ni, Pengyu; Zhang, Shaoqiang; Li, Guojun; Su, Zhengchang (ORCID: 0000-0003-4636-3440); Berger, Bonnie (ed.)

doi:10.1093/bioinformatics/btz290

Citation Details

Abstract

Motivation: The availability of numerous ChIP-seq datasets for transcription factors (TF) has provided an unprecedented opportunity to identify all TF binding sites in genomes. However, the progress has been hindered by the lack of a highly efficient and accurate tool to find not only the target motifs, but also cooperative motifs in very big datasets.

Results: We herein present an ultrafast and accurate motif-finding algorithm, ProSampler, based on a novel numeration method and Gibbs sampler. ProSampler runs orders of magnitude faster than the fastest existing tools while often more accurately identifying motifs of both the target TFs and cooperators. Thus, ProSampler can greatly facilitate the efforts to identify the entire cis-regulatory code in genomes.

Availability and implementation: Source code and binaries are freely available for download at https://github.com/zhengchangsulab/prosampler. It was implemented in C++ and supported on Linux, macOS and MS Windows platforms.

Supplementary information: Supplementary materials are available at Bioinformatics online.

Award ID(s): 1661332

PAR ID: 10124056

Author(s) / Creator(s): Li, Yang; Ni, Pengyu; Zhang, Shaoqiang; Li, Guojun; Su, Zhengchang; Berger, Bonnie (ed.)

Publisher / Repository: Oxford University Press

Date Published: 2019-05-09

Journal Name: Bioinformatics

Volume: 35

Issue: 22

ISSN: 1367-4803

Page Range / eLocation ID: p. 4632-4639

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1093/bioinformatics/btz290