Implementing and evaluating a Gaussian mixture framework for identifying gene function from TnSeq data

Li, K; Chen, R; Lindsey, W; Best, AA; DeJongh, M; Henry, C; Tintle, NL


The rapid acceleration of microbial genome sequencing increases opportunities to understand bacterial gene function. Unfortunately, only a small proportion of genes have been studied. Recently, TnSeq has been proposed as a cost-effective, highly reliable approach to predicting gene function from changes in a cell’s fitness before and after genomic changes. However, major questions remain about how best to determine whether an observed quantitative change in fitness represents a meaningful change. To address this limitation, we develop a Gaussian mixture model framework for classifying gene function from TnSeq experiments. To implement the mixture model, we present two approaches: the Expectation-Maximization algorithm and a hierarchical Bayesian model sampled using Stan’s Hamiltonian Monte Carlo sampler. We compare these implementations against the frequentist method used in the current TnSeq literature. Using simulations and real data from E. coli TnSeq experiments, we show that the Bayesian implementation of the Gaussian mixture framework provides the most consistent classification results.
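As a rough illustration of the Expectation-Maximization approach named in the abstract, the sketch below fits a one-dimensional Gaussian mixture to simulated fitness values. It is not the authors' implementation: the two-component setup, the simulated data, the quantile-based initialisation, and the function name `em_gaussian_mixture` are all assumptions made for this example.

```python
import numpy as np

def em_gaussian_mixture(x, k=2, n_iter=200, tol=1e-8):
    """Fit a 1-D k-component Gaussian mixture by EM (illustrative sketch only)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Assumed initialisation: means at evenly spaced quantiles, shared spread, equal weights.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    sd = np.full(k, x.std())
    w = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: per-point, per-component log densities, then responsibilities.
        log_p = (np.log(w) - np.log(sd) - 0.5 * np.log(2 * np.pi)
                 - 0.5 * ((x[:, None] - mu) / sd) ** 2)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: update weights, means, and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        sd = np.maximum(sd, 1e-6)  # guard against a collapsing component
        # Stop once the log-likelihood no longer improves.
        ll = log_norm.sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return w, mu, sd, resp

# Simulated (hypothetical) fitness changes: 30% of genes shifted, 70% unchanged.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(0.0, 0.5, 700)])
w, mu, sd, resp = em_gaussian_mixture(x)
labels = resp.argmax(axis=1)  # classify each gene by its most probable component
```

Each gene is assigned to the component with the highest responsibility. The hierarchical Bayesian alternative described in the abstract would instead place priors on the mixture parameters and sample them, which this sketch does not attempt.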

Award ID(s): 1716285

PAR ID: 10120127

Author(s) / Creator(s): Li, K; Chen, R; Lindsey, W; Best, AA; DeJongh, M; Henry, C; Tintle, NL

Date Published: 2019-01-01

Journal Name: Pacific Symposium on Biocomputing ...

Volume: 24

ISSN: 2335-6936

Page Range / eLocation ID: 172-183

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper:
The DOI is not currently available.
