Nonlinear state-space models are ubiquitous in modeling real-world dynamical systems. Sequential Monte Carlo (SMC) techniques, also known as particle methods, are a well-known class of parameter estimation methods for this general class of state-space models. Existing SMC-based techniques rely on excessive sampling of the parameter space, which makes their computation intractable for large systems or tall data sets. Bayesian optimization techniques have been used for fast inference in state-space models with intractable likelihoods. These techniques aim to find the maximum of the likelihood function by sequential sampling of the parameter space through a single SMC approximator. Various SMC approximators with different fidelities and computational costs are often available for sample-based likelihood approximation. In this paper, we propose a multi-fidelity Bayesian optimization algorithm for the inference of general nonlinear state-space models (MFBO-SSM), which enables simultaneous sequential selection of parameters and approximators. The accuracy and speed of the algorithm are demonstrated by numerical experiments using synthetic gene expression data from a gene regulatory network model and real data from the VIX stock price index.
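To make the notion of an "SMC approximator" with adjustable fidelity concrete, the following is a minimal sketch of a bootstrap particle filter that estimates the log-likelihood of a parameter value. It is not the MFBO-SSM algorithm. The toy model (x_t = theta*sin(x_{t-1}) + v_t, y_t = x_t + w_t), the noise levels, and the function names `simulate` and `pf_loglik` are all assumptions made for illustration. The number of particles plays the role of the fidelity: more particles give a less noisy estimate at a higher cost.

```python
import math
import random


def simulate(theta, T, rng, q=0.5, r=0.5):
    """Simulate the assumed toy model: x_t = theta*sin(x_{t-1}) + v_t, y_t = x_t + w_t."""
    x, ys = 0.0, []
    for _ in range(T):
        x = theta * math.sin(x) + rng.gauss(0.0, q)   # state transition
        ys.append(x + rng.gauss(0.0, r))              # noisy observation
    return ys


def pf_loglik(theta, ys, n_particles, rng, q=0.5, r=0.5):
    """Bootstrap particle filter estimate of log p(y_1:T | theta)."""
    particles = [0.0] * n_particles
    loglik = 0.0
    log_norm = math.log(r * math.sqrt(2.0 * math.pi))
    for y in ys:
        # Propagate every particle through the state equation.
        particles = [theta * math.sin(x) + rng.gauss(0.0, q) for x in particles]
        # Log-weights from the Gaussian observation density N(y; x, r^2).
        logw = [-0.5 * ((y - x) / r) ** 2 - log_norm for x in particles]
        # Subtract the maximum before exponentiating for numerical stability.
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        # The average unnormalized weight estimates p(y_t | y_1:t-1, theta).
        loglik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik


rng = random.Random(0)
ys = simulate(theta=2.0, T=50, rng=rng)
cheap = pf_loglik(2.0, ys, n_particles=50, rng=rng)     # low fidelity, fast
costly = pf_loglik(2.0, ys, n_particles=2000, rng=rng)  # high fidelity, slow
```

A likelihood-maximization routine would call `pf_loglik` repeatedly at different values of `theta`. A multi-fidelity scheme of the kind the abstract describes would additionally choose `n_particles` (or, more generally, which approximator to use) at each step.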
PALLAS: Penalized mAximum LikeLihood and pArticle Swarms for Inference of Gene Regulatory Networks from Time Series Data
We present PALLAS, a practical method for gene regulatory network (GRN) inference from time series data, which employs penalized maximum likelihood and particle swarms for optimization. PALLAS is based on the Partially-Observable Boolean Dynamical System (POBDS) model and thus does not require ad-hoc binarization of the data. The penalty in the likelihood is a LASSO regularization term, which encourages the resulting network to be sparse. PALLAS is able to scale to networks of realistic size without prior knowledge, by virtue of a novel continuous-discrete Fish School Search particle swarm algorithm for efficient simultaneous maximization of the penalized likelihood over the discrete space of networks and the continuous space of observational parameters. The performance of PALLAS is demonstrated by a comprehensive set of experiments using synthetic data generated from real and artificial networks, as well as real time series microarray and RNA-seq data, where it is compared to several other well-known methods for gene regulatory network inference. The results show that PALLAS can infer GRNs more accurately than other methods, while being capable of working directly on gene expression data, without the need for ad-hoc binarization. PALLAS is a fully-fledged program, written in Python, and available on GitHub (https://github.com/yukuntan92/PALLAS).
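As a rough illustration of the penalized objective described above, one generic way to write a LASSO-penalized maximum-likelihood problem over a discrete and a continuous parameter set is shown below. The notation is assumed for this sketch and is not taken from the paper: a is a vector of network (edge) parameters ranging over a discrete set A, theta collects the continuous observational parameters, Y_{1:T} is the observed time series, and lambda >= 0 is the regularization weight.

```latex
(\hat{a}, \hat{\theta})
  \;=\; \arg\max_{a \in \mathcal{A},\; \theta \in \Theta}
  \Big[ \log p_{a,\theta}(Y_{1:T}) \;-\; \lambda \, \lVert a \rVert_{1} \Big]
```

Larger values of lambda push more entries of a to zero, which corresponds to a sparser inferred network.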
- Award ID(s): 1718924
- PAR ID: 10201432
- Date Published:
- Journal Name: IEEE/ACM Transactions on Computational Biology and Bioinformatics
- ISSN: 1545-5963
- Page Range / eLocation ID: 1 to 1
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- We propose a new algorithm for inference of protein-protein interaction (PPI) networks from noisy time series of Liquid-Chromatography Mass-Spectrometry (LC-MS) proteomic expression data based on Approximate Bayesian Computation - Sequential Monte Carlo sampling (ABC-SMC). The algorithm is an extension of our previous framework PALLAS. The proposed algorithm can be easily modified to handle other complex models of expression data, such as LC-MS data, for which the likelihood function is intractable. Results based on synthetic time series of cytokine LC-MS measurements corresponding to a prototype immunomic network demonstrate that our algorithm is capable of inferring the network topology accurately.
- We propose a new algorithm for inference of gene regulatory networks (GRN) from noisy gene expression data based on maximum-likelihood (ML) adaptive filtering and the discrete fish school search algorithm (DFSS). The approach is based on the general partially-observed Boolean dynamical system (POBDS) model, and as such can be used for simultaneous state and parameter estimation for any Boolean dynamical system observed in noise. The proposed DFSS-ML-BKF algorithm combines the ML adaptive Boolean Kalman Filter (ML-BKF) with DFSS, a version of the Fish School Search algorithm tailored for discrete parameter spaces. Results based on synthetic gene expression time-series data using the well-known p53-MDM2 negative-feedback loop GRN demonstrate that DFSS-ML-BKF can infer the network topology accurately and efficiently.
- Inferring gene regulatory networks (GRNs) from single-cell data is challenging due to heuristic limitations. Existing methods also lack estimates of uncertainty. Here we present Probabilistic Matrix Factorization for Gene Regulatory Network Inference (PMF-GRN). Using single-cell expression data, PMF-GRN infers latent factors capturing transcription factor activity and regulatory relationships. Using variational inference allows hyperparameter search for principled model selection and direct comparison to other generative models. We extensively test and benchmark our method using real single-cell datasets and synthetic data. We show that PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods, offering well-calibrated uncertainty estimates.
- Count time series are widely encountered in practice. As with continuous valued data, many count series have seasonal properties. This article uses a recent advance in stationary count time series to develop a general seasonal count time series modeling paradigm. The model constructed here permits any marginal distribution for the series and the most flexible autocorrelations possible, including those with negative dependence. Likelihood methods of inference are explored. The article first develops the modeling methods, which entail a discrete transformation of a Gaussian process having seasonal dynamics. Properties of this model class are then established and particle filtering likelihood methods of parameter estimation are developed. A simulation study demonstrating the efficacy of the methods is presented and an application to the number of rainy days in successive weeks in Seattle, Washington is given.