Abstract MotivationCell function is regulated by gene regulatory networks (GRNs) defined by protein-mediated interaction between constituent genes. Despite advances in experimental techniques, we can still measure only a fraction of the processes that govern GRN dynamics. To infer the properties of GRNs using partial observation, unobserved sequential processes can be replaced with distributed time delays, yielding non-Markovian models. Inference methods based on the resulting model suffer from the curse of dimensionality. ResultsWe develop a simulation-based Bayesian MCMC method employing an approximate likelihood for the efficient and accurate inference of GRN parameters when only some of their products are observed. We illustrate our approach using a two-step activation model: an activation signal leads to the accumulation of an unobserved regulatory protein, which triggers the expression of observed fluorescent proteins. With prior information about observed fluorescent protein synthesis, our method successfully infers the dynamics of the unobserved regulatory protein. We can estimate the delay and kinetic parameters characterizing target regulation including transcription, translation, and target searching of an unobserved protein from experimental measurements of the products of its target gene. Our method is scalable and can be used to analyze non-Markovian models with hidden components. Availability and implementationOur code is implemented in R and is freely available with a simple example data at https://github.com/Mathbiomed/SimMCMC.
more »
« less
Implementation of an Efficient Bayesian Search for Gravitational-wave Bursts with Memory in Pulsar Timing Array Data
Abstract The standard Bayesian technique for searching pulsar timing data for gravitational-wave bursts with memory (BWMs) using Markov Chain Monte Carlo (MCMC) sampling is very computationally expensive to perform. In this paper, we explain the implementation of an efficient Bayesian technique for searching for BWMs. This technique makes use of the fact that the signal model for Earth-term BWMs (BWMs passing over the Earth) is fully factorizable. We estimate that this implementation reduces the computational complexity by a factor of 100. We also demonstrate that this technique gives upper limits consistent with published results using the standard Bayesian technique, and may be used to perform all of the same analyses of BWMs that standard MCMC techniques can perform.
more »
« less
- Award ID(s):
- 2020265
- PAR ID:
- 10501188
- Publisher / Repository:
- Astrophysical Journal
- Date Published:
- Journal Name:
- The Astrophysical Journal
- Volume:
- 951
- Issue:
- 2
- ISSN:
- 0004-637X
- Page Range / eLocation ID:
- 121
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract We consider the simulation of Bayesian statistical inverse problems governed by large-scale linear and nonlinear partial differential equations (PDEs). Markov chain Monte Carlo (MCMC) algorithms are standard techniques to solve such problems. However, MCMC techniques are computationally challenging as they require a prohibitive number of forward PDE solves. The goal of this paper is to introduce a fractional deep neural network (fDNN) based approach for the forward solves within an MCMC routine. Moreover, we discuss some approximation error estimates. We illustrate the efficiency of fDNN on inverse problems governed by nonlinear elliptic PDEs and the unsteady Navier–Stokes equations. In the former case, two examples are discussed, respectively depending on two and 100 parameters, with significant observed savings. The unsteady Navier–Stokes example illustrates that fDNN can outperform existing DNNs, doing a better job of capturing essential features such as vortex shedding.more » « less
-
Koyejo, S.; Mohamed, S.; Agarwal, A.; Belgrave, D.; Cho, K.; Oh, A. (Ed.)Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typically intractable. For Bayesian analysis, this results in a posterior distribution that is doubly intractable, rendering traditional MCMC techniques inapplicable. We propose an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms. Our MCMC algorithm augments the model parameters with the unobserved confidential data, and alternately updates each one conditional on the other. For the potentially challenging step of updating the confidential data, we propose a generic approach that exploits the privacy guarantee of the mechanism to ensure efficiency. We give results on the computational complexity, acceptance rate, and mixing properties of our MCMC. We illustrate the efficacy and applicability of our methods on a naive-Bayes log-linear model and on a linear regression model.more » « less
-
Pupko, Tal (Ed.)Abstract Nearly all current Bayesian phylogenetic applications rely on Markov chain Monte Carlo (MCMC) methods to approximate the posterior distribution for trees and other parameters of the model. These approximations are only reliable if Markov chains adequately converge and sample from the joint posterior distribution. Although several studies of phylogenetic MCMC convergence exist, these have focused on simulated data sets or select empirical examples. Therefore, much that is considered common knowledge about MCMC in empirical systems derives from a relatively small family of analyses under ideal conditions. To address this, we present an overview of commonly applied phylogenetic MCMC diagnostics and an assessment of patterns of these diagnostics across more than 18,000 empirical analyses. Many analyses appeared to perform well and failures in convergence were most likely to be detected using the average standard deviation of split frequencies, a diagnostic that compares topologies among independent chains. Different diagnostics yielded different information about failed convergence, demonstrating that multiple diagnostics must be employed to reliably detect problems. The number of taxa and average branch lengths in analyses have clear impacts on MCMC performance, with more taxa and shorter branches leading to more difficult convergence. We show that the usage of models that include both Γ-distributed among-site rate variation and a proportion of invariable sites is not broadly problematic for MCMC convergence but is also unnecessary. Changes to heating and the usage of model-averaged substitution models can both offer improved convergence in some cases, but neither are a panacea.more » « less
-
Koyejo, Sanmi; Mohamed, Shakir (Ed.)Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typically intractable. For Bayesian analysis, this results in a posterior distribution that is doubly intractable, rendering traditional MCMC techniques inapplicable. We propose an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms. Our MCMC algorithm augments the model parameters with the unobserved confidential data, and alternately updates each one conditional on the other. For the potentially challenging step of updating the confidential data, we propose a generic approach that exploits the privacy guarantee of the mechanism to ensure efficiency. In particular, we give results on the computational complexity, acceptance rate, and mixing properties of our MCMC. We illustrate the efficacy and applicability of our methods on a na\"ive-Bayes log-linear model as well as on a linear regression model.more » « less