Title: The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions
Discovering interaction effects on a response of interest is a fundamental problem faced in biology, medicine, economics, and many other scientific disciplines. In theory, Bayesian methods for discovering pairwise interactions enjoy many benefits such as coherent uncertainty quantification, the ability to incorporate background knowledge, and desirable shrinkage properties. In practice, however, Bayesian methods are often computationally intractable for even moderate-dimensional problems. Our key insight is that many hierarchical models of practical interest admit a Gaussian process representation such that rather than maintaining a posterior over all O(p^2) interactions, we need only maintain a vector of O(p) kernel hyper-parameters. This implicit representation allows us to run Markov chain Monte Carlo (MCMC) over model hyper-parameters in time and memory linear in p per iteration. We focus on sparsity-inducing models and show on datasets with a variety of covariate behaviors that our method: (1) reduces runtime by orders of magnitude over naive applications of MCMC, (2) provides lower Type I and Type II error relative to state-of-the-art LASSO-based approaches, and (3) offers improved computational scaling in high dimensions relative to existing Bayesian and LASSO-based approaches.
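The idea that O(p) kernel hyper-parameters can stand in for O(p^2) interaction terms can be illustrated with a small numerical check. The sketch below is not the paper's model: it assumes a generic degree-2 polynomial kernel with one hypothetical scale `kappa[j]` per covariate, and all function names are illustrative. It shows only that evaluating this kernel gives the same Gram matrix as an explicit feature map containing every main effect, square, and pairwise product.

```python
import itertools
import numpy as np

def implicit_kernel(X, Z, kappa):
    """Degree-2 polynomial kernel with per-covariate scales (assumed form).

    Costs O(p) per pair of points; no interaction terms are materialized.
    """
    S = (X * kappa**2) @ Z.T          # sum_j kappa_j^2 * x_j * z_j
    return (1.0 + S) ** 2

def explicit_features(X, kappa):
    """Explicit feature map with all O(p^2) pairwise products."""
    Xs = X * kappa
    n, p = Xs.shape
    cols = [np.ones((n, 1)), np.sqrt(2.0) * Xs, Xs**2]
    pairs = [
        np.sqrt(2.0) * Xs[:, i:i + 1] * Xs[:, j:j + 1]
        for i, j in itertools.combinations(range(p), 2)
    ]
    return np.hstack(cols + pairs)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))               # n = 5 points, p = 4 covariates
kappa = rng.uniform(0.1, 1.0, size=4)     # one scale per covariate

K_implicit = implicit_kernel(X, X, kappa)
Phi = explicit_features(X, kappa)         # 1 + p + p + p(p-1)/2 columns
K_explicit = Phi @ Phi.T

print(np.allclose(K_implicit, K_explicit))
```

With p = 4 the explicit map already has 15 columns, while the kernel is controlled by 4 scales; the gap widens quadratically as p grows, which is the scaling argument the abstract makes.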
Journal Name:
Proceedings of Machine Learning Research
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation

    Advances in experimental and imaging techniques have allowed for unprecedented insights into the dynamical processes within individual cells. However, many facets of intracellular dynamics remain hidden, or can be measured only indirectly. This makes it challenging to reconstruct the regulatory networks that govern the biochemical processes underlying various cell functions. Current estimation techniques for inferring reaction rates frequently rely on marginalization over unobserved processes and states. Even in simple systems this approach can be computationally challenging, and can lead to large uncertainties and lack of robustness in parameter estimates. Therefore we will require alternative approaches to efficiently uncover the interactions in complex biochemical networks.


    We propose a Bayesian inference framework based on replacing uninteresting or unobserved reactions with time delays. Although the resulting models are non-Markovian, recent results on stochastic systems with random delays allow us to rigorously obtain expressions for the likelihoods of model parameters. In turn, this allows us to extend MCMC methods to efficiently estimate reaction rates, and delay distribution parameters, from single-cell assays. We illustrate the advantages, and potential pitfalls, of the approach using a birth–death model with both synthetic and experimental data, and show that we can robustly infer model parameters using a relatively small number of measurements. We demonstrate how to do so even when only the relative molecule count within the cell is measured, as in the case of fluorescence microscopy.

    Availability and implementation

    Accompanying code in R is available at

    Supplementary information

    Supplementary data are available at Bioinformatics online.

  2. Dielectric elastomers are employed for a wide variety of adaptive structures. Many of these soft elastomers exhibit significant rate-dependencies in their response. Accurately quantifying this viscoelastic behavior is non-trivial and in many cases a nonlinear modeling framework is required. Fractional-order operators have been applied to modeling viscoelastic behavior for many years, and recent research has shown fractional-order methods to be effective for nonlinear frameworks. This implementation can become computationally expensive to achieve an accurate approximation of the fractional-order derivative. Accurate estimation of the elastomer’s viscoelastic behavior to quantify parameter uncertainty motivates the use of Markov Chain Monte Carlo (MCMC) methods. Since MCMC is a sampling based method, requiring many model evaluations, efficient estimation of the fractional derivative operator is crucial. In this paper, we demonstrate the effectiveness of using quadrature techniques to approximate the Riemann–Liouville definition for fractional derivatives in the context of estimating the uncertainty of a nonlinear viscoelastic model. We also demonstrate the use of parameter subset selection techniques to isolate parameters that are identifiable in the sense that they are uniquely determined by measured data. For those identifiable parameters, we employ Bayesian inference to compute posterior distributions for parameters. Finally, we propagate parameter uncertainties through the models to compute prediction intervals for quantities of interest.
  3. Many problems in marketing and economics require firms to make targeted consumer-specific decisions, but current estimation methods are not designed to scale to the size of modern data sets. In this article, the authors propose a new algorithm to close that gap. They develop a distributed Markov chain Monte Carlo (MCMC) algorithm for estimating Bayesian hierarchical models when the number of consumers is very large and the objects of interest are the consumer-level parameters. The two-stage and embarrassingly parallel algorithm is asymptotically unbiased in the number of consumers, retains the flexibility of a standard MCMC algorithm, and is easy to implement. The authors show that the distributed MCMC algorithm is faster and more efficient than a single-machine algorithm by at least an order of magnitude. They illustrate the approach with simulations with up to 100 million consumers, and with data on 1,088,310 donors to a charitable organization. The algorithm enables an increase of between $1.6 million and $4.6 million in additional donations when applied to a large modern-size data set compared with a typical-size data set.

  4. Graphs have been commonly used to represent complex data structures. In models dealing with graph-structured data, multivariate parameters may not only exhibit sparse patterns but have structured sparsity and smoothness in the sense that both zero and non-zero parameters tend to cluster together. We propose a new prior for high-dimensional parameters with graphical relations, referred to as the Tree-based Low-rank Horseshoe (T-LoHo) model, that generalizes the popular univariate Bayesian horseshoe shrinkage prior to the multivariate setting to detect structured sparsity and smoothness simultaneously. The T-LoHo prior can be embedded in many high-dimensional hierarchical models. To illustrate its utility, we apply it to regularize a Bayesian high-dimensional regression problem where the regression coefficients are linked by a graph, so that the resulting clusters have flexible shapes and satisfy the cluster contiguity constraint with respect to the graph. We design an efficient Markov chain Monte Carlo algorithm that delivers full Bayesian inference with uncertainty measures for model parameters such as the number of clusters. We offer theoretical investigations of the clustering effects and posterior concentration results. Finally, we illustrate the performance of the model with simulation studies and a real data application for anomaly detection on a road network. The results indicate substantial improvements over other competing methods such as the sparse fused lasso.

  5. We present a new method of matching observations of Type-I (thermonuclear) X-ray bursts with models, comparing the predictions of a semi-analytic ignition model with X-ray observations of the accretion-powered millisecond pulsar SAX J1808.4–3658 in outburst. We used a Bayesian analysis approach to marginalize over the parameters of interest and determine parameters such as fuel composition, distance/anisotropy factors, neutron star mass, and neutron star radius. Our study includes a treatment of the system inclination effects, inferring that the rotation axis of the system is inclined $\left(69^{+4}_{-2}\right)^\circ$ from the observer's line of sight, assuming a flat disc model. This method can be applied to any accreting source that exhibits Type-I X-ray bursts. We find a hydrogen mass fraction of $0.57^{+0.13}_{-0.14}$ and CNO metallicity of $0.013^{+0.006}_{-0.004}$ for the accreted fuel is required by the model to match the observed burst energies, for a distance to the source of $3.3^{+0.3}_{-0.2}\, \mathrm{kpc}$. We infer a neutron star mass of $1.5^{+0.6}_{-0.3}\, \mathrm{M}_{\odot }$ and radius of $11.8^{+1.3}_{-0.9}\, \mathrm{km}$ for a surface gravity of $1.9^{+0.7}_{-0.4}\times 10^{14}\, \mathrm{cm}\, \mathrm{s}^{-2}$ for SAX J1808.4–3658.
