Spike-and-Slab Posterior Sampling in High Dimensions

Kumar, S; Sarkar, P; Tian, K; Zhu, Y

Posterior sampling with the spike-and-slab prior [MB88], a popular multimodal distribution used to model uncertainty in variable selection, is considered the theoretical gold standard method for Bayesian sparse linear regression [CPS09, Roc18]. However, designing provable algorithms for performing this sampling task is notoriously challenging. Existing posterior samplers for Bayesian sparse variable selection tasks either require strong assumptions about the signal-to-noise ratio (SNR) [YWJ16], only work when the measurement count grows at least linearly in the dimension [MW24], or rely on heuristic approximations to the posterior. We give the first provable algorithms for spike-and-slab posterior sampling that apply for any SNR, and use a measurement count sublinear in the problem dimension. Concretely, assume we are given a measurement matrix X∈ℝn×d and noisy observations y=Xθ⋆+ξ of a signal θ⋆ drawn from a spike-and-slab prior π with a Gaussian diffuse density and expected sparsity k, where ξ∼(𝟘n,σ2In). We give a polynomial-time high-accuracy sampler for the posterior π(⋅∣X,y), for any SNR σ−1 > 0, as long as n≥k3⋅polylog(d) and X is drawn from a matrix ensemble satisfying the restricted isometry property. We further give a sampler that runs in near-linear time ≈nd in the same setting, as long as n≥k5⋅polylog(d). To demonstrate the flexibility of our framework, we extend our result to spike-and-slab posterior sampling with Laplace diffuse densities, achieving similar guarantees when σ=O(1k) is bounded.

More Like this