skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on January 1, 2026

Title: Exact sampling of spanning trees via fast-forwarded random walks
Summary Tree graphs are used routinely in statistics. When estimating a Bayesian model with a tree component, sampling the posterior remains a core difficulty. Existing Markov chain Monte Carlo methods tend to rely on local moves, often leading to poor mixing. A promising approach is to instead directly sample spanning trees on an auxiliary graph. Current spanning tree samplers, such as the celebrated Aldous–Broder algorithm, rely predominantly on simulating random walks that are required to visit all the nodes of the graph. Such algorithms are prone to getting stuck in certain subgraphs. We formalize this phenomenon using the bottlenecks in the random walk’s transition probability matrix. We then propose a novel fast-forwarded cover algorithm that can break free from bottlenecks. The core idea is a marginalization argument that leads to a closed-form expression that allows for fast-forwarding to the event of visiting a new node. Unlike many existing approximation algorithms, our algorithm yields exact samples. We demonstrate the enhanced efficiency of the fast-forwarded cover algorithm, and illustrate its application in fitting a Bayesian dendrogram model on a Massachusetts crime and community dataset.  more » « less
Award ID(s):
2319551
PAR ID:
10630766
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biometrika
Volume:
112
Issue:
2
ISSN:
1464-3510
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Nonparametric regression on complex domains has been a challenging task as most existing methods, such as ensemble models based on binary decision trees, are not designed to account for intrinsic geometries and domain boundaries. This article proposes a Bayesian additive regression spanning trees (BAST) model for nonparametric regression on manifolds, with an emphasis on complex constrained domains or irregularly shaped spaces embedded in Euclidean spaces. Our model is built upon a random spanning tree manifold partition model as each weak learner, which is capable of capturing any irregularly shaped spatially contiguous partitions while respecting intrinsic geometries and domain boundary constraints. Utilizing many nice properties of spanning tree structures, we design an efficient Bayesian inference algorithm. Equipped with a soft prediction scheme, BAST is demonstrated to significantly outperform other competing methods in simulation experiments and in an application to the chlorophyll data in Aral Sea, due to its strong local adaptivity to different levels of smoothness. 
    more » « less
  2. We show variants of spectral sparsification routines can preserve the total spanning tree counts of graphs, which by Kirchhoff's matrix-tree theorem, is equivalent to determinant of a graph Laplacian minor, or equivalently, of any SDDM matrix. Our analyses utilizes this combinatorial connection to bridge between statistical leverage scores / effective resistances and the analysis of random graphs by [Janson, Combinatorics, Probability and Computing `94]. This leads to a routine that in quadratic time, sparsifies a graph down to about $$n^{1.5}$$ edges in ways that preserve both the determinant and the distribution of spanning trees (provided the sparsified graph is viewed as a random object). Extending this algorithm to work with Schur complements and approximate Choleksy factorizations leads to algorithms for counting and sampling spanning trees which are nearly optimal for dense graphs. We give an algorithm that computes a $$(1 \pm \delta)$$ approximation to the determinant of any SDDM matrix with constant probability in about $$n^2 \delta^{-2}$$ time. This is the first routine for graphs that outperforms general-purpose routines for computing determinants of arbitrary matrices. We also give an algorithm that generates in about $$n^2 \delta^{-2}$$ time a spanning tree of a weighted undirected graph from a distribution with total variation distance of $$\delta$$ from the $$w$$-uniform distribution . 
    more » « less
  3. We study smoothed analysis of distributed graph algorithms, focusing on the fundamental minimum spanning tree (MST) problem. With the goal of studying the time complexity of distributed MST as a function of the "perturbation" of the input graph, we posit a smoothing model that is parameterized by a smoothing parameter 0 ≤ ϵ(n) ≤ 1 which controls the amount of random edges that can be added to an input graph G per round. Informally, ϵ(n) is the probability (typically a small function of n, e.g., n--¼) that a random edge can be added to a node per round. The added random edges, once they are added, can be used (only) for communication. We show upper and lower bounds on the time complexity of distributed MST in the above smoothing model. We present a distributed algorithm that, with high probability, 1 computes an MST and runs in Õ(min{1/√ϵ(n)2O(√log n), D+ √n}) rounds2 where ϵ is the smoothing parameter, D is the network diameter and n is the network size. To complement our upper bound, we also show a lower bound of Ω(min{1/√ϵ(n), D + √n}). We note that the upper and lower bounds essentially match except for a multiplicative 2O(√log n) polylog(n) factor. Our work can be considered as a first step in understanding the smoothed complexity of distributed graph algorithms. 
    more » « less
  4. Czumaj, Arturr; Dawar, Anuj; Merelli, Emanuela (Ed.)
    Robust optimization is a widely studied area in operations research, where the algorithm takes as input a range of values and outputs a single solution that performs well for the entire range. Specifically, a robust algorithm aims to minimize regret, defined as the maximum difference between the solution’s cost and that of an optimal solution in hindsight once the input has been realized. For graph problems in P, such as shortest path and minimum spanning tree, robust polynomial-time algorithms that obtain a constant approximation on regret are known. In this paper, we study robust algorithms for minimizing regret in NP-hard graph optimization problems, and give constant approximations on regret for the classical traveling salesman and Steiner tree problems. 
    more » « less
  5. To study the sensing mechanism of bat's biosonar system, we propose a fast simulation algorithm to generate natural-looking trees and forest---the primary living habitat of bats. We adopt 3D Lindenmayer system to create the fractal geometry of the trees, and add additional parameters, both globally and locally, to enable random variations of the tree structures. Random forest is then formed by placing simulated trees at random locations of a field according to a spatial point process. By employing a single algorithmic model with different numeric parameters, we can rapidly simulate 3D virtual environments with a wide variety of trees, producing detailed geometry of the foliage such as the leaf locations, sizes, and orientations. Written in C++ and visualized with openGL, our algorithm is fast to implement, easily parallable, and more adaptive to real-time visualization compared with existing alternative approaches. Our simulated environment can be used for general purposes such as studying new sensors or training remote sensing algorithms. 
    more » « less