Title: The hit-and-run version of top-to-random
Abstract We study an example of a hit-and-run random walk on the symmetric group $\mathbf S_n$. Our starting point is the well-understood top-to-random shuffle. In the hit-and-run version, at each single step, after picking the point of insertion j uniformly at random in $\{1,\ldots,n\}$, the top card is inserted in the jth position k times in a row, where k is uniform in $\{0,1,\ldots,j-1\}$. The question is, does this accelerate mixing significantly or not? We show that, in $L^2$ and sup-norm, this accelerates mixing at most by a constant factor (independent of n). Analyzing this problem in total variation is an interesting open question. We show that, in general, hit-and-run random walks on finite groups have non-negative spectrum.
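A single step of the walk described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the deck is a list whose index 0 is the top card, and the function names `insert_top_at`, `hit_and_run_step`, and `rotate_top` are hypothetical. The helper `rotate_top` records the observation that inserting the top card at position j is a j-cycle on the top j cards, so repeating it k times rotates those j cards by k places.

```python
import random


def insert_top_at(deck, j):
    """Move the top card (index 0) to position j (1-indexed); return a new list."""
    return deck[1:j] + deck[:1] + deck[j:]


def hit_and_run_step(deck, rng):
    """One hit-and-run step: pick j in {1..n}, then k in {0..j-1},
    and insert the top card at position j exactly k times in a row."""
    n = len(deck)
    j = rng.randint(1, n)   # point of insertion, uniform in {1, ..., n}
    k = rng.randrange(j)    # number of repetitions, uniform in {0, ..., j-1}
    for _ in range(k):
        deck = insert_top_at(deck, j)
    return deck


def rotate_top(deck, j, k):
    """Closed form of k repeated insertions at position j:
    rotate the top j cards left by k places, leave the rest untouched."""
    k %= j
    return deck[k:j] + deck[:k] + deck[j:]


if __name__ == "__main__":
    rng = random.Random(0)      # seeded only to make the run repeatable
    deck = list(range(1, 11))   # cards 1..10, card 1 on top
    for _ in range(5):
        deck = hit_and_run_step(deck, rng)
    print(deck)
```

Because k ranges over $\{0,1,\ldots,j-1\}$ and the insertion at position j has order j, each step picks a uniformly random element of the cyclic subgroup generated by that insertion, which is the sense in which the walk "runs" along the chosen direction.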
Award ID(s):
1645643
NSF-PAR ID:
10347763
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Journal of Applied Probability
ISSN:
0021-9002
Page Range / eLocation ID:
1 to 20
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Buchin, Kevin; Colin de Verdière (Ed.)
    The Gibbs Sampler is a general method for sampling high-dimensional distributions, dating back to 1971. In each step of the Gibbs Sampler, we pick a random coordinate and re-sample that coordinate from the distribution induced by fixing all the other coordinates. While it has become widely used over the past half-century, guarantees of efficient convergence have been elusive. We show that for a convex body K in ℝⁿ with diameter D, the mixing time of the Coordinate Hit-and-Run (CHAR) algorithm on K is polynomial in n and D. We also give a lower bound on the mixing rate of CHAR, showing that it is strictly worse than hit-and-run and the ball walk in the worst case. 
  2. Abstract

    Sequence mappability is an important task in genome resequencing. In the $(k, m)$-mappability problem, for a given sequence $T$ of length $n$, the goal is to compute a table whose $i$th entry is the number of indices $j \ne i$ such that the length-$m$ substrings of $T$ starting at positions $i$ and $j$ have at most $k$ mismatches. Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of $k=1$. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that, for $k=O(1)$, works in $O(n)$ space and, with high probability, in $O(n \cdot \min\{m^k, \log^k n\})$ time. Our algorithm requires a careful adaptation of the $k$-errata trees of Cole et al. [STOC 2004] to avoid multiple counting of pairs of substrings. Our technique can also be applied to solve the all-pairs Hamming distance problem introduced by Crochemore et al. [WABI 2017]. We further develop $O(n^2)$-time algorithms to compute all $(k, m)$-mappability tables for a fixed $m$ and all $k \in \{0,\ldots,m\}$ or a fixed $k$ and all $m \in \{k,\ldots,n\}$. Finally, we show that, for $k, m = \Theta(\log n)$, the $(k, m)$-mappability problem cannot be solved in strongly subquadratic time unless the Strong Exponential Time Hypothesis fails. This is an improved and extended version of a paper presented at SPIRE 2018.

     
  3. Abstract We study the extent to which divisors of a typical integer $n$ are concentrated. In particular, defining $\Delta(n) := \max_t \#\{d \mid n, \log d \in [t,t+1]\}$, we show that $\Delta(n) \geqslant (\log\log n)^{0.35332277\ldots}$ for almost all $n$, a bound we believe to be sharp. This disproves a conjecture of Maier and Tenenbaum. We also prove analogs for the concentration of divisors of a random permutation and of a random polynomial over a finite field. Most of the paper is devoted to a study of the following much more combinatorial problem of independent interest. Pick a random set $\mathbf{A} \subset \mathbb{N}$ by selecting $i$ to lie in $\mathbf{A}$ with probability $1/i$. What is the supremum of all exponents $\beta_k$ such that, almost surely as $D \rightarrow \infty$, some integer is the sum of elements of $\mathbf{A} \cap [D^{\beta_k}, D]$ in $k$ different ways? We characterise $\beta_k$ as the solution to a certain optimisation problem over measures on the discrete cube $\{0,1\}^k$, and obtain lower bounds for $\beta_k$ which we believe to be asymptotically sharp.
  4. We develop a framework for sampling from discrete distributions $\mu$ on the hypercube $\{\pm 1\}^n$ by sampling from continuous distributions supported on $\mathbb{R}^n$ obtained by convolution with spherical Gaussians. We show that for well-studied families of discrete distributions $\mu$, convolving $\mu$ with Gaussians yields well-conditioned log-concave distributions, as long as the variance of the Gaussian is above an $O(1)$ threshold. We then reduce the task of sampling from $\mu$ to sampling from Gaussian-convolved distributions. Our reduction is based on a stochastic process widely studied under different names: backward diffusion in diffusion models, and stochastic localization. We discretize this process in a novel way that allows for high accuracy and parallelism. As our main application, we resolve open questions Anari, Hu, Saberi, and Schild raised on the parallel sampling of distributions that admit parallel counting. We show that determinantal point processes can be sampled via RNC algorithms, that is in time $\log(n)^{O(1)}$ using $n^{O(1)}$ processors. For a wider class of distributions, we show our framework yields Quasi-RNC sampling, i.e., $\log(n)^{O(1)}$ time using $n^{O(\log n)}$ processors. This wider class includes non-symmetric determinantal point processes and random Eulerian tours in digraphs, the latter nearly resolving another open question raised by prior work. Of potentially independent interest, we introduce and study a notion of smoothness for discrete distributions that we call transport stability, which we use to control the propagation of error in our framework. Additionally, we connect transport stability to constructions of optimally mixing local random walks and concentration inequalities. 
  5. Abstract

    Motivated by the Rudnick-Sarnak theorem, we study the limiting distribution of smoothed local correlations of the form $$\sum_{j_1, j_2, \ldots, j_n} f(N(\theta_{j_2}-\theta_{j_1}), N(\theta_{j_3}-\theta_{j_1}), \ldots, N(\theta_{j_n}-\theta_{j_1}))$$ for the Circular Unitary Ensemble of random matrices for sufficiently smooth test functions.

     