skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Sharp Noisy Binary Search with Monotonic Probabilities
We revisit the noisy binary search model of [Karp and Kleinberg, 2007], in which we have n coins with unknown probabilities p_i that we can flip. The coins are sorted by increasing p_i, and we would like to find where the probability crosses (to within ε) of a target value τ. This generalized the fixed-noise model of [Burnashev and Zigangirov, 1974], in which p_i = 1/2 ± ε, to a setting where coins near the target may be indistinguishable from it. It was shown in [Karp and Kleinberg, 2007] that Θ(1/ε² log n) samples are necessary and sufficient for this task. We produce a practical algorithm by solving two theoretical challenges: high-probability behavior and sharp constants. We give an algorithm that succeeds with probability 1-δ from 1/C_{τ, ε} ⋅ (log₂ n + O(log^{2/3} n log^{1/3} 1/(δ) + log 1/(δ))) samples, where C_{τ, ε} is the optimal such constant achievable. For δ > n^{-o(1)} this is within 1 + o(1) of optimal, and for δ ≪ 1 it is the first bound within constant factors of optimal.  more » « less
Award ID(s):
2008868 2231095 2505865
PAR ID:
10548538
Author(s) / Creator(s):
;
Editor(s):
Bringmann, Karl; Grohe, Martin; Puppis, Gabriele; Svensson, Ola
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
297
ISSN:
1868-8969
ISBN:
978-3-95977-322-5
Page Range / eLocation ID:
297-297
Subject(s) / Keyword(s):
fine-grained algorithms randomized/probabilistic methods sublinear/streaming algorithms noisy binary search Theory of computation → Randomness, geometry and discrete structures Theory of computation → Streaming, sublinear and near linear time algorithms Theory of computation → Lower bounds and information complexity
Format(s):
Medium: X Size: 19 pages; 953042 bytes Other: application/pdf
Size(s):
19 pages 953042 bytes
Right(s):
Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess
Sponsoring Org:
National Science Foundation
More Like this
  1. Censor-Hillel, Keren; Grandoni, Fabrizio; Ouaknine, Joel; Puppis, Gabriele (Ed.)
    We study the problem of indexing a text T[1..n] to support pattern matching with wildcards. The input of a query is a pattern P[1..m] containing h ∈ [0, k] wildcard (a.k.a. don't care) characters and the output is the set of occurrences of P in T (i.e., starting positions of substrings of T that matches P), where k = o(log n) is fixed at index construction. A classic solution by Cole et al. [STOC 2004] provides an index with space complexity O(n ⋅ (clog n)^k/k!)) and query time O(m+2^h log log n+occ), where c > 1 is a constant, and occ denotes the number of occurrences of P in T. We introduce a new data structure that significantly reduces space usage for highly repetitive texts while maintaining efficient query processing. Its space (in words) and query time are as follows: O(δ log (n/δ)⋅ c^k (1+(log^k (δ log n))/k!)) and O((m+2^h +occ)log n)) The parameter δ, known as substring complexity, is a recently introduced measure of repetitiveness that serves as a unifying and lower-bounding metric for several popular measures, including the number of phrases in the LZ77 factorization (denoted by z) and the number of runs in the Burrows-Wheeler Transform (denoted by r). Moreover, O(δ log (n/δ)) represents the optimal space required to encode the data in terms of n and δ, helping us see how close our space is to the minimum required. In another trade-off, we match the query time of Cole et al.’s index using O(n+δ log (n/δ) ⋅ (clogδ)^{k+ε}/k!) space, where ε > 0 is an arbitrarily small constant. We also demonstrate how these techniques can be applied to a more general indexing problem, where the query pattern includes k-gaps (a gap can be interpreted as a contiguous sequence of wildcard characters). 
    more » « less
  2. Kumar, Amit; Ron-Zewi, Noga (Ed.)
    We study the Matrix Multiplication Verification Problem (MMV) where the goal is, given three n × n matrices A, B, and C as input, to decide whether AB = C. A classic randomized algorithm by Freivalds (MFCS, 1979) solves MMV in Õ(n²) time, and a longstanding challenge is to (partially) derandomize it while still running in faster than matrix multiplication time (i.e., in o(n^ω) time). To that end, we give two algorithms for MMV in the case where AB - C is sparse. Specifically, when AB - C has at most O(n^δ) non-zero entries for a constant 0 ≤ δ < 2, we give (1) a deterministic O(n^(ω-ε))-time algorithm for constant ε = ε(δ) > 0, and (2) a randomized Õ(n²)-time algorithm using δ/2 ⋅ log₂ n + O(1) random bits. The former algorithm is faster than the deterministic algorithm of Künnemann (ESA, 2018) when δ ≥ 1.056, and the latter algorithm uses fewer random bits than the algorithm of Kimbrel and Sinha (IPL, 1993), which runs in the same time and uses log₂ n + O(1) random bits (in turn fewer than Freivalds’s algorithm). Our algorithms are simple and use techniques from coding theory. Let H be a parity-check matrix of a Maximum Distance Separable (MDS) code, and let G = (I | G') be a generator matrix of a (possibly different) MDS code in systematic form. Our deterministic algorithm uses fast rectangular matrix multiplication to check whether HAB = HC and H(AB)^T = H(C^T), and our randomized algorithm samples a uniformly random row g' from G' and checks whether g'AB = g'C and g'(AB)^T = g'C^T. We additionally study the complexity of MMV. We first show that all algorithms in a natural class of deterministic linear algebraic algorithms for MMV (including ours) require Ω(n^ω) time. We also show a barrier to proving a super-quadratic running time lower bound for matrix multiplication (and hence MMV) under the Strong Exponential Time Hypothesis (SETH). Finally, we study relationships between natural variants and special cases of MMV (with respect to deterministic Õ(n²)-time reductions). 
    more » « less
  3. Abstract In this paper, we consider the problem of noiseless non-adaptive probabilistic group testing, in which the goal is high-probability recovery of the defective set. We show that in the case of $$n$$ items among which $$k$$ are defective, the smallest possible number of tests equals $$\min \{ C_{k,n} k \log n, n\}$$ up to lower-order asymptotic terms, where $$C_{k,n}$$ is a uniformly bounded constant (varying depending on the scaling of $$k$$ with respect to $$n$$) with a simple explicit expression. The algorithmic upper bound follows from a minor adaptation of an existing analysis of the Definite Defectives algorithm, and the algorithm-independent lower bound builds on existing works for the regimes $$k \le n^{1-\varOmega (1)}$$ and $$k = \varTheta (n)$$. In sufficiently sparse regimes (including $$k = o\big ( \frac{n}{\log n} \big )$$), our main result generalizes that of Coja-Oghlan et al. (2020) by avoiding the assumption $$k \le n^{1-\varOmega (1)}$$, whereas in sufficiently dense regimes (including $$k = \omega \big ( \frac{n}{\log n} \big )$$), our main result shows that individual testing is asymptotically optimal for any non-zero target success probability, thus strengthening an existing result of Aldridge (2019, IEEE Trans. Inf. Theory, 65, 2058–2061) in terms of both the error probability and the assumed scaling of $$k$$. 
    more » « less
  4. Meka, Raghu (Ed.)
    We introduce Online Balanced Allocation of Dynamic Components (OBADC), a problem motivated by the practical challenge of dynamic resource allocation for large-scale distributed applications. In OBADC, we need to allocate a dynamic set of at most k𝓁 vertices (representing processes) in 𝓁 > 0 clusters. We consider an over-provisioned setup in which each cluster can hold at most k(1+ε) vertices, for an arbitrary constant ε > 0. The communication requirements among the vertices are modeled by the notion of a dynamically changing component, which is a subset of vertices that need to be co-located in the same cluster. At each time t, a request r_t of one of the following types arrives: 1) insertion of a vertex v forming a singleton component v at unit cost. 2) merge of (u,v) requiring that the components containing u and v be merged and co-located thereafter. 3) deletion of an existing vertex v at zero cost. Before serving any request, an algorithm can migrate vertices from one cluster to another, at a unit migration cost per vertex. We seek an online algorithm to minimize the total migration cost incurred for an arbitrary request sequence σ = (r_t)_{t > 0}, while simultaneously minimizing the number of clusters utilized. We analyze competitiveness with respect to an optimal clairvoyant offline algorithm with identical (over-provisioned) capacity constraints. We give an O(log k)-competitive algorithm for OBADC, and a matching lower-bound. The number of clusters utilized by our algorithm is always within a (2+ε) factor of the minimum. Furthermore, in a resource augmented setting where the optimal offline algorithm is constrained to capacity k per cluster, our algorithm obtains O(log k) competitiveness and utilizes a number of clusters within (1+ε) factor of the minimum. We also consider OBADC in the context of machine-learned predictions, where for each newly inserted vertex v at time t: i) with probability η > 0, the set of vertices (that exist at time t) in the component of v is revealed and, ii) with probability 1-η, no information is revealed. For OBADC with predictions, we give a O(1)-consistent and O(min(log 1/(η), log k))-robust algorithm. 
    more » « less
  5. We present O(log logn)-round algorithms in the Massively Parallel Computation (MPC) model, with ˜O(n) memory per machine, that compute a maximal independent set, a 1 + ε approximation of maximum matching, and a 2 + ε approximation of minimum vertex cover, for any n-vertex graph and any constant ε > 0. These improve the state of the art as follows: • Our MIS algorithm leads to a simple O(log log Δ)-round MIS algorithm in the CONGESTED-CLIQUE model of distributed computing, which improves on the ˜O (plog Δ)-round algorithm of Ghaffari [PODC’17]. • OurO(log logn)-round (1+ε)-approximate maximum matching algorithm simplifies or improves on the following prior work: O(log2 logn)-round (1 + ε)-approximation algorithm of Czumaj et al. [STOC’18] and O(log logn)-round (1 + ε)- approximation algorithm of Assadi et al. [arXiv’17]. • Our O(log logn)-round (2+ε)-approximate minimum vertex cover algorithm improves on an O(log logn)-round O(1)- approximation of Assadi et al. [arXiv’17]. 
    more » « less