Title: Optimal Document Exchange and New Codes for Small Number of Insertions and Deletions
We give the first communication-optimal document exchange protocol. For any n and k …
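For context, document exchange asks Alice, holding an n-bit file F, to send a short summary to Bob, who holds a related file F' and must recover F exactly. The toy sketch below fixes ideas with a classic two-message, hash-per-block scheme for substitutions over equal-length files; it is emphatically not this paper's construction, which is a single-message, communication-optimal scheme that also handles insertions and deletions. Block size and function names are illustrative choices.

```python
import hashlib

BLOCK = 64  # toy block size in bytes

def block_hashes(f: bytes):
    """Split f into fixed-size blocks and hash each one (4-byte digests).

    The short digests are where the scheme's small failure
    probability comes from (hash collisions)."""
    return [hashlib.blake2b(f[i:i + BLOCK], digest_size=4).digest()
            for i in range(0, len(f), BLOCK)]

def exchange(f_alice: bytes, f_bob: bytes) -> bytes:
    """Toy protocol: Alice sends per-block hashes, Bob reports the
    mismatched block indices, Alice resends just those blocks."""
    hashes = block_hashes(f_alice)
    bad = [i for i, hb in enumerate(block_hashes(f_bob)) if hb != hashes[i]]
    patches = {i: f_alice[i * BLOCK:(i + 1) * BLOCK] for i in bad}
    out = bytearray(f_bob)
    for i, blk in patches.items():
        out[i * BLOCK:(i + 1) * BLOCK] = blk
    return bytes(out)

f = bytes(1000)
g = bytearray(f); g[500] ^= 0xFF      # one substituted byte
assert exchange(f, bytes(g)) == f
```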
Award ID(s): 1814603, 1750808, 1910588, 1618280
PAR ID: 10121529
Author(s) / Creator(s):
Date Published:
Journal Name: IEEE Symposium on Foundations of Computer Science
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
1. Existing proofs that deduce BPP = P from circuit lower bounds convert randomized algorithms into deterministic algorithms with a large polynomial slowdown. We convert randomized algorithms into deterministic ones with little slowdown. Specifically, assuming exponential lower bounds against randomized NP ∩ coNP circuits, formally known as randomized SVN circuits, we convert any randomized algorithm over inputs of length n running in time t ≥ n into a deterministic one running in time t^{2+α} for an arbitrarily small constant α > 0. Such a slowdown is nearly optimal for t close to n, since under standard complexity-theoretic assumptions, there are problems with an inherent quadratic derandomization slowdown. We also convert any randomized algorithm that errs rarely into a deterministic algorithm having a similar running time (with pre-processing). The latter derandomization result holds under weaker assumptions, namely exponential lower bounds against deterministic SVN circuits. Our results follow from a new, nearly optimal, explicit pseudorandom generator fooling circuits of size s with seed length (1+α) log s, under the assumption that there exists a function f ∈ E that requires randomized SVN circuits of size at least 2^{(1−α′)n}, where α′ = O(α). The construction uses, among other ideas, a new connection between pseudoentropy generators and locally list-recoverable codes.
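To see where the t^{2+α} bound comes from, here is the standard seed-enumeration calculation (a back-of-the-envelope sketch, not the paper's full argument): a PRG with seed length (1+α) log s fooling size-s circuits, instantiated with s ≈ t for a time-t algorithm, lets the deterministic simulation try every seed.

```latex
% Seed enumeration with seed length (1+\alpha)\log s, taking s \approx t:
\#\text{seeds} = 2^{(1+\alpha)\log t} = t^{1+\alpha},
\qquad
\text{deterministic time} \approx \#\text{seeds} \cdot t
  = t^{1+\alpha} \cdot t = t^{2+\alpha}.
```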
2. This paper introduces a new data-structural object that we call the tiny pointer. In many applications, traditional log n-bit pointers can be replaced with o(log n)-bit tiny pointers at the cost of only a constant-factor time overhead and a small probability of failure. We develop a comprehensive theory of tiny pointers, and give optimal constructions for both fixed-size tiny pointers (i.e., settings in which all of the tiny pointers must be the same size) and variable-size tiny pointers (i.e., settings in which the average tiny-pointer size must be small, but some tiny pointers can be larger). If a tiny pointer references an element in an array filled to load factor 1 − δ, then the optimal tiny-pointer size is Θ(log log log n + log δ⁻¹) bits in the fixed-size case, and Θ(log δ⁻¹) expected bits in the variable-size case. Our tiny-pointer constructions also require us to revisit several classic problems having to do with balls and bins; these results may be of independent interest. Using tiny pointers, we revisit five classic data-structure problems. We show that:
• A data structure storing n v-bit values for n keys with constant-time modifications/queries can be implemented to take space nv + O(n log(r) n) bits, for any constant r > 0, as long as the user stores a tiny pointer of expected size O(1) with each key; here, log(r) n is the r-th iterated logarithm.
• Any binary search tree can be made succinct with constant-factor time overhead, and can even be made to be within O(n) bits of optimal if we allow for O(log* n)-time modifications; this holds even for rotation-based trees such as the splay tree and the red-black tree.
• Any fixed-capacity key-value dictionary can be made stable (i.e., items do not move once inserted) with constant-time overhead and 1 + o(1) space overhead.
• Any key-value dictionary that requires uniform-size values can be made to support arbitrary-size values with constant-time overhead and with an additional space consumption of log(r) n + O(log j) bits per j-bit value, for an arbitrary constant r > 0 of our choice.
• Given an external-memory array A of size (1 + ε)n containing a dynamic set of up to n key-value pairs, it is possible to maintain an internal-memory stash of size O(n log ε⁻¹) bits so that the location of any key-value pair in A can be computed in constant time (and with no I/Os).
These are all well-studied, classic problems, and in each case tiny pointers allow us to take a natural space-inefficient solution that uses pointers and make it space-efficient for free.
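As a loose illustration of the tiny-pointer idea (a sketch under simplifying assumptions, not the paper's optimal construction): items live in an array partitioned into small buckets chosen by hashing the key, so the "pointer" handed back to the user is only a slot index within the key's bucket, which needs far fewer than log n bits. The class name TinyStore and all parameters below are hypothetical.

```python
import hashlib

class TinyStore:
    """Toy dereference table: hash a key to a bucket; a 'tiny pointer'
    is just the slot index inside that bucket (o(log n) bits), not a
    full log(n)-bit array index.  Illustrative only."""

    def __init__(self, n_buckets=1024, bucket_size=8):
        self.slots = [[None] * bucket_size for _ in range(n_buckets)]

    def _bucket(self, key):
        h = hashlib.blake2b(key.encode(), digest_size=8).digest()
        return int.from_bytes(h, "big") % len(self.slots)

    def create(self, key, value):
        """Store value; return a tiny pointer, or None on failure
        (bucket full) -- tiny pointers may fail with small probability."""
        bucket = self.slots[self._bucket(key)]
        for i, slot in enumerate(bucket):
            if slot is None:
                bucket[i] = value
                return i          # fits in log2(bucket_size) = 3 bits here
        return None

    def dereference(self, key, tiny_ptr):
        return self.slots[self._bucket(key)][tiny_ptr]

store = TinyStore()
p = store.create("alice", 42)
assert store.dereference("alice", p) == 42
```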
3. For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: if keys/values are Θ(log n) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(log log n) bits of space per key when compared to the information-theoretic optimum. Even prior to this bound being achieved, the target of O(log log n) wasted bits per key was known to be a natural end goal, and was proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters). This paper shows that O(log log n) wasted bits per key is not the end of the line for hashing. In fact, for any k ∈ [log* n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and O(log(k) n) wasted bits per key (all with high probability in n), where log(k) n denotes the k-times-iterated logarithm log log ··· log n. This means that, each time we increase insertion/deletion time by an additive constant, we reduce the wasted bits per key exponentially. We further show that this tradeoff curve is the best achievable by any of a large class of hash tables, including any hash table designed using the current framework for making constant-time hash tables succinct. Our results hold not just for fixed-capacity hash tables, but also for hash tables that are dynamically resized (this is a fundamental departure from what is possible for filters), and for hash tables that store very large keys/values, each of which can be up to n^{o(1)} bits (this breaks with the conventional wisdom that larger keys/values should lead to more wasted bits per key). For very small keys/values, we are able to tighten our bounds to o(1) wasted bits per key, even when k = O(1). Building on this, we obtain a constant-time dynamic filter that uses n⌈log ε⁻¹⌉ + n log e + o(n) bits of space for a wide choice of …
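To make the exponential improvement concrete, here is the iterated logarithm evaluated on a deliberately enormous input size (a back-of-the-envelope illustration, base-2 logs, not from the paper): each extra unit of insertion/deletion time strips one log off the wasted bits per key.

```latex
% log^{(k)} n for n = 2^{65536}:
\log^{(1)} n = 65536,\quad
\log^{(2)} n = 16,\quad
\log^{(3)} n = 4,\quad
\log^{(4)} n = 2,\quad
\log^{(5)} n = 1.
```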
4. Dynamic trees are a well-studied and fundamental building block of dynamic graph algorithms, dating back to the seminal work of Sleator and Tarjan [STOC '81, (1981), pp. 114-122]. The problem is to maintain a tree subject to online edge insertions and deletions while answering queries about the tree, such as the heaviest weight on a path. In the parallel batch-dynamic setting, the goal is to process batches of edge updates work-efficiently in low (polylog n) span. Two work-efficient algorithms are known: batch-parallel Euler tour trees by Tseng et al. [ALENEX '19, (2019), pp. 92-106] and parallel Rake-Compress (RC) trees by Acar et al. [ESA '20, (2020), pp. 2:1-2:23]. Both, however, are randomized and work-efficient only in expectation. Several downstream results that use these data structures (and indeed, to the best of our knowledge, all known work-efficient parallel batch-dynamic graph algorithms) are therefore also randomized. In this work, we give the first deterministic work-efficient solution to the problem. Our algorithm maintains a parallel RC-tree on n vertices subject to batches of k edge updates deterministically in worst-case O(k log(1 + n/k)) work and O(log n log log k) span on the Common-CRCW PRAM. We also show how to improve the span of the randomized algorithm from O(log n log* n) to O(log n). Lastly, as a result of our new deterministic algorithm, we also derandomize several downstream results that make use of parallel batch-dynamic trees, for which previously the only efficient solutions were randomized.
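For readers unfamiliar with the dynamic-trees interface, here is a deliberately naive sketch of the operations these structures support (link, cut, and a path query). It answers queries by walking the forest in O(n) time, whereas Euler tour trees and RC-trees support the same interface in logarithmic time; all names are illustrative.

```python
from collections import defaultdict

class NaiveDynamicForest:
    """The dynamic-trees interface with naive O(n)-time operations
    (illustrative sketch only)."""

    def __init__(self):
        self.adj = defaultdict(dict)   # adj[u][v] = edge weight

    def link(self, u, v, weight):
        self.adj[u][v] = weight
        self.adj[v][u] = weight

    def cut(self, u, v):
        del self.adj[u][v]
        del self.adj[v][u]

    def path_max(self, u, v):
        """Heaviest edge weight on the u-v path (None if disconnected)."""
        stack, visited = [(u, float("-inf"))], {u}
        while stack:
            node, mx = stack.pop()
            if node == v:
                return mx
            for nbr, w in self.adj[node].items():
                if nbr not in visited:
                    visited.add(nbr)
                    stack.append((nbr, max(mx, w)))
        return None

f = NaiveDynamicForest()
f.link(1, 2, 5); f.link(2, 3, 9); f.link(3, 4, 2)
assert f.path_max(1, 4) == 9
f.cut(2, 3)
assert f.path_max(1, 4) is None
```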
5. Given a set $$P$$ of $$n$$ points in the plane, we consider the problem of computing the number of points of $$P$$ in a query unit disk (i.e., all query disks have the same radius). We show that the main techniques for simplex range searching in the plane can be adapted to this problem. For example, by adapting Matoušek's results, we can build a data structure of $$O(n)$$ space in $$O(n^{1+\delta})$$ time (for any $$\delta>0$$) so that each query can be answered in $$O(\sqrt{n})$$ time; alternatively, we can build a data structure of $$O(n^2/\log^2 n)$$ space with $$O(n^{1+\delta})$$ preprocessing time (for any $$\delta>0$$) and $$O(\log n)$$ query time. Our techniques lead to improvements for several other classical problems in computational geometry (a brute-force baseline for the first problem is sketched after this list).
1. Given a set of $$n$$ unit disks and a set of $$n$$ points in the plane, the batched unit-disk range counting problem is to compute for each disk the number of points in it. Previous work [Katz and Sharir, 1997] solved the problem in $$O(n^{4/3}\log n)$$ time. We give a new algorithm of $$O(n^{4/3})$$ time, which is optimal as it matches an $$\Omega(n^{4/3})$$-time lower bound. For small $$\chi$$, where $$\chi$$ is the number of pairs of unit disks that intersect, we further improve the algorithm to $$O(n^{2/3}\chi^{1/3}+n^{1+\delta})$$ time, for any $$\delta>0$$.
2. The above result immediately leads to an $$O(n^{4/3})$$-time optimal algorithm for counting the intersecting pairs of circles for a set of $$n$$ unit circles in the plane. The previous best algorithms solve the problem in $$O(n^{4/3}\log n)$$ deterministic time [Katz and Sharir, 1997] or in $$O(n^{4/3}\log^{2/3} n)$$ expected time by a randomized algorithm [Agarwal, Pellegrini, and Sharir, 1993].
3. Given a set $$P$$ of $$n$$ points in the plane and an integer $$k$$, the distance selection problem is to find the $$k$$-th smallest distance among all pairwise distances of $$P$$. The problem can be solved in $$O(n^{4/3}\log^2 n)$$ deterministic time [Katz and Sharir, 1997] or in $$O(n\log n+n^{2/3}k^{1/3}\log^{5/3}n)$$ expected time by a randomized algorithm [Chan, 2001]. Our new randomized algorithm runs in $$O(n\log n +n^{2/3}k^{1/3}\log n)$$ expected time.
4. Given a set $$P$$ of $$n$$ points in the plane, the discrete $$2$$-center problem is to compute two smallest congruent disks whose centers are in $$P$$ and whose union covers $$P$$. An $$O(n^{4/3}\log^5 n)$$-time algorithm was known [Agarwal, Sharir, and Welzl, 1998]. Our techniques yield a deterministic algorithm of $$O(n^{4/3}\log^{10/3} n\cdot (\log\log n)^{O(1)})$$ time and a randomized algorithm of $$O(n^{4/3}\log^3 n\cdot (\log\log n)^{1/3})$$ expected time.
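As promised above, here is a minimal brute-force baseline for batched unit-disk range counting; it runs in $$O(n^2)$$ time for $$n$$ disks and $$n$$ points, which is exactly what the $$O(n^{4/3})$$ algorithm improves on (illustrative code, not the paper's algorithm):

```python
def batched_unit_disk_counts(centers, points, radius=1.0):
    """Brute-force baseline: for each unit-disk center, count the
    points inside the disk.  O(n*m) time for n disks and m points;
    the paper achieves O(n^{4/3}) when n = m."""
    r2 = radius * radius
    return [
        sum((px - cx) ** 2 + (py - cy) ** 2 <= r2 for px, py in points)
        for cx, cy in centers
    ]

disks = [(0.0, 0.0), (2.0, 0.0)]
pts = [(0.5, 0.5), (2.1, 0.1), (5.0, 5.0)]
print(batched_unit_disk_counts(disks, pts))  # [1, 1]
```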