Title: Tiny Pointers
This paper introduces a new data-structural object that we call the tiny pointer. In many applications, traditional \(\log n\)-bit pointers can be replaced with \(o(\log n)\)-bit tiny pointers at the cost of only a constant-factor time overhead and a small probability of failure. We develop a comprehensive theory of tiny pointers, and give optimal constructions for both fixed-size tiny pointers (i.e., settings in which all of the tiny pointers must be the same size) and variable-size tiny pointers (i.e., settings in which the average tiny-pointer size must be small, but some tiny pointers can be larger). If a tiny pointer references an item in an array filled to load factor \(1-\delta\), then the optimal tiny-pointer size is \(\Theta(\log\log\log n+\log\delta^{-1})\) bits in the fixed-size case, and \(\Theta(\log\delta^{-1})\) expected bits in the variable-size case. Our tiny-pointer constructions also require us to revisit several classic problems having to do with balls and bins; these results may be of independent interest. Using tiny pointers, we revisit five classic data-structure problems. We show that:
• A data structure storing \(n\) \(v\)-bit values for \(n\) keys with constant-factor time modifications/queries can be implemented to take space \(nv+O(n\log^{(r)}n)\) bits, for any constant \(r\gt0\), as long as the user stores a tiny pointer of expected size \(O(1)\) with each key; here, \(\log^{(r)}n\) is the \(r\)-th iterated logarithm.
• Any binary search tree can be made succinct, meaning that it achieves \((1+o(1))\) times the optimal space, with constant-factor time overhead, and can even be made to be within \(O(n)\) bits of optimal if we allow for \(O(\log^{*}n)\)-time modifications; this holds even for rotation-based trees such as the splay tree and the red-black tree.
• Any fixed-capacity key-value dictionary can be made stable (i.e., items do not move once inserted) with constant-factor time overhead and \((1+o(1))\)-factor space overhead.
• Any key-value dictionary that requires uniform-size values can be made to support arbitrary-size values with constant-factor time overhead and with an additional space consumption of \(\log^{(r)}n+O(\log j)\) bits per \(j\)-bit value for an arbitrary constant \(r\gt0\) of our choice.
• Given an external-memory array \(A\) of size \((1+\varepsilon)n\) containing a dynamic set of up to \(n\) key-value pairs, it is possible to maintain an internal-memory stash of size \(O(n\log\varepsilon^{-1})\) bits so that the location of any key-value pair in \(A\) can be computed in constant time (and with no IOs).
In each case, tiny pointers allow us to take a natural space-inefficient solution that uses pointers and make it space-efficient for free.
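To make the abstract's central idea concrete, here is a minimal Python sketch of how a pointer can be shorter than \(\log n\) bits when it is always dereferenced together with its key. The key is hashed to a bucket, and the tiny pointer only records the offset inside that bucket. This is a simplified illustration under stated assumptions, not the optimal construction from the paper: the class name `DereferenceTable`, its methods, and the use of BLAKE2b as the hash function are all hypothetical choices made for this example.

```python
import hashlib
from typing import Optional


class DereferenceTable:
    """Toy fixed-size tiny-pointer scheme (illustrative assumption, not the paper's construction)."""

    def __init__(self, num_buckets: int, bucket_size: int) -> None:
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        # Flat slot array: offset i of bucket b lives at index b * bucket_size + i.
        self.occupied = [False] * (num_buckets * bucket_size)

    def _bucket(self, key: str) -> int:
        # Any fixed hash works here; BLAKE2b is an arbitrary choice for this sketch.
        digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.num_buckets

    def allocate(self, key: str) -> Optional[int]:
        """Reserve a slot for key; return the tiny pointer, or None if the bucket is full."""
        base = self._bucket(key) * self.bucket_size
        for offset in range(self.bucket_size):
            if not self.occupied[base + offset]:
                self.occupied[base + offset] = True
                return offset
        return None  # allocation failure: the key's bucket has no free slot

    def dereference(self, key: str, tiny_pointer: int) -> int:
        """Recover the full slot index from the key plus its tiny pointer."""
        return self._bucket(key) * self.bucket_size + tiny_pointer

    def free(self, key: str, tiny_pointer: int) -> None:
        self.occupied[self.dereference(key, tiny_pointer)] = False

    @property
    def pointer_bits(self) -> int:
        """Bits needed to store one offset within a bucket."""
        return max(1, (self.bucket_size - 1).bit_length())


# Usage: with a single bucket of two slots, the third allocation must fail.
small = DereferenceTable(num_buckets=1, bucket_size=2)
first = small.allocate("a")    # 0
second = small.allocate("b")   # 1
third = small.allocate("c")    # None (bucket full)
```

In this toy scheme the pointer size depends only on the bucket size, not on the total number of slots, and a full bucket is what produces the small probability of failure. Choosing the bucket size to balance pointer length against failure rate is where the balls-and-bins analysis mentioned in the abstract comes in.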
Award ID(s):
2247577 2106827
PAR ID:
10575816
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
ACM
Date Published:
Journal Name:
ACM Transactions on Algorithms
ISSN:
1549-6325
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We prove novel algorithmic guarantees for several online problems in the smoothed analysis model. In this model, at each time step an adversary chooses an input distribution with density function bounded above pointwise by \(\tfrac{1}{\sigma}\) times that of the uniform distribution; nature then samples an input from this distribution. Here, σ is a parameter that interpolates between the extremes of worst-case and average-case analysis. Crucially, our results hold for adaptive adversaries that can base their choice of input distribution on the decisions of the algorithm and the realizations of the inputs in the previous time steps. An adaptive adversary can nontrivially correlate inputs at different time steps with each other and with the algorithm's current state; this appears to rule out the standard proof approaches in smoothed analysis. This paper presents a general technique for proving smoothed algorithmic guarantees against adaptive adversaries, in effect reducing the setting of an adaptive adversary to the much simpler case of an oblivious adversary (i.e., an adversary that commits in advance to the entire sequence of input distributions). We apply this technique to prove strong smoothed guarantees for three different problems:
     (1) Online learning: We consider the online prediction problem, where instances are generated from an adaptive sequence of σ-smooth distributions and the hypothesis class has VC dimension \(d\). We bound the regret by \(\tilde{O}(\sqrt{T d\ln (1/\sigma)} + d\ln (T/\sigma))\) and provide a near-matching lower bound. Our result shows that under smoothed analysis, learnability against adaptive adversaries is characterized by the finiteness of the VC dimension. This is as opposed to the worst-case analysis, where online learnability is characterized by Littlestone dimension (which is infinite even in the extremely restricted case of one-dimensional threshold functions). Our results fully answer an open question of Rakhlin et al. [64].
     (2) Online discrepancy minimization: We consider the setting of the online Komlós problem, where the input is generated from an adaptive sequence of σ-smooth and isotropic distributions on the \(\ell_2\) unit ball. We bound the \(\ell_\infty\) norm of the discrepancy vector by \(\tilde{O}(\ln^2(\frac{nT}{\sigma}))\). This is as opposed to the worst-case analysis, where the tight discrepancy bound is \(\Theta(\sqrt{T/n})\). We show such \(\mathrm{polylog}(nT/\sigma)\) discrepancy guarantees are not achievable for non-isotropic σ-smooth distributions.
     (3) Dispersion in online optimization: We consider online optimization with piecewise Lipschitz functions where functions with ℓ discontinuities are chosen by a smoothed adaptive adversary and show that the resulting sequence is \(({\sigma}/{\sqrt{T\ell}}, \tilde{O}(\sqrt{T\ell}))\)-dispersed. That is, every ball of radius \({\sigma}/{\sqrt{T\ell}}\) is split by \(\tilde{O}(\sqrt{T\ell})\) of the partitions made by these functions. This result matches the dispersion parameters of Balcan et al. [13] for oblivious smooth adversaries, up to logarithmic factors. On the other hand, worst-case sequences are trivially \((0, T)\)-dispersed.
  2. On hypergraphs with \(m\) hyperedges and \(n\) vertices, where \(p\) denotes the total size of the hyperedges, we provide the following results: We give an algorithm that runs in \(\widetilde{O}(mn^{2k-2})\) time for finding a minimum \(k\)-cut in hypergraphs of arbitrary rank. This algorithm betters the previous best running time for the minimum \(k\)-cut problem, for \(k > 2\). We give an algorithm that runs in \(\widetilde{O}(n^{\max \lbrace r,2k-2\rbrace})\) time for finding a minimum \(k\)-cut in hypergraphs of constant rank \(r\). This algorithm betters the previous best running times for both the minimum cut and minimum \(k\)-cut problems for dense hypergraphs. Both of our algorithms are Monte Carlo, i.e., they return a minimum \(k\)-cut (or minimum cut) with high probability. These algorithms are obtained as instantiations of a generic branching randomized contraction technique on hypergraphs, which extends the celebrated work of Karger and Stein on recursive contractions in graphs. Our techniques and results also extend to the problems of minimum hedge-cut and minimum hedge-\(k\)-cut on hedgegraphs, which generalize hypergraphs.
  3. Given a weighted, ordered query set \(Q\) and a partition of \(Q\) into classes, we study the problem of computing a minimum-cost decision tree that, given any query \(q\in Q\), uses equality tests and less-than tests to determine \(q\)'s class. Such a tree can be faster and smaller than a conventional search tree and smaller than a lookup table (both of which must identify \(q\), not just its class). We give the first polynomial-time algorithm for the problem. The algorithm extends naturally to the setting where each query has multiple allowed classes.
  4. Stream processing, which involves real-time computation of data as it is created or received, is vital for various applications, particularly wireless communication. Evolving protocols, the requirement for high throughput, and the challenges of handling diverse processing patterns make it demanding. Traditional platforms struggle to meet real-time throughput and latency requirements due to large data volume, sequential and nondeterministic data arrival, and variable data rates, leading to inefficiencies in memory access and parallel processing. We present Canalis, a throughput-optimized framework designed to address these challenges, ensuring high performance while achieving low energy consumption. Canalis is a hardware-software co-designed system. It includes a programmable spatial architecture, Flux Stream Processing Unit (FluxSPU), proposed by this work to enhance data throughput and energy efficiency. FluxSPU is accompanied by a software stack that eases the programming process. We evaluated Canalis with eight distinct benchmarks. When compared to CPU and GPU in mobile SoC to demonstrate the effectiveness of domain specialization, Canalis achieves an average speedup of 13.4\(\times\) and 6.6\(\times\), and energy savings of 189.8\(\times\) and 283.9\(\times\), respectively. In contrast to equivalent ASICs of the benchmarks, the average energy overhead of Canalis is within 2.4\(\times\), successfully maintaining generality without incurring significant overhead.
  5. This paper introduces a new data-structural object that we call the tiny pointer. In many applications, traditional \(\log n\)-bit pointers can be replaced with \(o(\log n)\)-bit tiny pointers at the cost of only a constant-factor time overhead and a small probability of failure. We develop a comprehensive theory of tiny pointers, and give optimal constructions for both fixed-size tiny pointers (i.e., settings in which all of the tiny pointers must be the same size) and variable-size tiny pointers (i.e., settings in which the average tiny-pointer size must be small, but some tiny pointers can be larger). If a tiny pointer references an element in an array filled to load factor \(1-\delta\), then the optimal tiny-pointer size is \(\Theta(\log\log\log n+\log\delta^{-1})\) bits in the fixed-size case, and \(\Theta(\log\delta^{-1})\) expected bits in the variable-size case. Our tiny-pointer constructions also require us to revisit several classic problems having to do with balls and bins; these results may be of independent interest. Using tiny pointers, we revisit five classic data-structure problems. We show that:
     • A data structure storing \(n\) \(v\)-bit values for \(n\) keys with constant-time modifications/queries can be implemented to take space \(nv+O(n\log^{(r)}n)\) bits, for any constant \(r > 0\), as long as the user stores a tiny pointer of expected size \(O(1)\) with each key; here, \(\log^{(r)}n\) is the \(r\)-th iterated logarithm.
     • Any binary search tree can be made succinct with constant-factor time overhead, and can even be made to be within \(O(n)\) bits of optimal if we allow for \(O(\log^{*}n)\)-time modifications; this holds even for rotation-based trees such as the splay tree and the red-black tree.
     • Any fixed-capacity key-value dictionary can be made stable (i.e., items do not move once inserted) with constant-time overhead and \(1+o(1)\) space overhead.
     • Any key-value dictionary that requires uniform-size values can be made to support arbitrary-size values with constant-time overhead and with an additional space consumption of \(\log^{(r)}n+O(\log j)\) bits per \(j\)-bit value for an arbitrary constant \(r > 0\) of our choice.
     • Given an external-memory array \(A\) of size \((1+\varepsilon)n\) containing a dynamic set of up to \(n\) key-value pairs, it is possible to maintain an internal-memory stash of size \(O(n\log\varepsilon^{-1})\) bits so that the location of any key-value pair in \(A\) can be computed in constant time (and with no IOs).
     These are all well-studied and classic problems, and in each case tiny pointers allow us to take a natural space-inefficient solution that uses pointers and make it space-efficient for free.