DOI: 10.1145/3700594 ISSN: 1549-6325
Tiny Pointers
Michael A. Bender, Alex Conway, Martín Farach-Colton, William Kuszmaul, Guido Tagliavini
This paper introduces a new data-structural object that we call the tiny pointer. In many applications, traditional
\(\log n\)
-bit pointers can be replaced with
\(o(\log n)\)
-bit tiny pointers at the cost of only a constant-factor time overhead and a small probability of failure. We develop a comprehensive theory of tiny pointers, and give optimal constructions for both fixed-size tiny pointers (i.e., settings in which all of the tiny pointers must be the same size) and variable-size tiny pointers (i.e., settings in which the average tiny-pointer size must be small, but some tiny pointers can be larger). If a tiny pointer references an item in an array filled to load factor
\(1-\delta\)
, then the optimal tiny-pointer size is
\(\Theta(\log\log\log n+\log\delta^{-1})\)
bits in the fixed-size case, and
\(\Theta(\log\delta^{-1})\)
expected bits in the variable-size case.
Our tiny-pointer constructions also require us to revisit several classic problems having to do with balls and bins; these results may be of independent interest.
Using tiny pointers, we apply tiny pointers to five classic data-structure problems. We show that:
A data structure storing
\(n\)
\(v\)
-bit values for
\(n\)
keys with constant-factor time modifications/queries can be implemented to take space
\(nv+O(n\log^{(r)}n)\)
bits, for any constant
\(r\gt0\)
, as long as the user stores a tiny pointer of expected size
\(O(1)\)
with each key—here,
\(\log^{(r)}n\)
is the
\(r\)
-th iterated logarithm.
Any binary search tree can be made succinct, meaning that it achieves
\((1+o(1))\)
times the optimal space, with constant-factor time overhead, and can even be made to be within
\(O(n)\)
bits of optimal if we allow for
\(O(\log^{*}n)\)
-time modifications—this holds even for rotation-based trees such as the splay tree and the red-black tree.
Any fixed-capacity key-value dictionary can be made stable (i.e., items do not move once inserted) with constant-factor time overhead and
\((1+o(1))\)
-factor space overhead.
Any key-value dictionary that requires uniform-size values can be made to support arbitrary-size values with constant-factor time overhead and with an additional space consumption of
\(\log^{(r)}n+O(\log j)\)
bits per
\(j\)
-bit value for an arbitrary constant
\(r\gt0\)
of our choice.
Given an external-memory array
\(A\)
of size
\((1+\varepsilon)n\)
containing a dynamic set of up to
\(n\)
key-value pairs, it is possible to maintain an internal-memory stash of size
\(O(n\log\varepsilon^{-1})\)
bits so that the location of any key-value pair in
\(A\)
can be computed in constant time (and with no IOs).
In each case tiny pointers allow for us to take a natural space-inefficient solution that uses pointers and make it space-efficient for free.