DOI: 10.1145/3700594 ISSN: 1549-6325

Tiny Pointers

Michael A. Bender, Alex Conway, Martín Farach-Colton, William Kuszmaul, Guido Tagliavini

This paper introduces a new data-structural object that we call the tiny pointer. In many applications, traditional \(\log n\) -bit pointers can be replaced with \(o(\log n)\) -bit tiny pointers at the cost of only a constant-factor time overhead and a small probability of failure. We develop a comprehensive theory of tiny pointers, and give optimal constructions for both fixed-size tiny pointers (i.e., settings in which all of the tiny pointers must be the same size) and variable-size tiny pointers (i.e., settings in which the average tiny-pointer size must be small, but some tiny pointers can be larger). If a tiny pointer references an item in an array filled to load factor \(1-\delta\) , then the optimal tiny-pointer size is \(\Theta(\log\log\log n+\log\delta^{-1})\) bits in the fixed-size case, and \(\Theta(\log\delta^{-1})\) expected bits in the variable-size case.

Our tiny-pointer constructions also require us to revisit several classic problems having to do with balls and bins; these results may be of independent interest.

Using tiny pointers, we apply tiny pointers to five classic data-structure problems. We show that:

A data structure storing \(n\) \(v\) -bit values for \(n\) keys with constant-factor time modifications/queries can be implemented to take space \(nv+O(n\log^{(r)}n)\) bits, for any constant \(r\gt0\) , as long as the user stores a tiny pointer of expected size \(O(1)\) with each key—here, \(\log^{(r)}n\) is the \(r\) -th iterated logarithm.

Any binary search tree can be made succinct, meaning that it achieves \((1+o(1))\) times the optimal space, with constant-factor time overhead, and can even be made to be within \(O(n)\) bits of optimal if we allow for \(O(\log^{*}n)\) -time modifications—this holds even for rotation-based trees such as the splay tree and the red-black tree.

Any fixed-capacity key-value dictionary can be made stable (i.e., items do not move once inserted) with constant-factor time overhead and \((1+o(1))\) -factor space overhead.

Any key-value dictionary that requires uniform-size values can be made to support arbitrary-size values with constant-factor time overhead and with an additional space consumption of \(\log^{(r)}n+O(\log j)\) bits per \(j\) -bit value for an arbitrary constant \(r\gt0\) of our choice.

Given an external-memory array \(A\) of size \((1+\varepsilon)n\) containing a dynamic set of up to \(n\) key-value pairs, it is possible to maintain an internal-memory stash of size \(O(n\log\varepsilon^{-1})\) bits so that the location of any key-value pair in \(A\) can be computed in constant time (and with no IOs).

In each case tiny pointers allow for us to take a natural space-inefficient solution that uses pointers and make it space-efficient for free.

More from our Archive