Key points are not available for this paper at this time.
(MATH) A vector A of length N is defined implicitly, via a stream of updates of the form "add 5 to A3. " We give a sketching algorithm, that constructs a small sketch from the stream of updates, and a reconstruction algorithm, that produces a B-bucket piecewise-constant representation (histogram) H for A from the sketch, such that ||A—H||≤ (1+ε) ||A—Hopt||, where the error ||A—H|| is either ₁ (absolute) or ₂ (root-mean-square) error. The time to process a single update, time to reconstruct the histogram, and size of the sketch are each bounded by poly (B, log (N), log||A, 1/ε. Our result is obtained in two steps. First we obtain what we call a robust histogram approximation for A, a histogram such that adding a small number of buckets does not help improve the representation quality significantly. From the robust histogram, we cull a histogram of desired accruacy and B buckets in the second step. This technique also provides similar results for Haar wavelet representations, under ₂ error. Our results have applications in summarizing data distributions fast and succinctly even in distributed settings.
Gilbert et al. (Sun,) studied this question.