We study linear sketches and data stream algorithms for the ℓp-sampling problem. Here one chooses an r x n matrix A and computes A · v for an underlying vector v. From A · v, one should output a coordinate i ∈ 1, 2,. . . , n with probability (1 ± ε) ∥vi∥p/ ∥v ∥ p/ p. One is also allowed to output FAIL with a small constant probability. In the estimation version of the problem, one would also like to output a (1 ± ε) -approximation to ∥vi∥p/ ∥v ∥ p/ p, which is the probability for which i was sampled. For 0 < p < 2, work of Jayaram and Woodruff (FOCS, 2018) showed an upper bound of r = O (logn) while for p = 2 the best known upper bound is r = O (log 2 n). They posed it as an open question to resolve this gap. We prove an Ω (log 2 n) sketching dimension lower bound for p = 2, thereby separating the complexity for 0 <p < 2 from that of p = 2, and showing that their upper bound is optimal. Further, we show an Ω (ε-2 logn) sketching dimension lower bound for the estimation problem, improving the previous Ω (ε-2) bound and showing optimality of their algorithm. These results give sketching dimension lower bounds for real-valued inputs, but by applying a result of Gribelyuk, Lin, Woodruff, Yu, and Zhou (STOC, 2025), we obtain the same sketching dimension lower bound for poly (n) -valued integer sketches on poly (n) -bounded integer-valued streams, thereby showing the optimality of Jayaram and Woodruff’s sketch for turnstile streams.
Swartworth et al. (Tue,) studied this question.