Previous research on join sampling has focused on joins without selection conditions, even though such conditions are prevalent in everyday queries in database systems. Motivated by this, we undertake a systematic investigation on the complexity of sampling from the result of an acyclic join under equality conditions given only at runtime. When conditions are conjunctive, the goal is to understand when it is possible to precompute a feasible structure that uses Õ(IN) space and supports sampling in Õ(1) time, where IN is the input size. We present a dichotomy to characterize (subject to a widely-accepted conjecture) the existence of such structures based on the conditions supplied and, in every feasible scenario, give an optimal structure of O(IN) space and O(1) sample time. We then extend our investigation to conditions expressed in disjunctive normal form, where the core challenge reduces to the fundamental set union sampling problem. We overcome the challenge with an optimal algorithm and utilize it to develop optimal sampling structures. Our findings also lead to new results on the closely-related random enumeration problem.
Building similarity graph...
Analyzing shared references across papers
Loading...
Jinchao Huang
Yufei Tao
Sibo Wang
Chinese University of Hong Kong
Building similarity graph...
Analyzing shared references across papers
Loading...
Huang et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69be356f6e48c4981c673be2 — DOI: https://doi.org/10.4230/lipics.icdt.2026.9