High-occupancy itemset mining (HOIM) extends frequent itemset mining by requiring an itemset to occupy a sufficiently large fraction of the transactions in which it appears. This density-oriented objective is useful when support alone is not discriminative, but it also destroys the direct anti-monotonicity that makes classical frequent itemset mining efficient. Existing high-occupancy itemset miners further complicate evaluation because their outputs are not always the same object: some enumerate threshold-complete raw itemsets, while others report adaptive, maximal, top-k, or closed representative patterns. This paper presents AURA-HOI, an auditable high-occupancy itemset mining algorithm designed to separate scoring semantics from output views. AURA-HOI supports a raw fullset mode for direct comparison with HEP and DFHOI under a shared support–occupancy threshold, and a support-class representative mode for compact closed-output analysis. The method combines vertical bitset evidence, residual occupancy-envelope pruning, and a support-equivalence ledger. Experiments implemented in C and executed on a laptop-scale environment evaluate AURA-HOI, HEP, and DFHOI on five transactional datasets: mushrooms, chess, retail, T10I4D100K, and kosarak. Across all 35 dataset–threshold configurations, the three algorithms emit identical raw itemset counts under matched semantics. AURA-HOI is faster than HEP on 25 of 35 configurations and faster than DFHOI on 15 of 35 configurations; it also uses lower peak memory than HEP on 28 of 35 configurations and lower peak memory than DFHOI on 27 of 35 configurations. The results show that the proposed audit-oriented design preserves raw-output equivalence while providing competitive runtime and stable memory behavior across dense, sparse, synthetic, and large clickstream workloads.
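To make the support and occupancy notions concrete, the following is a minimal C sketch of how both could be computed from vertical bitsets. It is not AURA-HOI itself. It assumes one common definition of occupancy (the ratio |X|/|t| averaged over the transactions t that contain itemset X), which may differ from the definition used in the paper. All names (`itemset_tids`, `popcount64`, `occupancy`, `TOY_ITEM_TIDS`, `TOY_TX_LEN`) and the four-transaction toy database are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy database (hypothetical), items a=0, b=1, c=2:
 *   t0 = {a,b}, t1 = {a,b,c}, t2 = {a,c}, t3 = {b}
 * Bit t of TOY_ITEM_TIDS[i] is set when transaction t contains item i. */
const uint64_t TOY_ITEM_TIDS[3] = { 0x7, 0xB, 0x6 };
const int TOY_TX_LEN[4] = { 2, 3, 2, 1 };

/* Portable population count (number of set bits). */
int popcount64(uint64_t x) {
    int count = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Vertical representation: the supporting transactions of an itemset
 * are the bitwise AND of the tid-bitsets of its items. This sketch
 * handles at most 63 transactions in a single 64-bit word. */
uint64_t itemset_tids(const uint64_t *item_tids, const int *items,
                      int n_items, int num_tx) {
    assert(num_tx > 0 && num_tx < 64);
    uint64_t tids = (UINT64_C(1) << num_tx) - 1;  /* all transactions */
    for (int i = 0; i < n_items; i++)
        tids &= item_tids[items[i]];
    return tids;
}

/* ASSUMPTION: occupancy = mean of |X| / |t| over supporting
 * transactions t. Returns 0.0 for an itemset with no support. */
double occupancy(uint64_t tids, const int *tx_len, int num_tx,
                 int itemset_size) {
    int support = popcount64(tids);
    if (support == 0)
        return 0.0;
    double sum = 0.0;
    for (int t = 0; t < num_tx; t++)
        if ((tids >> t) & 1)
            sum += (double)itemset_size / (double)tx_len[t];
    return sum / support;
}
```

For the itemset {a, b} in the toy database, the supporting transactions are t0 and t1, so the support is 2 and the occupancy under the assumed definition is (2/2 + 2/3) / 2, roughly 0.83. Adding an item to an itemset can raise its occupancy as well as lower it, which is why occupancy is not anti-monotone the way support is.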
Minh Quan Van Ha (Sun,) studied this question.