Effective prefetching is essential in modern storage systems to reduce I/O latency and improve cache hit rates, especially under challenging access patterns. While temporal locality has been the dominant foundation for most prefetching algorithms, real-world storage workloads often exhibit long stack distances (LSD), where blocks are reused only after millions of other accesses. These patterns weaken the effectiveness of conventional prefetchers, even when strong spatial locality is present. In this paper, we present CAPSULE (Clustering-Assisted Prefetching Scheme Utilizing Locality Exploration), a novel prefetching framework that exploits spatial locality through adaptive clustering. By dynamically grouping logical block addresses (LBAs) and prefetching across neighboring clusters, CAPSULE effectively bridges the gap left by temporal-only approaches. We evaluate CAPSULE across 729 real-world workloads drawn from five major benchmark suites (MSR, CloudPhysics, Tencent CBS, Alibaba Block and Meta Tectonic), reflecting diverse cloud-scale environments. CAPSULE improves cache hit rates by up to 6.1× and achieves up to 1.89× speedup in task completion time, outperforming traditional and learned prefetchers alike. Our results demonstrate that CAPSULE is particularly well-suited for modern cloud storage systems, where massive working sets and temporal locality erosion are increasingly the norm.
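To make the notion of long stack distances concrete, the following is a minimal Python sketch, not CAPSULE's implementation. It computes LRU stack distances for a toy trace and compares a plain LRU cache against one that, on each miss, also loads the rest of a fixed-size LBA cluster. The function names, the trace, the cache capacity, and the fixed `cluster_size` are all illustrative assumptions; the adaptive clustering described above is replaced here by simple aligned address ranges.

```python
from collections import OrderedDict


def stack_distances(trace):
    """Return the LRU stack distance of each access (None on first touch).

    The stack distance is the number of distinct blocks touched since the
    previous access to the same block. Written for clarity, not speed.
    """
    stack = []  # most recently used block sits at the end
    out = []
    for lba in trace:
        if lba in stack:
            idx = stack.index(lba)
            out.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            out.append(None)
        stack.append(lba)
    return out


def lru_hit_rate(trace, capacity, cluster_size=0):
    """Simulate an LRU cache and return its hit rate.

    If cluster_size > 0, every miss also loads the other blocks of the
    aligned, fixed-size cluster containing the missed LBA. This is an
    assumed stand-in for spatial prefetching, not adaptive clustering.
    """
    cache = OrderedDict()
    hits = 0

    def touch(lba):
        cache[lba] = True
        cache.move_to_end(lba)
        while len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used block

    for lba in trace:
        if lba in cache:
            hits += 1
        elif cluster_size > 0:
            base = (lba // cluster_size) * cluster_size
            for neighbor in range(base, base + cluster_size):
                if neighbor != lba:
                    touch(neighbor)
        touch(lba)
    return hits / len(trace)


# Hypothetical trace: three sequential sweeps over 1000 blocks, so every
# reuse has stack distance 999 while spatial locality stays strong.
trace = list(range(1000)) * 3
dists = stack_distances(trace)
plain = lru_hit_rate(trace, capacity=100)
clustered = lru_hit_rate(trace, capacity=100, cluster_size=8)
```

In this toy setting every reuse distance exceeds the cache capacity, so the plain LRU cache never hits, whereas the cluster-loading variant hits on the neighbors it brought in. The scale is far smaller than the millions of intervening accesses seen in real workloads, and the numbers say nothing about CAPSULE's measured results.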
Ramadhan et al. studied this question.