Cloud block storage (CBS) provides virtual disks with block-level access. Petabyte-scale CBS systems maintain trillions of block-mapping key-value entries as metadata to track the storage location of each virtual block. Although SSD-based KV stores have been widely adopted in cloud systems for their high efficiency and durability, current SSD-based schemes face significant challenges in achieving deterministic access latency for latency-sensitive metadata services. Our experimental observations indicate that the substantial long-tail latency is primarily caused by (1) I/O blocking due to internal tasks of SSDs, including modern Zone Namespace SSDs; and (2) additional disk I/Os when querying high-level indexes that span memory and SSDs in memory-constrained environments. In this paper, we propose SIndex, an SSD-based scheme that stores trillions of block-mapping entries for latency-critical cloud block storage and performs comprehensive latency optimization across storage I/O scheduling and high-level indexing. To prevent long-tail I/Os while avoiding intrusive device modifications, SIndex introduces an inter-SSD I/O scheduling mechanism based on read/write separation and SSD state transitions, which mitigates latency fluctuations induced by garbage collection on conventional SSDs and zone operations on Zone Namespace SSDs. Additionally, SIndex employs opportunistic I/O speculation and a concurrent request balancing mechanism to reduce read disturbance and I/O contention. To query the storage location of targeted block-mapping entries with bounded latency, SIndex uses a memory-efficient high-level index combined with a static data layout, preventing time-consuming disk lookups by keeping the index in memory. We evaluate the SIndex prototype using a variety of benchmarks and real-world traces on commodity SSDs.
The results demonstrate that SIndex outperforms RocksDB and other approaches by up to 11.4× in tail latency, keeping the 99.99th-percentile latency below 400 µs.
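The abstract names read/write separation and SSD state transitions but does not spell out the mechanism. The following is only a toy sketch of the general idea, not the authors' design. It assumes that every entry is replicated on several SSDs and that each SSD is in either a read or a write state; the names `State`, `ToyScheduler`, `route_read`, `route_write`, and `rotate` are all hypothetical.

```python
from enum import Enum


class State(Enum):
    READ = "read"    # serves reads only; receives no new writes
    WRITE = "write"  # absorbs writes and whatever background work they trigger


class ToyScheduler:
    """Toy inter-SSD scheduler (illustrative assumption, not the SIndex design)."""

    def __init__(self, num_ssds, num_writers=1):
        if not 0 < num_writers < num_ssds:
            raise ValueError("need at least one reader and one writer SSD")
        # The first `num_writers` SSDs start in WRITE state, the rest in READ.
        self.states = [
            State.WRITE if i < num_writers else State.READ
            for i in range(num_ssds)
        ]

    def route_read(self, replicas):
        """Prefer a replica on an SSD that is currently in READ state."""
        for ssd in replicas:
            if self.states[ssd] is State.READ:
                return ssd
        return replicas[0]  # fallback: every replica sits on a writing SSD

    def route_write(self, replicas):
        """Split replicas into (apply now, defer until that SSD becomes a writer)."""
        now = [s for s in replicas if self.states[s] is State.WRITE]
        deferred = [s for s in replicas if self.states[s] is State.READ]
        return now, deferred

    def rotate(self):
        """State transition: shift the writer role to the next SSD(s)."""
        self.states = self.states[-1:] + self.states[:-1]
```

In this sketch a read is never sent to an SSD that is absorbing writes as long as one replica lives on a reader SSD, so write-induced garbage collection or zone operations would not block it. How SIndex actually buffers deferred writes and when it triggers transitions is not stated in the abstract.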
S.L. Wang
Zhandong Guo
Kaiye Zhou
ACM Transactions on Storage
Huazhong University of Science and Technology
Wuhan National Laboratory for Optoelectronics
China Mobile (China)
www.synapsesocial.com/papers/69770353722626c4468e858c — DOI: https://doi.org/10.1145/3789205