What question did this study set out to answer?

The study asks how to make range scans in log-structured merge-trees (LSM-trees) more efficient, and proposes a cache whose unit is a key-value group rather than a block or a full query result.

April 10, 2026 · Open Access

Improving Range Scan Performance in LSM-trees with Group Caching

Key points

Introduced Group Cache using key-value groups as caching units
Developed a size-aware policy to prioritize small, high-utility KV groups
Conducted theoretical analysis and extensive experiments in RocksDB
Achieved up to 3× faster query performance under the same memory budget
Reduced memory usage by 75% while maintaining similar query performance
Demonstrated superior performance compared to traditional caching methods

Abstract

Log-structured merge-trees (LSM-trees) are widely used in modern key-value stores, but their multi-level structure reduces lookup efficiency, especially for range scans. Existing caching solutions, like block caches or full query caches, are memory-inefficient because they fail to exploit a critical asymmetry: eliminating an I/O from upper LSM-tree levels requires caching far fewer key-value pairs (KVs) than from lower levels. To address this, we introduce Group Cache, which uses KV Groups, the minimal set of KVs within a block for a specific query, as its fundamental caching unit. By employing a size-aware policy that prioritizes small, high-utility KV Groups, Group Cache maximizes I/O savings per unit of memory. We also address practical challenges like compaction management, intra-group hotness difference and scalability. Our theoretical analysis and extensive experiments in RocksDB demonstrate that Group Cache significantly outperforms traditional caching methods, achieving up to 3× faster query performance with the same memory budget, or achieving similar performance while using 75% less space.
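The size-aware idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The class names `KVGroup` and `SizeAwareGroupCache`, the utility score (hits divided by group size), and the greedy eviction rule are all assumptions made here for illustration. Compaction handling, intra-group hotness, and concurrency are left out.

```python
from dataclasses import dataclass


@dataclass
class KVGroup:
    """Hypothetical cache entry: the KVs from one block that one query needs."""
    group_id: str
    kvs: dict
    hits: int = 1  # times this group is assumed to have saved a block read

    @property
    def size(self) -> int:
        return len(self.kvs)

    @property
    def utility(self) -> float:
        # Assumed score: I/Os saved per cached KV. Small, hot groups score high.
        return self.hits / self.size


class SizeAwareGroupCache:
    """Illustrative cache with a budget counted in KV pairs, not bytes."""

    def __init__(self, capacity_kvs: int):
        self.capacity = capacity_kvs
        self.used = 0
        self.groups = {}

    def get(self, group_id):
        group = self.groups.get(group_id)
        if group is None:
            return None  # miss: the caller would read the block from disk
        group.hits += 1
        return group.kvs

    def put(self, group_id, kvs) -> bool:
        if not kvs or len(kvs) > self.capacity:
            return False  # empty or larger than the whole budget
        old = self.groups.pop(group_id, None)
        if old is not None:
            self.used -= old.size
        new = KVGroup(group_id, dict(kvs))
        # Greedy eviction: drop the lowest-utility group until the new one fits.
        while self.used + new.size > self.capacity:
            victim = min(self.groups.values(), key=lambda g: g.utility)
            del self.groups[victim.group_id]
            self.used -= victim.size
        self.groups[group_id] = new
        self.used += new.size
        return True
```

With a budget of 10 KVs, a 2-KV group and an 8-KV group that have each been hit equally often have utilities that differ by a factor of four, so inserting a third group evicts the 8-KV one first. This mirrors the asymmetry the abstract describes: eliminating an I/O with a small group costs far less memory than with a large one.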
