What question did this study set out to answer?

To address performance challenges in learned indices through a novel structure that improves stability and reduces latency.

May 20, 2026

LINE: A Learned Index with Group-Enhanced Leaves and Cache-Optimized Inner Tree

Puntos clave

To address performance challenges in learned indices through a novel structure that improves stability and reduces latency.
Proposed LINE with group-enhanced leaf nodes and a cache-optimized inner tree.
Implemented in-group hashing for handling local data hardness.
Utilized a segment-based piece-wise linear approximation algorithm for cache-aware error bound selection.
LINE achieved up to 5.8x performance improvement over traditional learned indices.
Reduced worst-case tail latency significantly, enhancing overall efficiency.
Showed improved cache performance in comparison to existing methods.

Resumen

Compared to traditional indices, learned indices exploit the underlying data distribution to construct trees with larger node sizes and lower tree heights, thereby achieving good performance. However, the underlying data characteristics can significantly impact the index performance, and the large tree nodes incur heavy structure modification operations (SMOs). This leads to three main challenges for learned indices to achieve more stable performance: local data hardness, global data skew, and worst-case tail latency. In this paper, we propose LINE, a learned index that consists of a set of group-enhanced leaf nodes and a cache-optimized inner tree for addressing the three challenges. First, a group-enhanced leaf node employs a linear model to map keys in the leaf node into groups. We create variable-sized groups and perform in-group hashing to handle local data hardness. Second, the bulkload data set is divided into leaf nodes and the leaf boundary keys are inserted into an inner tree during index construction. We perform cache-aware error bound selection with a segment-based PLA (piece-wise linear approximation) algorithm to reduce the inner tree size. In this way, we simplify the inner tree structure and improve its cache performance for dealing with global data skew. Finally, we decompose lengthy SMOs into group-wise migrations for reducing the worst-case tail latency. Our experiments on real-world and synthetic data sets show that compared to state-of-the-art learned indices, LINE achieves up to 5.8x performance improvement while dramatically reducing the worst-case tail latency.

Me gusta

Guardar

Me gusta

Guardar

LINE: A Learned Index with Group-Enhanced Leaves and Cache-Optimized Inner Tree

Puntos clave

Resumen

Cite This Study