What question did this study set out to answer?

This research aims to enhance the computation of Kendall's τ correlation in large datasets through a scalable grid-based algorithm.

May 7, 2026

Scalable Grid-based Computation of Kendall's τ Correlation

Key Points

This research aims to enhance the computation of Kendall's τ correlation in large datasets through a scalable grid-based algorithm.
Developed a geometric grid-based approach for partitioning data
Implemented parallel and distributed computation of correlations
Introduced an approximate algorithm with a provable error bound
Achieved improved efficiency and scalability in correlation computation
Demonstrated a single pass over grid statistics yields accurate results
The new approach outperformed the state-of-the-art correlation algorithm

Abstract

Computing the correlation of two attributes in a large dataset is an important problem, with many applications, including exploratory analytics and dimensionality reduction. Among the well-known correlation measures, Kendall's τ is the most robust one, as it is immune from parametric assumptions and outliers. On the other hand, computing Kendall's τ for large-scale data becomes challenging (i) due to the superlinear cost of the state-of-the-art algorithm and (ii) because all data need to be memory-resident for efficient processing. In this paper, we address the problem via a geometric approach that partitions the data in the cells of a grid, and exploits the relative position of the cells to compute correlation information en masse. Our approach facilitates parallel and distributed computation of Kendall's correlation; we propose a scalable algorithm in this direction. Finally, we propose an efficient approximate algorithm with a provable error bound, which derives accurate results by a single pass over the grid statistics. Our experimental evaluation demonstrates the efficiency and scalability of our grid-based techniques compared to the state-of-the-art algorithm.

Bookmark

Scalable Grid-based Computation of Kendall's τ Correlation

Key Points

Abstract

Cite This Study