Key points are not available for this paper at this time.
Abstract Conventional minimum spanning tree (MST) algorithms typically start by creating a distance matrix of the n (n-1) /2 n (n - 1) / 2 pairs of data points, leading to a time complexity of O (n²) O (n 2). This initial step poses a computational bottleneck. To overcome this limitation, we present a novel method that constructs an initial random k -neighbor graph and optimizes this graph by employing a crawling technique to efficiently approximate the k Nearest Neighbors (k NN) graph. This crawling approach allows us to approximate the closest neighbors of each node. Subsequently, the approximate k NN graph is utilized to build an initial approximate MST and iteratively refine it by the same crawling process. Using this approach, an approximate MST can be obtained for a data set of size n with empirical cost around O (n^1. 07) O (n 1. 07) and a minimal O (n) memory consumption. In contrast to spatial tree-based approaches, the presented method also scales well to high dimensional data. We have shown that the proposed approach achieves such a level of performance with only a marginal accuracy reduction between 0. 5% and 6%. These qualities make it an excellent candidate for approximate MST calculation for high-dimensional, large data sets.
Almansoori et al. (Fri,) studied this question.