June 21, 2024Open Access

Fast and Memory-Efficient Approximate Minimum Spanning Tree Generation for Large Datasets

Key Points

Key points are not available for this paper at this time.

Abstract

Abstract Conventional minimum spanning tree (MST) algorithms typically start by creating a distance matrix of the n (n-1) /2 n (n - 1) / 2 pairs of data points, leading to a time complexity of O (n²) O (n 2). This initial step poses a computational bottleneck. To overcome this limitation, we present a novel method that constructs an initial random k -neighbor graph and optimizes this graph by employing a crawling technique to efficiently approximate the k Nearest Neighbors (k NN) graph. This crawling approach allows us to approximate the closest neighbors of each node. Subsequently, the approximate k NN graph is utilized to build an initial approximate MST and iteratively refine it by the same crawling process. Using this approach, an approximate MST can be obtained for a data set of size n with empirical cost around O (n^1. 07) O (n 1. 07) and a minimal O (n) memory consumption. In contrast to spatial tree-based approaches, the presented method also scales well to high dimensional data. We have shown that the proposed approach achieves such a level of performance with only a marginal accuracy reduction between 0. 5% and 6%. These qualities make it an excellent candidate for approximate MST calculation for high-dimensional, large data sets.

Read Full Paperexternally

Ask AI

Mark Helpful

Bookmark

Relay

View Full Paper