February 28, 2024

Discriminative Dimension Selection for Enhancing the Interpretability and Performance of Clustering Output

Abstract

Discriminative Dimension Selection (DDS) has emerged as a powerful tool for identifying the most relevant features in high-dimensional datasets, enabling interpretable data analysis and visualization. This paper explores an application of DDS, which uses overlapping clusters and dimensions, to enhance the interpretability and performance of the K-means clustering algorithm. Our approach post-processes the K-means output to retain informative features and discard redundant or irrelevant ones. The refined feature set not only preserves the clustering performance of K-means but also improves the interpretability and visualization of its output. We demonstrate the effectiveness of the proposed method, Overlap-resolved Clustering (ORC), on a variety of datasets and compare its performance against traditional K-means. We evaluate both with multiple cluster validity indices: the Silhouette Coefficient Score (SS), the Davies-Bouldin Index (DB), and the Calinski-Harabasz Index (CH). Our results consistently show that ORC produces better clustering results and enhances the interpretability of the datasets. This study highlights the potential of DDS as a valuable tool for improving the interpretability and visualization of high-dimensional data analysis using K-means clustering.
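
The three validity indices named in the abstract can be illustrated with a minimal sketch. It follows the standard textbook definitions of SS, DB, and CH and uses only the Python standard library. It does not implement DDS or ORC, and the function names, the toy points, and the fixed labels are illustrative assumptions rather than anything taken from the paper. Higher SS and CH and lower DB indicate better-separated clusters.

```python
import math

def group(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points, in [-1, 1]."""
    clusters = group(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst scatter-to-separation ratio."""
    clusters = group(points, labels)
    cents = {k: centroid(v) for k, v in clusters.items()}
    scatter = {k: sum(math.dist(p, cents[k]) for p in v) / len(v)
               for k, v in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)

def calinski_harabasz_index(points, labels):
    """Between-cluster over within-cluster dispersion, each per degree of freedom."""
    clusters = group(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(v) * math.dist(centroid(v), overall) ** 2
                  for v in clusters.values())
    within = sum(math.dist(p, centroid(v)) ** 2
                 for v in clusters.values() for p in v)
    return (between / (k - 1)) / (within / (n - k))

# Toy data (assumed): two well-separated groups with hand-assigned labels,
# standing in for the output of a clustering run.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8), (9, 9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

ss = silhouette_score(points, labels)
db = davies_bouldin_index(points, labels)
ch = calinski_harabasz_index(points, labels)
print(f"SS={ss:.3f}  DB={db:.3f}  CH={ch:.1f}")
```

Comparing two clusterings of the same data, for example before and after feature selection, then amounts to computing these three values for each labeling and comparing them index by index.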
