Abstract. In clustering, selecting the most appropriate partitioning of a dataset is often guided by clustering validity indexes. However, with numerous competing indexes, each with its own strengths and weaknesses, choosing the right one can be challenging and may significantly affect clustering outcomes. Despite the widespread use of these indexes, little research has explored how their performance varies across problem types, and traditional benchmarks focus on ground-truth properties that cannot be known prior to clustering. Instance Space Analysis (ISA) is a visual meta-learning methodology that provides tools to examine the relationship between problem features and algorithmic performance. This study presents the first application of ISA to clustering validity indexes, analysing the behaviour of nine indexes across a diverse set of 18,351 synthetic benchmark datasets and eight clustering algorithms. The results uncover distinct performance patterns and offer data-driven guidance for selecting appropriate indexes based on measurable problem characteristics, providing insights into the relative strengths and weaknesses of commonly used indexes.
Simpson et al. (Mon,) studied this question.