Abstract
Hartigan (1975) defines the number q of clusters in a d-variate statistical population as the number of connected components of the set {f > c}, where f denotes the underlying density function on R^d and c is a given constant. Some common clustering algorithms treat q as an input that must be given in advance. The authors propose a method for estimating this parameter based on computing the number of connected components of an estimate of {f > c}. This set estimator is constructed as a union of balls with centres at an appropriate subsample, which is selected via a nonparametric density estimator of f. The asymptotic behaviour of the proposed method is analyzed. A simulation study and an example with real data are also included.
Cuevas et al. studied this question.
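
The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the bandwidth `h`, the ball radius `eps`, and the function names are all assumptions introduced here, since the abstract does not specify the density estimator or the tuning parameters. The sketch keeps the sample points whose estimated density exceeds c, places a closed ball of radius `eps` at each of them, and counts the connected components of the union. Two such balls intersect exactly when their centres are at most `2 * eps` apart, so the count reduces to the number of connected components of a graph on the selected points.

```python
import numpy as np


def gaussian_kde_at_sample(x, h):
    """Gaussian kernel density estimate evaluated at each sample point.

    Assumption: a Gaussian kernel with a single bandwidth h; the abstract
    only says "a nonparametric density estimator".
    """
    n, d = x.shape
    # Pairwise squared Euclidean distances between all sample points.
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    norm = n * (2.0 * np.pi * h ** 2) ** (d / 2.0)
    return np.exp(-sq / (2.0 * h ** 2)).sum(axis=1) / norm


def estimate_num_clusters(x, c, h, eps):
    """Count connected components of a union of balls estimating {f > c}.

    x   : (n, d) array of observations (a 1-D array is treated as d = 1)
    c   : density level
    h   : kernel bandwidth (hypothetical tuning parameter)
    eps : ball radius (hypothetical tuning parameter)
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]

    # Subsample: keep the points whose estimated density exceeds c.
    centres = x[gaussian_kde_at_sample(x, h) > c]
    m = len(centres)
    if m == 0:
        return 0

    # Closed balls of radius eps overlap iff their centres are within 2 * eps.
    sq = np.sum((centres[:, None, :] - centres[None, :, :]) ** 2, axis=-1)
    adjacent = sq <= (2.0 * eps) ** 2

    # Depth-first search over the overlap graph to count its components.
    seen = np.zeros(m, dtype=bool)
    count = 0
    for start in range(m):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adjacent[i] & ~seen):
                seen[j] = True
                stack.append(j)
    return count


if __name__ == "__main__":
    # Toy data: two tight groups of three points each, far apart.
    toy = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
    print(estimate_num_clusters(toy, c=0.01, h=1.0, eps=0.2))  # prints 2
```

The pairwise distance matrices make this sketch quadratic in the sample size, which is acceptable for illustration but not for large samples. The result depends on `c`, `h`, and `eps`; a level `c` above every estimated density value leaves no centres and yields zero components.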