August 31, 2005

Cluster Validation by Prediction Strength

Abstract

This article proposes a new quantity for assessing the number of groups or clusters in a dataset. The key idea is to view clustering as a supervised classification problem, in which we must also estimate the “true” class labels. The resulting “prediction strength” measure assesses how many groups can be predicted from the data, and how well. In the process, we develop novel notions of bias and variance for unlabeled data. Prediction strength performs well in simulation studies, and we apply it to clusters of breast cancer samples from a DNA microarray study. Finally, some consistency properties of the method are established.
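The abstract does not spell out the computation, so the following is only a minimal sketch of one common reading of prediction strength, not the authors' code. It assumes k-means as the clustering method, a single train/test split instead of repeated cross-validation, and a score equal to the worst-case fraction of point pairs in a test cluster that the training centroids also place together. All function names (`kmeans`, `prediction_strength`, `make_blobs`) are hypothetical and chosen for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm (an assumed choice); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


def prediction_strength(train, test, k):
    """Worst-case pairwise agreement between test clusters and the
    labels that the training centroids predict for the test points."""
    train_centroids = kmeans(train, k)
    test_centroids = kmeans(test, k)
    test_labels = [nearest(p, test_centroids) for p in test]
    predicted = [nearest(p, train_centroids) for p in test]
    strengths = []
    for j in range(k):
        idx = [i for i, lab in enumerate(test_labels) if lab == j]
        n = len(idx)
        if n < 2:  # no pairs to compare (a simplification in this sketch)
            continue
        together = sum(
            1 for a in idx for b in idx if a != b and predicted[a] == predicted[b]
        )
        strengths.append(together / (n * (n - 1)))
    return min(strengths) if strengths else 0.0


def make_blobs(rng, centers, n_per_center, spread=0.5):
    """Synthetic 2-D Gaussian groups, used only to exercise the sketch."""
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centers
        for _ in range(n_per_center)
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    centers = [(0.0, 0.0), (10.0, 10.0)]
    train = make_blobs(rng, centers, 20)
    test = make_blobs(rng, centers, 20)
    for k in (1, 2, 3):
        print(k, prediction_strength(train, test, k))
```

With two well-separated groups as in the usage example, the score for k = 2 should be 1.0, because the training centroids reproduce the test clustering exactly. Asking for more clusters than the data supports forces an arbitrary split of one group, which the two halves of the data need not agree on, so the score will typically drop.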


Cite This Study

Tibshirani et al. (2005) studied this question.

synapsesocial.com/papers/6a0da1be88250cfcc2a509bb https://doi.org/10.1198/106186005x59243
