By Ulrike von Luxburg, Max Planck Institute for Biological Cybernetics, Germany, ulrike.luxburg@tuebingen.mpg.de
A popular method for selecting the number of clusters is based on stability arguments: one chooses the number of clusters such that the corresponding clustering results are "most stable". In recent years, a series of papers has analyzed the behavior of this method from a theoretical point of view. However, the results are very technical and difficult for non-experts to interpret. In this monograph we give a high-level overview of the existing literature on clustering stability. In addition to presenting the results in a slightly informal but accessible way, we relate them to each other and discuss their different implications.
Clustering Stability: An Overview provides a high-level overview of the existing literature on clustering stability. It reviews different protocols for computing clustering stability and using it for model selection. The main body of the text then examines theoretical results for the K-means algorithm and discusses how they relate to one another. Finally, it looks at results for more general clustering algorithms. In addition to presenting the results in a slightly informal but accessible way, Clustering Stability: An Overview relates them to each other and discusses their different implications.