MATLAB: How to estimate K for K-means clustering

k-means clustering unsupervised estimation

I'm working on unsupervised classification (clustering), and I want to estimate K (the number of clusters) before starting the k-means algorithm.

Best Answer

You will probably not find any code already implemented for this purpose.
The theoretical answer for the "best" number of clusters to use is "one cluster for every unique point", as that will always have the best possible fit: every point then sits exactly on its own centroid, so the fit error is zero.
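To make that concrete, here is a minimal sketch of how the fit behaves as the number of clusters grows. It assumes the Statistics and Machine Learning Toolbox is available for `kmeans`, and the two-blob data set is made up purely for illustration.

```matlab
% Sketch only: assumes the Statistics and Machine Learning Toolbox (kmeans).
rng(0);                                   % fixed seed so the run is repeatable
X = [randn(50,2); randn(50,2) + 5];       % made-up data: two well-separated blobs

kList = 1:6;                              % candidate numbers of clusters
totalDist = zeros(size(kList));
for i = 1:numel(kList)
    % sumd holds the within-cluster sums of point-to-centroid distances
    % (squared Euclidean by default); several replicates reduce the
    % chance of reporting a poor local optimum.
    [~, ~, sumd] = kmeans(X, kList(i), 'Replicates', 10);
    totalDist(i) = sum(sumd);
end

disp([kList; totalDist]);                 % row 1: k, row 2: total distance
```

The total should typically keep shrinking as k grows, even past the two clusters the data was generated from. Pushing k all the way to the number of unique points would drive it to zero, which is why raw fit alone cannot pick K.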
If you do not wish to use one cluster for every unique point, you need some kind of penalty term that favors fewer clusters. I read through the theory paper on that a few years ago, and it was clear to me that the authors were setting the weights arbitrarily (but usefully for the kinds of clustering they were doing), and that there was no way to calculate what the weights should be without some knowledge of the range of cluster counts that would be appropriate for the physical system being examined. In that sense the theoretical algorithms were not suitable for "unsupervised learning", only for "supervised learning". The work we were doing at the time required unsupervised learning, so there was no way for us to determine what the proper number of clusters should be.
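The dependence on the penalty weight can be sketched as follows. This is not the method from the paper mentioned above. It is a generic "fit plus weight times k" criterion, with hypothetical weights (`lambdas`) and made-up three-blob data, and it assumes `kmeans` from the Statistics and Machine Learning Toolbox.

```matlab
% Sketch only: a generic penalized criterion, not any specific published one.
rng(0);                                   % fixed seed so the run is repeatable
X = [randn(50,2);                         % made-up data: three blobs
     randn(50,2) + 5;
     [randn(50,1) + 5, randn(50,1) - 5]];

kList = 1:8;                              % candidate numbers of clusters
fit = zeros(size(kList));
for i = 1:numel(kList)
    [~, ~, sumd] = kmeans(X, kList(i), 'Replicates', 10);
    fit(i) = sum(sumd);                   % total within-cluster distance
end

lambdas = [1 50 5000];                    % hypothetical weights, picked arbitrarily
chosenK = zeros(size(lambdas));
for j = 1:numel(lambdas)
    score = fit + lambdas(j) * kList;     % fit term plus a per-cluster penalty
    [~, best] = min(score);
    chosenK(j) = kList(best);
    fprintf('lambda = %g -> chosen k = %d\n', lambdas(j), chosenK(j));
end
```

A larger weight can never select more clusters than a smaller one, so the "estimated" K moves with the weight. Without outside knowledge about a plausible range of cluster counts, there is nothing in the data that says which weight is the right one.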