MATLAB: Kmeans explanation of information obtained

classification, cluster, kmeans, Statistics and Machine Learning Toolbox

>> [IDK,C,sumd,D] = kmeans(B,4);
I was wondering if someone could explain what the C, sumd, and D outputs of the kmeans function are. I looked at the relevant documentation on MathWorks, but it seemed vague.

Best Answer

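IDK holds the cluster index for each of the N rows of the input: IDK(i) is the number of the cluster that row i was assigned to.
C holds the locations of the centroids of the k clusters, one row per cluster, each with size(B,2) components.
D holds the distance from each of the N input rows to each of the k cluster centers, so it is N-by-k. With the default 'sqeuclidean' metric these are squared Euclidean distances, so D corresponds to pdist2(B, C, 'squaredeuclidean') rather than plain pdist2(B, C).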
sumd is a per-cluster statistic whose practical use I am unsure of. Numerically it is
% k is the number of clusters (4 in the call above)
for K = 1 : k
sumd(K) = sum(D(IDK == K, K));
end
that is, for each cluster, it is the sum of the distances (squared, under the default metric) between the points that belong to the cluster and the center of that cluster.
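If it helps, the relationships described above can be checked numerically. The following is only a minimal sketch: the data matrix B here is made-up random data (not the B from the question), and it assumes kmeans is called with its default 'sqeuclidean' distance. It needs the Statistics and Machine Learning Toolbox for kmeans and pdist2.

```matlab
% Hypothetical sample data: 200 points in 3 dimensions (not from the original post).
rng(0);                      % fix the seed so the run is repeatable
B = rand(200, 3);
k = 4;

[IDK, C, sumd, D] = kmeans(B, k);

% C has one row per cluster and one column per input dimension.
assert(isequal(size(C), [k, size(B, 2)]));

% D is N-by-k: one row per input point, one column per centroid.
assert(isequal(size(D), [size(B, 1), k]));

% With the default 'sqeuclidean' metric, D holds squared Euclidean distances.
Dcheck = pdist2(B, C, 'squaredeuclidean');
assert(max(abs(D(:) - Dcheck(:))) < 1e-10);

% sumd(j) adds up column j of D over the points assigned to cluster j.
sumdCheck = zeros(k, 1);
for j = 1:k
    sumdCheck(j) = sum(D(IDK == j, j));
end
assert(max(abs(sumd - sumdCheck)) < 1e-10);
```

If kmeans is called with a different 'Distance' option, D and sumd are measured in that metric instead, so the pdist2 call in the check would need the matching metric.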