MATLAB: K-means algorithm: start and replicates parameters

clustering, k-means, MATLAB, statistics toolbox

Hello,
I am using k-means to cluster a data set stored in a 300×20 matrix called "Data1".
The predefined centers are stored in a 10×20 matrix called "centers1".
Here is the code:
[cidx,ctrs,sumd,D]=kmeans(Data1,10,'dist','sqEuclidean','emptyaction','drop','rep',500,'start',centers1,'disp','final');
I get the following message:
_The third dimension of the 'Start' array must match the 'replicates' parameter value_
Any help?
Thank you very much.
Natasha

Best Answer

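The 'start' parameter defines the starting centroids. The 'replicates' parameter indicates how many times to repeat the clustering. You gave only one set of starting values (a 10×20 matrix) but asked for 500 replicates, so kmeans cannot tell which starting values to use for the other 499 runs. As the error message says, a numeric 'start' array needs one 10×20 page along its third dimension for each replicate. I can imagine two things you might have intended: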
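If you want to start at centers1 once, and at random values the other 499 times, run kmeans twice: once with 'start',centers1 and no 'rep' argument, and once with 'rep',499 and no 'start' argument. Then keep whichever result has the smaller total of sumd.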
Alternatively, you may have intended to start at centers1 all 500 times. You can do that by specifying the 'start' value as repmat(centers1,[1 1 500]), which gives a 10×20×500 array. But I don't think there is any randomness built into the algorithm after the selection of starting values, so every replicate would probably return the same result and I don't think this would help.
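Both options can be sketched roughly as follows. This is only an illustration, not tested against your data: Data1 and centers1 are filled with random stand-in values of the right shape, the helper `total` and the variable `startStack` are names made up for this example, and the parameter names are written out in full instead of the abbreviations used in the question. Comparing totals while skipping NaN entries is a simplification in case 'drop' removes a cluster.

```matlab
% Stand-in data with the shapes from the question (assumed values, not the asker's data).
rng(0);                                   % fix the seed so the sketch is repeatable
Data1    = rand(300, 20);
centers1 = Data1(randperm(300, 10), :);   % hypothetical predefined centroids, 10x20

% Total within-cluster distance, skipping any NaN entries left by dropped clusters.
total = @(s) sum(s(~isnan(s)));

% Option 1: one run from centers1, plus 499 runs from random starts.
[cidxA, ctrsA, sumdA] = kmeans(Data1, 10, 'Distance', 'sqeuclidean', ...
    'EmptyAction', 'drop', 'Start', centers1);
[cidxB, ctrsB, sumdB] = kmeans(Data1, 10, 'Distance', 'sqeuclidean', ...
    'EmptyAction', 'drop', 'Replicates', 499);

% Keep whichever of the two results has the smaller total.
if total(sumdA) <= total(sumdB)
    cidx = cidxA; ctrs = ctrsA; sumd = sumdA;
else
    cidx = cidxB; ctrs = ctrsB; sumd = sumdB;
end

% Option 2: the same start for every replicate, one 10x20 page per replicate.
startStack = repmat(centers1, [1 1 500]);   % size 10x20x500
[cidxC, ctrsC, sumdC] = kmeans(Data1, 10, 'Distance', 'sqeuclidean', ...
    'EmptyAction', 'drop', 'Replicates', 500, 'Start', startStack);
```

In Option 2 the third dimension of startStack equals the 'Replicates' value, which is the condition the error message complains about.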