Solved – Clustering can be plotted only with more units than variables

clustering, r

I am using R (R Commander) to cluster my data. I have a smaller subset of my data containing 200 rows and about 800 columns. I get the following error when I run k-means clustering and try to plot the result:

'princomp' can only be used with more units than variables

I then created a test data set of 10 rows and 10 columns, which plots fine, but when I add an extra column I get the error again. Why is this? I need to be able to plot my clusters. When I view my data set after performing k-means on it, I can see the extra results column showing which cluster each row belongs to.

Is there anything I am doing wrong? Can I get rid of this error and plot my larger sample?

Best Answer

The clustering itself has no problem with the p > n situation (more variables than units). The visualization, however, internally uses princomp to plot the similarity space projection, and princomp cannot handle p > n.
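The difference can be reproduced outside R Commander. The following is a minimal sketch on made-up random data (the matrix `x` and its 10 x 11 shape are assumptions chosen to mirror the test case in the question): kmeans runs, while princomp stops with the same message.

```r
set.seed(1)

# Hypothetical toy data: 10 units (rows) and 11 variables (columns), so p > n
x <- matrix(rnorm(10 * 11), nrow = 10, ncol = 11)

# k-means itself accepts the wide matrix and assigns every row to a cluster
km <- kmeans(x, centers = 2)
print(km$cluster)

# princomp() refuses the same matrix; capture the error message instead of stopping
res <- tryCatch(princomp(x), error = function(e) conditionMessage(e))
print(res)
```

With 10 rows and 10 columns the princomp call goes through, which matches the behaviour described in the question.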

You can't fix that directly. At most, you can reproduce a similar graph by obtaining the similarity space projection with cmdscale(dist(...)) and coloring the points by cluster.
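A minimal sketch of that workaround follows. It assumes the data sit in a numeric matrix (here a random 200 x 800 stand-in called `x`), that Euclidean distance is acceptable, and that three clusters are wanted; all of these are placeholders to replace with your own data and settings.

```r
set.seed(42)

# Hypothetical stand-in for the real data: 200 rows, 800 columns (p > n)
x <- matrix(rnorm(200 * 800), nrow = 200, ncol = 800)

# Cluster as before; the number of centers is an assumption for illustration
km <- kmeans(x, centers = 3, nstart = 5)

# Classical multidimensional scaling of the pairwise Euclidean distances,
# projected onto 2 dimensions; this works regardless of the column count
coords <- cmdscale(dist(x), k = 2)

# Scatter plot of the projection, one color per cluster
plot(coords,
     col  = km$cluster,
     pch  = 19,
     xlab = "Coordinate 1",
     ylab = "Coordinate 2",
     main = "k-means clusters in MDS projection")
```

With Euclidean distances, classical MDS gives the same point configuration as principal component scores, up to reflection of the axes, so the picture should resemble what the princomp-based plot would have shown.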
