Solved – Cluster analysis of boolean vectors in R

Tags: clustering, r

I have 114 vectors, each with 6 boolean attributes. A simple visualization suggested that there might be several distinct clusters. K-means clustering on the transformed vectors (true = 1, false = 0) gives roughly the same clusters that I had seen in the visualization.

However, I am not sure which clustering method is most appropriate for this kind of data, or how to determine how much confidence to place in the resulting clusters (the k-means results change on every run because of the random initialization). Should I treat the data as nominal or as numerical?
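For reference, the k-means setup described above might look roughly like the sketch below. The data are simulated stand-ins (the real 114 vectors are not shown here), and the names `probs`, `group`, and `x`, the choice of three clusters, and the seed values are all assumptions made for illustration. Fixing the seed and using several random starts via `nstart` is one common way to make the result repeatable; it does not by itself say whether k-means is the right method for boolean data.

```r
# Simulated stand-in data: 114 rows, 6 binary attributes, three loose groups.
# (Hypothetical; replace `x` with the real 0/1 matrix.)
set.seed(1)
probs <- rbind(c(.9, .9, .1, .1, .1, .1),
               c(.1, .1, .9, .9, .1, .1),
               c(.1, .1, .1, .1, .9, .9))
group <- rep(1:3, each = 38)
x <- matrix(rbinom(114 * 6, size = 1, prob = probs[group, ]), nrow = 114)
colnames(x) <- paste0("attr", 1:6)

# k-means on the 0/1 encoding; nstart = 25 keeps the best of 25 random starts,
# and the fixed seed makes the run reproducible.
set.seed(42)
fit_km <- kmeans(x, centers = 3, nstart = 25)

# Cluster sizes and per-cluster proportion of TRUE for each attribute
print(table(fit_km$cluster))
print(round(fit_km$centers, 2))
```

With a 0/1 encoding, each cluster center is simply the fraction of members for which the attribute is true, which can make the centers easy to read.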

What would be the best way to do a cluster analysis on this kind of data in R?

Best Answer

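I would take a look at the mona function in the cluster package. It performs monothetic divisive clustering and is designed for data in which every variable is binary, so it seems to address your question.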
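A minimal sketch of how mona could be called, assuming the data are held in a 0/1 matrix. The matrix `x_bin` below is simulated for illustration only (the original data are not available), and its construction, the column names, and the seed are assumptions. Note that mona expects every column to contain both values.

```r
library(cluster)

# Simulated stand-in for the 114 x 6 boolean data (hypothetical values).
set.seed(1)
probs <- rbind(c(.9, .9, .1, .1, .1, .1),
               c(.1, .1, .9, .9, .1, .1),
               c(.1, .1, .1, .1, .9, .9))
group <- rep(1:3, each = 38)
x_bin <- matrix(rbinom(114 * 6, size = 1, prob = probs[group, ]), nrow = 114)
colnames(x_bin) <- paste0("attr", 1:6)

# Monothetic divisive clustering: each split is made on a single binary variable.
fit_mona <- mona(x_bin)

# Variable used at each separation and the step at which it occurred
print(head(fit_mona$variable))
print(head(fit_mona$step))

# plot(fit_mona) draws the banner plot of the successive splits.
```

Because every split is defined by one attribute, the resulting groups can be described directly in terms of the boolean variables, and the procedure involves no random initialization, so repeated runs on the same data give the same result.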