I am working on a cluster analysis of some very basic data in R for Windows [Version 6.1.76]. The items being grouped are countries, and I have two columns of continuous numerical variables. I have applied Ward's hierarchical method to the data:
# Applying Ward Hierarchical Clustering
d = dist(conversion_set, method="euclidean")
fit = hclust(d, method="ward")
But I don't think this gives me what I am after, because it seems to take only the first variable into account and disregard the second. Is there a way to include both variables in the clustering calculation?
My data looks similar to this:
Country – Var 1 – Var 2
US – 10 – 20
Canada – 5 – 30
….
Best Answer
Try this toy example
which puts Belize closer to the United States and Canada than Guatemala is, and also puts Mexico and Honduras closer to each other than either is to Guatemala.
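A minimal sketch of such a toy example follows. The values are invented for illustration, and the names toy, toy_scaled, m and groups are hypothetical, so this is not necessarily the data or code the answer originally used. dist() computes the Euclidean distance over every numeric column it is given, so both variables enter the clustering. If one variable has a much larger spread than the other it can dominate the distances, which is why this sketch standardizes the columns with scale() first (an assumption about the likely cause, not something confirmed in the question).

```r
# Hypothetical toy data: these values are made up for illustration only.
toy <- data.frame(
  var1 = c(10, 9, 8, 3, 2, 1),
  var2 = c(20, 22, 18, 6, 5, 12),
  row.names = c("United States", "Canada", "Belize",
                "Mexico", "Honduras", "Guatemala")
)

# Standardize each column (mean 0, sd 1) so neither variable dominates.
toy_scaled <- scale(toy)

# Euclidean distances between countries, computed from both columns.
d <- dist(toy_scaled, method = "euclidean")
m <- as.matrix(d)

# Ward clustering; "ward.D2" is available in R >= 3.1.0.
fit <- hclust(d, method = "ward.D2")

# Cut the tree into two groups and inspect the results.
groups <- cutree(fit, k = 2)
print(round(m, 2))
print(groups)
```

Calling plot(fit) afterwards draws the dendrogram, which is a convenient way to check visually which countries are merged first.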