I have the following kind of data (coded in R):
v.a = c('cat', 'dog', 'dog', 'goat', 'cat', 'goat', 'dog', 'dog')
v.b = c(1, 2, 1, 2, 1, 2, 1, 2)
v.c = c('blue', 'red', 'blue', 'red', 'red', 'blue', 'yellow', 'yellow')
set.seed(12)
v.d = rnorm(8)
aov(v.a ~ v.b + v.c + v.d) # Error
I would like to know if the value of v.b or the value of v.c has any ability to predict the value of v.a. I would run an ANOVA (as shown above), but I think it does not make any sense since my response variable is not ordinal (it is categorical). What should I do?
Best Answer
You could use any classifier: linear discriminants, multinomial logit (as Bill pointed out), support vector machines, neural nets, CART, random forests, or C5 trees. There is a world of different models that can help you predict $v.a$ using $v.b$ and $v.c$. Here is an example using the R implementation of random forest:
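A minimal sketch of such a fit, assuming the randomForest package is installed. The data frame name `dat`, the conversion of v.a and v.c to factors, and the `ntree` value are illustrative choices, not taken from the original answer:

```r
library(randomForest)  # assumption: install.packages("randomForest") was run beforehand

# Data from the question
v.a <- c('cat', 'dog', 'dog', 'goat', 'cat', 'goat', 'dog', 'dog')
v.b <- c(1, 2, 1, 2, 1, 2, 1, 2)
v.c <- c('blue', 'red', 'blue', 'red', 'red', 'blue', 'yellow', 'yellow')

# randomForest fits a classifier (not a regression) only when the response is a factor;
# the character predictor v.c is converted to a factor as well
dat <- data.frame(v.a = factor(v.a), v.b = v.b, v.c = factor(v.c))

set.seed(12)  # the forest is built from random bootstrap samples
fit <- randomForest(v.a ~ v.b + v.c, data = dat, ntree = 500)

print(fit)       # out-of-bag (OOB) error estimate and confusion matrix
importance(fit)  # mean decrease in Gini impurity for each predictor
```

The out-of-bag error rate in the printed output is the number to look at. Always guessing the most common class ("dog") is wrong for 4 of the 8 observations, so an OOB error rate near or above 50% suggests the predictors add little.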
Clearly, these variables do not show a strong relationship.