The short answer is No.
The long answer follows. To demonstrate variable importance (a.k.a. variable ranking), I fit a random forest:
if (!require("randomForest")) { install.packages("randomForest"); require("randomForest") }
# Observe the iris data
pairs(iris)
# Train/test split (seed set for reproducibility; your numbers will vary with the seed)
set.seed(1)
train <- sample(seq_len(nrow(iris)), nrow(iris) / 2)
test <- iris[-train, "Species"]
rf.iris <- randomForest(Species ~ ., data = iris, subset = train,
                        mtry = 3, importance = TRUE)
yhat.rf <- predict(rf.iris, newdata = iris[-train, ])
confusion_matrix <- table(yhat.rf, test)
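From a confusion matrix like the one above, overall accuracy is simply the fraction of test observations on the diagonal. A minimal, self-contained sketch with hypothetical counts (not the actual output of the model above):

```r
# Correct predictions sit on the diagonal of the confusion matrix.
# The counts below are hypothetical, just to show the arithmetic.
cm <- matrix(c(25, 0, 0,
               0, 23, 2,
               0, 1, 24), nrow = 3, byrow = TRUE)
accuracy <- sum(diag(cm)) / sum(cm)
accuracy  # (25 + 23 + 24) / 75 = 0.96
```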
Let's look at the class label distributions for each of the 4 numeric variables:
pairs(iris)
Focus on the bottom row of the figure (Species): which of the 4 variables carry more class-discriminatory information? Hopefully, you will answer the ones that correspond to subplots 3 and 4, i.e. Petal.Length and Petal.Width.
So, this is what the variable importance is capturing:
var_importance <- importance(rf.iris)
var_importance
setosa versicolor virginica MeanDecreaseAccuracy MeanDecreaseGini
Sepal.Length 0.00000 -3.658955 4.588084 2.529800 0.4303867
Sepal.Width 0.00000 -3.411590 1.133001 -1.061102 0.2859101
Petal.Length 23.26742 26.463392 34.734821 37.700686 24.2050973
Petal.Width 23.25556 23.387203 30.062981 33.186258 24.2027126
Take the Petal.Length variable, for instance. The MeanDecreaseAccuracy column tells us how much classification accuracy drops when the values of Petal.Length are randomly permuted (breaking its association with the class label) on the out-of-bag samples; here the scaled decrease, 37.700686, is by far the largest in the column. This is related to the concept of Mutual Information.
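The permutation idea behind MeanDecreaseAccuracy can be sketched by hand. This is a simplified version of what randomForest actually does (it computes the decrease per tree on out-of-bag samples and scales by its standard deviation); here we just permute one column in a held-out set:

```r
# Hand-rolled (simplified) permutation importance for one variable,
# assuming the randomForest package is installed
library(randomForest)
set.seed(1)
train <- sample(seq_len(nrow(iris)), nrow(iris) / 2)
rf <- randomForest(Species ~ ., data = iris, subset = train)
held_out <- iris[-train, ]
base_acc <- mean(predict(rf, held_out) == held_out$Species)
permuted <- held_out
permuted$Petal.Length <- sample(permuted$Petal.Length)  # break the link to the label
perm_acc <- mean(predict(rf, permuted) == permuted$Species)
base_acc - perm_acc  # accuracy drop attributable to Petal.Length
```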
If you focus on the column MeanDecreaseGini, this is another indicator of variable importance: it gives the total decrease in node impurity from splits on that variable, averaged over all trees in the forest, where impurity is measured by the Gini index.
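For class proportions $p_k$ in a node, the Gini impurity is $1 - \sum_k p_k^2$; it is 0 for a pure node and largest when classes are evenly mixed. A minimal sketch:

```r
# Gini impurity of a set of class labels: 1 - sum of squared class proportions
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}
gini(iris$Species)        # three balanced classes: 1 - 3*(1/3)^2 = 2/3
gini(rep("setosa", 50))   # a pure node: 0
```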
I hope it is clear how these two measures are different from the coefficient estimates in a logistic regression. They do not signify positive or negative impact on the class label. They judge how much class discriminatory information each variable contains.
You can interpret this as Petal.Width and Petal.Length being the most useful variables for the classification task. Knowing these two variables for an observation (plant) decreases uncertainty and helps us make more accurate predictions.
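The randomForest package also ships a convenience plot that shows both importance columns side by side, which makes this ranking immediately visible (assuming the rf.iris model fitted above is still in your session):

```r
# Dotcharts of MeanDecreaseAccuracy and MeanDecreaseGini for the fitted forest
varImpPlot(rf.iris)
```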
One thing to be careful about is that, when computing the importances, this technique looks at the variables individually. In some cases it may be that, for instance, Sepal.Length does not contain much class-discriminatory information on its own, but when combined with Sepal.Width it does carry a lot of information. This is not the case here, but it is worth keeping in mind.
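A tiny synthetic illustration of that caveat (hypothetical XOR-style data, not iris): neither predictor is informative on its own, yet together they determine the class exactly.

```r
# XOR-style data: each predictor alone is useless, the pair is decisive
set.seed(2)
x1 <- rbinom(200, 1, 0.5)
x2 <- rbinom(200, 1, 0.5)
y  <- factor(as.integer(xor(x1, x2)))
table(x1, y)                   # roughly 50/50 within each value of x1
table(interaction(x1, x2), y)  # classes perfectly separated once combined
```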
This last concept is discussed thoroughly in Sections 2.3 and 2.4 of this brilliant feature selection paper by Guyon et al.
Best Answer
Suppose all your covariates are continuous. Each "Coef" represents the log odds ratio when the corresponding covariate is increased by 1 unit, holding the other covariates constant; "EXP(Coef)" is therefore the ordinary odds ratio.
To answer your first question: ideally you should not consider "Coef" or "EXP(Coef)" alone; the standard error must also be taken into account. But if we assume the parameters are all estimated equally accurately, then for "Coef" you should compare magnitudes, i.e. absolute values. For "EXP(Coef)" you should compare its "distance" from 1 (an odds ratio of 1 means independence): the farther the number is from 1, in either direction, the stronger the association.
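To make this concrete, here is a hedged sketch of a binary logistic fit (the variables and data are illustrative, not from your model; summary() also reports the standard errors mentioned above):

```r
# Binary logistic regression on two iris species, variables chosen for illustration
df  <- droplevels(subset(iris, Species != "setosa"))
fit <- glm(Species ~ Petal.Length + Sepal.Width, data = df, family = binomial)
coef(fit)                  # "Coef": log odds ratios per 1-unit increase
exp(coef(fit))             # "EXP(Coef)": odds ratios; judge each by its distance from 1
summary(fit)$coefficients  # includes the standard errors
```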
For the second question: "EXP(Coef)", i.e. the (estimated) odds ratio, is of course always positive, regardless of whether "Coef" is positive or negative. However, if "Coef" is negative, then "EXP(Coef)" lies in $(0, 1)$, meaning the response and the covariate are negatively associated. As above, you should compare it with 1 (instead of 0) to assess the covariate's effect. For example, 0.0159 should be considered a very strong association, since $1/0.0159 = 62.89$ is far greater than 1.
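One way to compare effect sizes on the odds-ratio scale is to invert any odds ratio below 1 first, so everything is measured as distance from independence in the same direction (the values here are hypothetical, apart from the 0.0159 discussed above):

```r
# Put odds ratios on a common "strength" scale by inverting those below 1
or <- c(a = 0.0159, b = 2.5)       # hypothetical odds ratios
strength <- ifelse(or < 1, 1 / or, or)
strength  # a: 62.89..., b: 2.5 -> covariate "a" has the stronger association
```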