Solved – How to interpret the coefficients returned by cv.glmnet? Are they feature importances?

Tags: glmnet, logistic, multinomial-distribution, r

I have a large amount of vegetation data that has been broken down into 13 habitat classes. I am trying to determine which vegetation tends to occur in, or be absent from, which habitat with any sort of significance. I have been pointed toward multinomial logistic regression, specifically using glmnet (as I have approximately 200 variables and only about 260 observations).

Running cv.glmnet using the code:

cv <- cv.glmnet(data, Class, family = "multinomial", nfolds = 50, standardize = FALSE)

I get a list of numbers that I am struggling to understand, however I found the code:

coef(cv, s=cv$lambda.1se)

Which returns the coefficients for each variable for each habitat class at lambda.1se — the largest lambda whose cross-validated error is within one standard error of the minimum (which, as far as I can tell, is the generally accepted choice of lambda).

(Intercept)                                              0.7914263664   
Salix                                                    0.0000000000  
Mash                                                     0.0000000000   
Pin                                                      0.0000000000   
Choke                                                    .          
Betula                                                   0.0025260258   
Ideae                                                    0.0000000000   
Leather                                                  0.0000000000

What I'm wondering is: using these coefficients, is it possible to state that the values with the largest magnitude (farthest from zero in either direction) are the most important in defining that class, that those close to 0 are unimportant, and that those shown as a period were removed by cv.glmnet? So in this case the plant "Betula" would be more influential than all the others, and "Choke" was so uninfluential that it was removed? Also, I have no idea what the intercept means, but I imagine I can figure that one out on my own.

Best Answer

First of all, any variable whose coefficient has been shrunk to exactly zero has been dropped from the model, so you can say it was unimportant. (A "." in the printout also denotes zero — glmnet returns the coefficients as a sparse matrix — so "Choke" was dropped too.)

Second of all, you can't really make inferences about the relative importance of coefficients unless you scaled the predictors prior to the regression, such that they all had the same mean and standard deviation (and even then you have to be careful!). If your variables are unscaled, variables measured on larger scales will tend to have smaller coefficients for the same effect, so magnitudes aren't comparable. Note that you passed standardize = FALSE, which turns off the internal standardization glmnet performs by default.
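One way to get comparable magnitudes is to standardize the predictors yourself before fitting. A minimal sketch, assuming (as in your code) that `data` is a numeric matrix and `Class` is the habitat factor:

```r
library(glmnet)

X <- scale(data)  # center each column to mean 0 and scale to sd 1

cv_std <- cv.glmnet(X, Class, family = "multinomial", standardize = FALSE)

# Coefficients at lambda.1se; for a multinomial fit, coef() returns
# one sparse column vector per class
cf <- coef(cv_std, s = "lambda.1se")

# Rank predictors within one class (here the first) by absolute size,
# dropping the intercept, which has no importance interpretation
b <- as.matrix(cf[[1]])[-1, 1]
head(sort(abs(b), decreasing = TRUE))
```

With all columns on the same scale, comparing absolute coefficient sizes within a class is at least a defensible first pass at "importance".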

Another option would be to bootstrap sample your data, fit a model to each sample, and calculate confidence intervals around your coefficients.
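A rough sketch of that bootstrap, again assuming `data` and `Class` as in your code (the names here are illustrative). Be warned that refitting cv.glmnet on every resample is slow, and bootstrap intervals for lasso-penalized coefficients should be interpreted cautiously:

```r
library(glmnet)

set.seed(1)
B <- 200  # number of bootstrap resamples

boot_coefs <- replicate(B, {
  i   <- sample(nrow(data), replace = TRUE)  # resample rows with replacement
  fit <- cv.glmnet(data[i, ], Class[i], family = "multinomial")
  # coefficients for the first class at lambda.1se
  as.matrix(coef(fit, s = "lambda.1se")[[1]])[, 1]
})

# Percentile 95% interval for each class-1 coefficient
ci <- apply(boot_coefs, 1, quantile, probs = c(0.025, 0.975))
```

Coefficients whose interval excludes zero across resamples are the ones you can most comfortably call influential.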

Finally, how are you choosing the "alpha" parameter for your model? Alpha controls the mix between ridge (alpha = 0) and lasso (alpha = 1) penalties; glmnet defaults to 1, and cv.glmnet does not tune it for you.
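If you want to choose alpha by cross-validation as well, a common pattern is to fix the fold assignments via cv.glmnet's foldid argument so the candidate alphas are compared on the same folds. A sketch under the same assumptions about `data` and `Class`:

```r
library(glmnet)

set.seed(1)
# Fixed 10-fold assignment, reused for every alpha
foldid <- sample(rep(1:10, length.out = nrow(data)))
alphas <- c(0, 0.25, 0.5, 0.75, 1)

cv_err <- sapply(alphas, function(a) {
  fit <- cv.glmnet(data, Class, family = "multinomial",
                   alpha = a, foldid = foldid)
  min(fit$cvm)  # best cross-validated deviance for this alpha
})

alphas[which.min(cv_err)]  # alpha with the lowest CV error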