Solved – How to compare coefficients of a negative binomial regression for determining relative importance

Tags: modeling, negative-binomial-distribution, r, regression, standardization

I'm working in R, using glm.nb (from the MASS package) to model count data with a negative binomial regression model. I'd like to compare the relative importance of my predictor variables in terms of their impact on the response variable (note: the predictors have quite different scales, sometimes differing by orders of magnitude). Unfortunately, the output from R gives me results as unstandardized (b) coefficients ("estimates"). I'm hoping someone can give me a hint as to how to get standardized (beta) coefficients from the NB regression model… or another, better way to determine the relative importance of each of my predictors for my response variable.

I've investigated several potential ways like:

  1. using the R package 'relaimpo' (as suggested in a comment to https://stats.stackexchange.com/a/7118), but it does not support NB regression models, so I would have to switch to a different model type, completely changing the assumptions I should be accounting for and making the outcomes very different;
  2. mean-centering and scaling my data, which changes the interpretation and rules out the NB model altogether, because the response variable then takes negative values;
  3. scaling only (no centering), so that I can still run a NB model… which I thought would only affect the scale of the coefficients without changing their direction (viz., https://stats.stackexchange.com/a/29784) – but some of the coefficients flip from positive to negative and vice versa, which seems strange to me and makes me wonder whether I'm making a mistake (see the sketch just after this list).
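For reference, here is a minimal sketch of that scaling-only idea applied to the predictors alone, leaving the count response untouched (the data and predictor names follow the script below; myData_std and nb_std are just placeholder names, and centering is optional):

library("MASS")
## Scale the predictors only; the response stays as raw counts, so the
## negative binomial assumptions are unchanged. With the defaults
## (center = TRUE, scale = TRUE), each coefficient is per one-SD change
## in that predictor.
preds = c("predictor1", "predictor2", "predictor3")
myData_std = myData
myData_std[preds] = scale(myData_std[preds])

nb_std = glm.nb(responseCountVar ~ predictor1 + predictor2 + predictor3,
                data = myData_std, control = glm.control(maxit = 125))
summary(nb_std)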

I've benefited from looking at "When conducting multiple regression, when should you center your predictor variables & when should you standardize them?" (and the links suggested in comments on that question, such as http://andrewgelman.com/2009/07/when_to_standar/, "When and how to use standardized explanatory variables in linear regression", and "Variables are often adjusted (e.g. standardised) before making a model – when is this a good idea, and when is it a bad one?").

Bottom line: I have not yet found a way to use a NB model in R (which I have statistically confirmed is more appropriate than lm, glm, or poisson models for my data) and still get at the relative importance – or at least the standardized beta coefficients – of my predictors…

The R script is something like this:

library("MASS")
nb = glm.nb(responseCountVar ~ predictor1 + predictor2 + 
  predictor3, data=myData, control=glm.control(maxit=125))
summary(nb)

scaled_nb = glm.nb(scale(responseCountVar, center = FALSE) ~ scale(predictor1, center = FALSE) + scale(predictor2, center = FALSE) + 
  scale(predictor3, center = FALSE), data=myData, control=glm.control(maxit=125))
summary(scaled_nb)

Best Answer

First you'd have to figure out what change in one variable is "equal" to what change in another. The usual standardization uses the standard deviation, but that may or may not be ideal. It may not be possible to figure this out at all, particularly if the IVs are related to each other, in which case a change in one would go along with a change in another.
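For instance, if you settle on one standard deviation as the "equal" change, one rough sketch (using the question's variable names and the unscaled fit nb; purely illustrative, not the only convention) is to rescale each coefficient by its predictor's SD:

## Change in the linear predictor (log of the expected count) per
## one-SD change in each predictor, from the unscaled fit 'nb'.
sds = sapply(myData[c("predictor1", "predictor2", "predictor3")], sd, na.rm = TRUE)
beta_per_sd = coef(nb)[names(sds)] * sds
sort(abs(beta_per_sd), decreasing = TRUE)  # crude ranking by magnitude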

Once you've figured that out, you can get the predicted values from various combinations of the IVs, varying each by the amount you thought was "equal" in the first step.
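A hedged sketch of that, again treating one SD as the "equal" change and holding the other predictors at their means (names follow the question's script):

## Predicted counts at each predictor's mean vs. its mean + 1 SD,
## with the other predictors held at their means.
baseline = data.frame(predictor1 = mean(myData$predictor1),
                      predictor2 = mean(myData$predictor2),
                      predictor3 = mean(myData$predictor3))
for (v in c("predictor1", "predictor2", "predictor3")) {
  shifted = baseline
  shifted[[v]] = shifted[[v]] + sd(myData[[v]])
  cat(v, ": ",
      predict(nb, newdata = baseline, type = "response"), " -> ",
      predict(nb, newdata = shifted, type = "response"), "\n", sep = "")
}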

Another thing to do is to graph the predicted results as the independent variables change in value.
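For example, a minimal sketch for predictor1, with the other two predictors fixed at their means:

## Predicted counts across the observed range of predictor1,
## with predictor2 and predictor3 fixed at their means.
grid = data.frame(predictor1 = seq(min(myData$predictor1),
                                   max(myData$predictor1), length.out = 100),
                  predictor2 = mean(myData$predictor2),
                  predictor3 = mean(myData$predictor3))
grid$fit = predict(nb, newdata = grid, type = "response")
plot(fit ~ predictor1, data = grid, type = "l",
     xlab = "predictor1", ylab = "Predicted count")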