Solved – Interpreting negative binomial regression with log transformed independent variables

Tags: incidence-rate-ratio, lognormal distribution, negative-binomial-distribution, regression

My independent variables were highly skewed, so I log transformed them to normalise their distributions. Since there were zeros in the data, I added 1 to each variable before taking the log. This is what the model looks like (negative binomial regression):

Dependent_var ~ log(Independent_var_1 + 1) + log(Independent_var_2 + 1)

Coefficients:

                            Est.       Std. Err.  z-value   sig.
log(Independent_var_1 + 1)  0.031907   0.004701   6.787 1.14e-11 ***
log(Independent_var_2 + 1) -0.019007   0.004735  -4.015 5.96e-05 ***

IRRs:

log(Independent_var_1 + 1)  1.0324219
log(Independent_var_2 + 1)  0.9811724

Now, I'm having problems understanding how to interpret the results. If the data were not log transformed, I would interpret this as follows:

If everything else is held constant, a one-unit increase in Independent_var_1 would result in an increase of 0.032 in the log of the expected Dependent_var. And for IRRs: a one-unit increase in Independent_var_1 would result in an expected increase of Dependent_var by a factor of 1.032 (everything else constant).

However, I'm confused since I don't have "units" anymore, but log transformed vars.
Thanks.

Best Answer

The interpretation of coefficients associated with log-transformed independent variables is straightforward. You now have log units, which depend on the choice of base for the logarithm. For the natural log, as in your example, an $e$-fold change in (Independent_var_1 + 1) is associated with the indicated change in the dependent variable: the expected count is multiplied by the IRR, here 1.032. It might be simpler for a reader to understand if you use base-2 or base-10 logarithms, so that the regression coefficient represents a doubling or a 10-fold increase in (Independent_var_1 + 1).
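You do not have to refit the model to report a doubling or a 10-fold change. With a natural-log predictor and a log link, multiplying (Independent_var_1 + 1) by k multiplies the expected count by k raised to the coefficient. Here is a minimal Python sketch of that arithmetic, using the coefficients from the question; the function name `rate_ratio_for_fold_change` is made up for illustration and is not part of any library.

```python
import math

# Coefficients copied from the question's output (predictors on the natural-log scale).
coefs = {
    "log(Independent_var_1 + 1)": 0.031907,
    "log(Independent_var_2 + 1)": -0.019007,
}

def rate_ratio_for_fold_change(beta, fold):
    """Factor by which the expected count changes when (x + 1) is multiplied by `fold`.

    Assumes a log link and a natural-log transformed predictor, so that
    log E[y] = ... + beta * ln(x + 1), and hence the ratio is fold ** beta.
    """
    return fold ** beta

results = {}
for name, beta in coefs.items():
    results[name] = {
        "e_fold": rate_ratio_for_fold_change(beta, math.e),   # same as exp(beta), the reported IRR
        "doubling": rate_ratio_for_fold_change(beta, 2),
        "ten_fold": rate_ratio_for_fold_change(beta, 10),
    }
    print(name, results[name])
```

For Independent_var_1 this gives roughly 1.022 for a doubling and 1.076 for a 10-fold increase of (Independent_var_1 + 1), compared with 1.032 for the $e$-fold change that the IRR describes. Note that these are fold changes in (x + 1), not in x itself, because of the added 1.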

You might want to consider the suggestion provided here for an alternate way of dealing with the 0-value problem in logarithmic transformations, which handles cases having 0 values of an independent variable separately from cases having positive values.
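The linked suggestion is not reproduced here, so the following is only one common way to treat zeros separately, and an assumption about what was meant: use an indicator for "value is zero" plus the log of the value for positive cases, with the log term set to 0 when the value is zero. The helper name `zero_aware_terms` and the sample values are hypothetical.

```python
import math

def zero_aware_terms(x):
    """Split one non-negative predictor value into two regression terms.

    Returns (is_zero, log_positive):
      is_zero      -- 1 if x == 0, else 0 (gets its own coefficient)
      log_positive -- ln(x) for x > 0, and 0.0 for x == 0 (no "+ 1" needed)
    """
    if x < 0:
        raise ValueError("predictor must be non-negative")
    if x == 0:
        return 1, 0.0
    return 0, math.log(x)

# Hypothetical predictor values, including zeros.
values = [0, 1, 5, 0, 120]
design = [zero_aware_terms(v) for v in values]
for v, (is_zero, log_pos) in zip(values, design):
    print(v, is_zero, round(log_pos, 4))
```

With both columns in the model, the coefficient on the log term keeps its fold-change interpretation among the positive values, while the indicator's coefficient compares zero cases against cases with a value of 1 (where the log term is also 0).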
