Solved – Comparing log-log regression to Poisson regression

poisson-distribution, poisson-regression, regression

Let's say I have a random variable $y$ that is expected to follow:

$$y_i \sim \text{Poisson}(\lambda_i)$$

$$\log(\lambda_i) = \beta_0 + \beta_1 x_i$$

But, for various reasons, I am running the following linear regression using OLS:

$$\log(y_i) = \beta_0 + \beta_1\log(x_i)$$

Given the log-log transformation, I could then say "a 1% change in $x$ causes a $\beta_1$% change in $y$". Switching back to the Poisson context, how would I set up a regression model where a similar interpretation (a 1% change in $x$ causes a $\beta_1$% change in $y$) is possible?

Since the log is the common link function for a Poisson GLM, I do not need to transform the left-hand side of the equation. So transforming $x$ should be sufficient?
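A quick numeric check suggests yes. The sketch below simulates data from $\log(\lambda_i) = \beta_0 + \beta_1\log(x_i)$ and fits a Poisson regression with a hand-rolled IRLS loop (all parameter values and the use of plain numpy rather than a GLM library are my own illustrative choices):

```python
import numpy as np

# Simulated example: log(lambda_i) = beta0 + beta1 * log(x_i),
# so beta1 is the elasticity. All numbers are illustrative assumptions.
rng = np.random.default_rng(0)
n = 5_000
x = rng.uniform(1.0, 10.0, n)
beta_true = np.array([0.5, 0.8])
lam = np.exp(beta_true[0] + beta_true[1] * np.log(x))
y = rng.poisson(lam)

# Hand-rolled Poisson regression with a log link, fit by
# iteratively reweighted least squares (Fisher scoring).
X = np.column_stack([np.ones(n), np.log(x)])
beta = np.zeros(2)
for _ in range(25):
    eta = X @ beta
    mu = np.exp(eta)
    z = eta + (y - mu) / mu          # working response for the log link
    W = mu                           # IRLS weights: (dmu/deta)^2 / Var(Y) = mu
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))

print(beta)  # close to [0.5, 0.8]
```

The recovered `beta[1]` reads exactly like the OLS log-log coefficient: a 1% change in $x$ corresponds to roughly a $\beta_1$% change in the expected count.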

Best Answer

The transform on $X$ is not the key difference between the two methods: as you have noticed, you can use $\log(x)$ as a regressor in Poisson regression without any problem. The essential difference concerns $Y$: transforming the response is not the same as using a link function (as in a GLM). You see the difference clearly when you write each model as a conditional mean.

The log-transformed linear model is:

$$E(\log(Y)|X)=\beta_0+\beta_1X$$

The GLM with a log link (as in Poisson regression) is:

$$\log(E(Y|X))=\beta_0+\beta_1X$$

Even though they look similar, they are not the same at all, because $\log$ is not linear: since $\log$ is concave, Jensen's inequality gives $E(\log(Y)|X) \le \log(E(Y|X))$, with strict inequality whenever $Y$ has any variability.
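The gap between the two quantities is easy to see by Monte Carlo. The sketch below uses $\lambda = 5$ (an arbitrary choice) and drops the rare zeros so that $\log(y)$ is defined, which is itself an approximation:

```python
import numpy as np

# Monte Carlo check that E(log Y) < log E(Y) for a Poisson variable.
rng = np.random.default_rng(1)
y = rng.poisson(5.0, size=200_000)
y_pos = y[y > 0]                     # drop the rare zeros so log(y) is defined

mean_of_log = np.log(y_pos).mean()   # estimates E(log Y) (given Y > 0)
log_of_mean = np.log(y_pos.mean())   # estimates log E(Y) (given Y > 0)
print(mean_of_log, log_of_mean)      # the first is clearly smaller
```

Here `mean_of_log` is what the log-transformed linear model targets, while `log_of_mean` is what the Poisson GLM targets; they differ by roughly 0.1 on the log scale in this setting.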

If you are interested in an unbiased estimate of $E(Y|X)$, then the GLM is the model to choose. With a log-transformed linear model there is a (usually strong) bias. The subtlety lies in the way the noise ($\epsilon$) is transformed nonlinearly: the noise has mean 0 on the log scale, but transforming back through the exponential modifies the mean of the estimate.