Solved – Multiplicative error and additive error for generalized linear model

Tags: gamma distribution, generalized linear model, lognormal distribution

If the following generalized linear model is used, how should I interpret the error term?
link function: natural log
distribution: Gamma distribution
i.e., $\ln E(Y)=X\beta$ and $E(Y)=\exp(X\beta)$
It seems that the error term should be additive:
$Y=\exp(X\beta)+\epsilon$
However, a little algebra turns this into a multiplicative form:
$Y=\exp(X\beta)+\epsilon=\exp(X\beta)\left(1+\frac{\epsilon}{\exp(X\beta)}\right)=\exp(X\beta)\psi$, letting $\psi_i=\frac{\epsilon_i}{\exp(x_i'\beta)}+1$
Thus $\epsilon$ is additive while $\psi$ is multiplicative.
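To make this concrete, here is a minimal R sketch (the variable names and simulation settings are illustrative, not from the question) that generates data with a multiplicative gamma error and recovers the coefficients with glm():

```
# Gamma GLM with log link: E(Y) = exp(b0 + b1*x), written in
# multiplicative form Y = exp(b0 + b1*x) * psi, where
# psi ~ Gamma(shape = a, rate = a) has mean 1.
set.seed(1)
n  <- 500
x  <- runif(n)
b0 <- 1; b1 <- 2; a <- 5
mu  <- exp(b0 + b1 * x)
psi <- rgamma(n, shape = a, rate = a)   # multiplicative error, E(psi) = 1
y   <- mu * psi                         # equivalently Y ~ Gamma(a, rate = a/mu)

fit <- glm(y ~ x, family = Gamma(link = "log"))
coef(fit)   # close to (b0, b1) = (1, 2)
```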

I am confused by this article:
"Multiplicative errors: Log-normal or gamma?" by Firth (1988).
Firth sets out the multiplicative-errors model for a gamma-distributed response variable.
However, after fitting the model in R or SAS, how should I interpret the error term?
Do R's glm() and SAS proc genmod give us the multiplicative-error model (as in Firth, 1988)?

Moreover, why is the following model not available in R or SAS?
distribution: lognormal distribution
link function: identity
It seems (in much of the literature) that we have to log-transform the response
and apply the following model (equivalently, use OLS; a minimal sketch follows the specification below):
distribution: normal distribution
link function: identity
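For reference, that standard log-transform approach in R (y and x are placeholder names):

```
# OLS on the log scale: if Y is lognormal, log(Y) is normal,
# so lm() on log(y) is exactly this normal/identity model.
fit_ols <- lm(log(y) ~ x)
summary(fit_ols)
```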
However, for a direct comparison with the gamma model (log link), we would have to fit the following model:
distribution: lognormal distribution
link function: natural log
But this model cannot be estimated with R's glm() or SAS proc genmod.

After some research on the topic of the "natural exponential family",
http://en.wikipedia.org/wiki/Natural_exponential_family
I guess that the algorithms in R's glm() and SAS proc genmod cannot fit models whose distribution is not in the natural exponential family (at least when Fisher scoring is used rather than quasi-likelihood).
It seems that quasi-likelihood must be used to fit the lognormal model without log-transforming the response; a sketch of what that might look like is below.
However, this is all guesswork on my part. I need some references (journal articles, books, etc.) to cite in order to convince others.
Or a more rigorous proof would be helpful.
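For what it's worth, a minimal sketch of the quasi-likelihood route in R: only the link and the mean-variance relationship are specified (log link, variance proportional to $\mu^2$, which matches the lognormal's constant coefficient of variation). This estimates the parameters of $E(Y)=\exp(X\beta)$; it is not a full lognormal likelihood fit:

```
# Quasi-likelihood: specify only the link and variance function,
# not a full distribution. Var(Y) proportional to mu^2 matches the
# lognormal's constant coefficient of variation.
fit_quasi <- glm(y ~ x, family = quasi(link = "log", variance = "mu^2"))
summary(fit_quasi)
```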

Best Answer

With GLMs, it's generally best not to think of them as "conditional mean + error" models but as "conditional distribution" models.

In the case of the Gamma model, note that the variance is proportional to the square of the mean. If you really want to write an error term and you have a log link, you can write the model either with an additive error on the log scale (with constant variance) or with a multiplicative error on the original scale (but with non-constant variance). I wouldn't write it as an additive model on the original scale.
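A quick numerical illustration of that mean-variance relationship (my own simulation, with an arbitrary shape parameter): for a gamma with fixed shape, the standard deviation scales with the mean, so the coefficient of variation stays constant:

```
# Gamma(shape = a, scale = mu/a) has E(Y) = mu and Var(Y) = mu^2 / a,
# so sd(Y)/mean(Y) = 1/sqrt(a) no matter how large mu is.
set.seed(2)
a <- 4
for (mu in c(1, 10, 100)) {
  y <- rgamma(1e5, shape = a, scale = mu / a)
  cat(sprintf("mu = %3.0f  mean = %7.2f  sd = %6.2f  cv = %.3f\n",
              mu, mean(y), sd(y), sd(y) / mean(y)))
}
```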

The log of a gamma random variable isn't at all bad to deal with, so the additive-error version on the log scale is fairly convenient, if you want to work with an error model.

Beware, however -- the additive-error version has an error term with a non-zero mean (it's also left-skewed, but that's less of a big deal). It's easy enough to compute an adjustment for that non-zero mean, though, so you can correct the bias on the log scale (or you could even fit a least-squares model to the logs and compute an adjustment for the original scale).
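To sketch that adjustment (my own illustration, with an assumed shape parameter $a$): if $Y=\mu\psi$ with $\psi\sim\text{Gamma}(a,\text{rate}=a)$, then $\log Y=\log\mu+\log\psi$, and $E(\log\psi)=\text{digamma}(a)-\log a$, a negative constant that can simply be subtracted off:

```
# Log-scale error e = log(psi), psi ~ Gamma(shape = a, rate = a).
# E(e) = digamma(a) - log(a) < 0, so the error is not mean-zero.
set.seed(3)
a   <- 5
psi <- rgamma(1e6, shape = a, rate = a)
mean(log(psi))        # simulated mean of the log-scale error
digamma(a) - log(a)   # theoretical value; subtract this to debias
```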


"It seems that quasi-likelihood must be used to fit the lognormal model without log-transforming the response."

In fact, if you want ML estimation, taking logs is basically the most sensible way to estimate the parameters of that log-normal model.

To try to do that on the original scale is just making your life hard.

See here, starting at "The MLE is also invariant with respect to certain transformations of the data." down to "For example, the MLE parameters of the log-normal distribution are the same as those of the normal distribution fitted to the logarithm of the data."
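A quick check of that invariance (a sketch using MASS::fitdistr on simulated data):

```
# The lognormal MLE (meanlog, sdlog) equals the normal MLE
# fitted to log(y).
library(MASS)
set.seed(4)
y <- rlnorm(1000, meanlog = 1, sdlog = 0.5)

fitdistr(y, "lognormal")$estimate    # direct lognormal MLE
c(meanlog = mean(log(y)),            # normal MLE on the logs
  sdlog = sqrt(mean((log(y) - mean(log(y)))^2)))  # ML uses 1/n
```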


Note that the lognormal and gamma aren't the only distributions suitable for a model that's linear in the logs with constant variance on the log scale; they just happen to both be quite convenient.