Put log(Y) as a dependent variable in a count data model

count-data, data transformation, generalized linear model, modeling, poisson distribution

I have count data (passengers) as Y. The data look like this [image], and many of the values are 1 (about 18%).

Does it make sense to take the log of Y and use log Y as the dependent variable in a generalized linear model with a Poisson distribution?

I know the link function for the Poisson distribution is the log. Is it a problem that I am effectively taking the log of Y twice? What puzzles me is that my log(Y) model has a much better goodness-of-fit statistic than my Y model. I tried several Poisson and negative binomial models, and they do not fit very well.

What other strategies may I try to model this data?

Best Answer

You can't apply a Poisson model to the variable called logP on your graph because it includes non-integers. A Poisson model can only be used for non-negative integers. Your software will probably still fit it and give you output that looks interpretable, but you are not really using a Poisson model.
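To make the point concrete, here is a minimal sketch using only the Python standard library. The counts are made up for illustration (they are not the asker's data), and `is_poisson_support` is a hypothetical helper name. It checks whether every value is a non-negative integer, which is the only kind of value a Poisson variable can take.

```python
import math

# Hypothetical passenger counts (illustrative only, not the asker's data).
counts = [1, 1, 2, 3, 5, 8, 13, 40]

# Log-transformed response, as proposed in the question.
log_counts = [math.log(y) for y in counts]

def is_poisson_support(values):
    """Return True if every value is a non-negative integer."""
    return all(v >= 0 and float(v).is_integer() for v in values)

counts_ok = is_poisson_support(counts)          # raw counts
log_counts_ok = is_poisson_support(log_counts)  # logged counts

print("raw counts valid for Poisson:", counts_ok)
print("log counts valid for Poisson:", log_counts_ok)
```

Only log(1) = 0 lands on an integer here; the other logged values do not, so the transformed variable falls outside what a Poisson distribution can describe.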

As @PeterFlom says, if your original variable is a count then log Y is not. If the original variable is a count and a Poisson model does not fit, then try a negative binomial model before you give up and start transforming the variable.
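One rough way to see why a negative binomial model may suit the data better is to compare the variance of the counts with their mean: a Poisson distribution assumes they are equal. The sketch below uses made-up counts (not the asker's data) and ignores covariates, so treat it as a crude first look rather than a formal test. In an actual regression the comparison is conditional on the predictors, for example through the Pearson chi-square statistic divided by the residual degrees of freedom.

```python
from statistics import mean, variance

# Hypothetical counts with many small values and a long right tail
# (illustrative only, not the asker's data).
counts = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15, 27, 50]

m = mean(counts)
v = variance(counts)   # sample variance (n - 1 in the denominator)
dispersion = v / m     # roughly 1 for Poisson-like data

print(f"mean = {m:.2f}, variance = {v:.2f}, variance/mean = {dispersion:.2f}")

if dispersion > 1:
    print("Variance exceeds the mean: overdispersed relative to Poisson.")
else:
    print("No sign of overdispersion in this crude check.")
```

A ratio well above 1 suggests overdispersion, which is the situation the negative binomial model is designed for because it adds a parameter that lets the variance exceed the mean.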