Solved – Best analysis for count data as response variable

count-data, generalized linear model, r

What is the best way to analyze a data set in which the response variable is count data and the explanatory variables are continuous? None of my variables is normally distributed. Are GLMs a good option?

Best Answer

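They are. A natural starting point is Poisson regression (in R: glm(..., family=poisson, ...)). If you have overdispersion (the variance of the counts is larger than the Poisson model implies), look at negative binomial regression; if you have "too many" zeros (more than a Poisson model would predict), look at ZIP regression (Zero-Inflated Poisson).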
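As a rough illustration of the three options above, here is a minimal sketch on simulated data. The data frame `dat` and the variables `y`, `x1`, `x2` are hypothetical stand-ins for your own data. `glm.nb` comes from the MASS package; the zero-inflated fit assumes the pscl package is installed and is skipped otherwise.

```r
# Simulated example data; dat, y, x1, x2 are hypothetical names.
set.seed(1)
n  <- 500
x1 <- rexp(n)                                # skewed continuous predictor
x2 <- runif(n)                               # second continuous predictor
mu <- exp(0.5 + 0.4 * x1 - 0.3 * x2)         # log-linear mean
y  <- rnbinom(n, mu = mu, size = 1.5)        # deliberately overdispersed counts
dat <- data.frame(y, x1, x2)

# 1. Poisson regression (log link is the default for family = poisson)
fit_pois <- glm(y ~ x1 + x2, family = poisson, data = dat)

# 2. Informal overdispersion check: Pearson chi-square / residual df.
#    Values well above 1 suggest the Poisson variance assumption is too tight.
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)
print(dispersion)

# 3. Negative binomial regression as the overdispersed alternative
library(MASS)
fit_nb <- glm.nb(y ~ x1 + x2, data = dat)

# Compare the fits; a lower AIC indicates the better-supported model
print(AIC(fit_pois, fit_nb))

# 4. Zero-inflated Poisson, only if pscl is available (an assumption).
#    The "| 1" part models the excess-zero probability with an intercept only.
if (requireNamespace("pscl", quietly = TRUE)) {
  fit_zip <- pscl::zeroinfl(y ~ x1 + x2 | 1, data = dat, dist = "poisson")
  print(AIC(fit_zip))
}
```

The dispersion ratio is only a quick diagnostic, not a formal test; which model is appropriate for real data also depends on how the counts were generated.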

Whether the predictors are normally distributed does not matter (except for analyses of influential data points). What you probably have in mind is whether the residuals are normally distributed. That is an important assumption in Ordinary Least Squares, or more specifically for inference in OLS. However, your data are counts, so the residuals will not be normal, and you are not thinking about OLS anyway.
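To make the point about predictors concrete, here is a small simulation sketch. The predictor `x`, the coefficients in `true_beta`, and the sample size are all made up for illustration. The predictor is clearly non-normal, yet the Poisson GLM still recovers the coefficients used to generate the counts.

```r
# Hypothetical simulation: a strongly right-skewed (log-normal) predictor.
set.seed(42)
n <- 2000
x <- rlnorm(n, meanlog = 0, sdlog = 0.5)

# Counts generated from a Poisson model with known coefficients
true_beta <- c(0.2, 0.3)                     # intercept, slope (made up)
y <- rpois(n, lambda = exp(true_beta[1] + true_beta[2] * x))

# The predictor fails a normality test (small p-value) ...
normality_p <- shapiro.test(x)$p.value

# ... but the Poisson GLM makes no distributional assumption about x
fit <- glm(y ~ x, family = poisson)
est <- unname(coef(fit))

print(normality_p)
print(round(cbind(truth = true_beta, estimate = est), 3))
```

The model conditions on the predictors, so only the conditional distribution of the response given the predictors has to be specified.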
