Solved – GLM vs square root data transformation

Tags: anova, count-data, data-transformation, generalized-linear-model, variance-stabilizing

I am currently analysing some pretty awful/awkward data on the abundance of fish under three different "Hydro-Regimes" (five abundance measurements for each regime: Short/Medium/Long). The current analysis plan had been a one-way ANOVA.

Looking at the residuals-vs-fitted and normal Q-Q plots after fitting a linear model indicates right skew. To correct this, I square-root transformed the outcome variable, which gives a reasonable answer, with "short" being significantly different from "long" and "medium" after a post-hoc Tukey test. However, plotting the data shows that the error bars of "long" and "medium" do not overlap.
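For concreteness, here is a minimal sketch of that workflow (square-root transform, then a one-way ANOVA F statistic) in plain Python. The counts are hypothetical stand-ins, not the poster's data, and in practice you would use a stats package rather than hand-computing the sums of squares:

```python
import math

# Hypothetical abundance counts (NOT the poster's data), five per regime
groups = {
    "short":  [3, 1, 4, 2, 2],
    "medium": [12, 9, 15, 11, 14],
    "long":   [22, 18, 25, 30, 21],
}

# Square-root transform each observation (the classical variance-stabilizer)
t = {g: [math.sqrt(x) for x in xs] for g, xs in groups.items()}

k = len(t)                                # number of groups
n = sum(len(xs) for xs in t.values())     # total observations
grand = sum(x for xs in t.values() for x in xs) / n

# One-way ANOVA: between-group and within-group sums of squares
ss_between = sum(len(xs) * (sum(xs) / len(xs) - grand) ** 2
                 for xs in t.values())
ss_within = sum((x - sum(xs) / len(xs)) ** 2
                for xs in t.values() for x in xs)

# F = (between-group mean square) / (within-group mean square)
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_stat, 2))
```

A post-hoc Tukey HSD would then compare the group means on this transformed scale, which is exactly where the interpretation problem in the answer below arises.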

I also looked at using a Poisson GLM, which I have read is good for count and skewed data, but I am not sure if this is the right way to go. Any suggestions?

Best Answer

A Poisson model would most definitely be a sensible way to do this analysis.

Traditionally (before Poisson GLMs were available), such data were analyzed using the square root as a "variance stabilizing" transformation (i.e. so that the variance would be independent of the mean). The problem is that, when you transform the data, it becomes difficult to interpret the model coefficients: with a square root transformation, the parameters estimate differences in the square roots of counts.
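The "variance stabilizing" claim is easy to check by simulation. The sketch below draws Poisson counts (via Knuth's multiplication algorithm, an assumption made here only to keep the example stdlib-only) at several rates: on the raw scale the variance tracks the mean, while on the square-root scale the variance sits near 1/4 regardless of the mean:

```python
import math
import random

random.seed(1)

def poisson(lam):
    # Knuth's algorithm: multiply uniforms until the product drops below exp(-lam)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

for lam in (4, 25, 100):
    raw = [poisson(lam) for _ in range(20000)]
    sq = [math.sqrt(x) for x in raw]
    # variance/mean ratio on the raw scale vs variance on the sqrt scale
    print(lam, round(var(raw) / (sum(raw) / len(raw)), 2), round(var(sq), 2))
```

The raw variance-to-mean ratio stays near 1 (the Poisson mean-variance relationship), while the sqrt-scale variance is roughly constant across rates, which is the whole point of the transformation.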

A Poisson GLM allows you to exploit the mean-variance relationship in count data to get better inference. The parameters estimate ratios of rates between the treatment levels. And, given your small sample size, a parametric modelling approach with reasonable assumptions will give you relatively efficient inference.
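The rate-ratio interpretation can be seen directly: in a one-factor Poisson GLM with a log link, the fitted rate for each group is just that group's sample mean, so each coefficient is the log of a rate ratio against the baseline level. A stdlib-only sketch, again using hypothetical counts (in practice you would fit this with a GLM routine):

```python
import math

# Hypothetical abundance counts (NOT the poster's data), five per regime
groups = {
    "short":  [3, 1, 4, 2, 2],
    "medium": [12, 9, 15, 11, 14],
    "long":   [22, 18, 25, 30, 21],
}

# For a saturated one-factor Poisson GLM, the MLE rate per group is the group mean
means = {g: sum(xs) / len(xs) for g, xs in groups.items()}
baseline = means["short"]  # treat "short" as the reference level

# Each log-link coefficient is a log rate ratio versus the baseline,
# so exp(coefficient) is directly a ratio of abundance rates
coefs = {g: math.log(m / baseline) for g, m in means.items()}
for g, b in coefs.items():
    print(g, "coef:", round(b, 3), "rate ratio:", round(math.exp(b), 2))
```

Reporting "medium has X times the abundance rate of short" is usually far more useful to a reader than a difference in square roots of counts.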
