GLM post hoc with non-parametric tests

Tags: generalized linear model, heteroscedasticity, kruskal-wallis test, post-hoc

I have a question about the appropriate comparisons for independent samples (one factor with 3 levels). The overall sample size is N=546 (subsamples: 218, 228, and 100); convenience sampling, stratified.

I use ANOVA with post hoc Tukey, if Levene indicates variance homogeneity.

My question: what should I do if the variances are unequal?

a. Using Kruskal-Wallis? For post hoc: Do I use Mann-Whitney for pairwise comparisons (if so: what about the alpha inflation?)?

b. Still using GLM, but with non-parametric post hocs? If so, which one: Tamhane, Dunnett, or Games-Howell? What about F, can I still use it? Isn't H then the more appropriate statistic? And why does GLM offer the option of running non-parametric tests at all if homoscedasticity is an assumption? (I am confused.)

Thanks for any input on this one

Best Answer

The question seems to rely on a mistaken notion$^\dagger$.

A generalized linear model (GLM) does not in general assume constant variance.

Instead, there's an assumed variance function, $v(\mu)$, that relates the variance to the mean, $\text{Var}(Y_i)= \phi\,v(\mu_i)$, based on the particular distribution family in the exponential-family class of distributions.
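To make the variance function concrete, here is a minimal sketch of $v(\mu)$ for a few standard families (Gaussian, Poisson, binomial proportion, Gamma). The names `VARIANCE_FUNCTIONS` and `glm_variance` are made up for this illustration and do not come from any particular package.

```python
# Variance functions v(mu) for a few common GLM families.
# Names here are illustrative only, not a library API.
VARIANCE_FUNCTIONS = {
    "gaussian": lambda mu: 1.0,               # constant variance
    "poisson": lambda mu: mu,                 # variance equals the mean
    "binomial": lambda mu: mu * (1.0 - mu),   # mu is a proportion in (0, 1)
    "gamma": lambda mu: mu ** 2,              # constant coefficient of variation
}


def glm_variance(family, mu, phi=1.0):
    """Return Var(Y) = phi * v(mu) for the named family."""
    return phi * VARIANCE_FUNCTIONS[family](mu)


# Show how the implied variance changes with the mean in each family.
for mu in (0.2, 0.5, 0.8):
    row = {fam: round(glm_variance(fam, mu), 3) for fam in VARIANCE_FUNCTIONS}
    print(mu, row)
```

Only the Gaussian family implies a variance that does not change with the mean; in the others, unequal group means already imply unequal group variances.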

So in the generalized linear model, interest focuses not on constant variance, but on a correctly specified variance function*.

* as in any model, George Box's famous aphorism applies - so we don't generally believe a variance function to be exactly correct, just a close enough description that the resulting inferences will be good enough for our particular purposes.

As a result a formal test of correctly specified variance doesn't really make sense, since it's answering a question we already know the answer to (no, it's not exactly correct), and any sufficiently large sample would tell us so.

Further, and more practically, even where the variance may be misspecified (to an extent where the effect is substantial), choosing your procedures on the basis of a formal test of assumptions may be less advisable than simply not making an assumption you're not comfortable with. In the case of normal models at least, a number of papers indicate that it's better to use a procedure that doesn't assume constant variance from the outset than to choose between procedures on the basis of a test.
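As one illustration of a normal-theory comparison that does not assume constant variance, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom for each pair of groups. Welch's test is my choice of example, not something the answer prescribes. The data are simulated (only the group sizes 218, 228 and 100 are borrowed from the question), and `welch_t` is a name invented for this example. P-values would come from a t distribution with the computed degrees of freedom and would still need a multiplicity adjustment across the three pairs; neither step is shown here.

```python
import math
import random
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    # Squared standard errors of the two group means (no pooling).
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Simulated groups with unequal spreads; sizes mirror the question only.
random.seed(0)
groups = {
    "A": [random.gauss(0.0, 1.0) for _ in range(218)],
    "B": [random.gauss(0.0, 2.0) for _ in range(228)],
    "C": [random.gauss(0.5, 3.0) for _ in range(100)],
}

names = list(groups)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        t, df = welch_t(groups[names[i]], groups[names[j]])
        print(f"{names[i]} vs {names[j]}: t = {t:.3f}, df = {df:.1f}")
```

Because each comparison uses its own two variance estimates, a group with a large spread does not distort comparisons between the other groups.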


$\dagger$(or, just possibly, a difference from the usual terminology, in which case your intent should be made more explicit)
