Solved – General Linear Model Univariate with unequal variances – what are the options

heteroscedasticity, levenes-test, multiple regression

I'm using SPSS to run a GLM (general linear model) univariate with 1 fixed factor (Treatment) and one random factor (experimental replicate). There are 4 treatment groups. The measurement is number of cells per embryo. The Levene's Test for Equality of Error Variances is significant (P=0.000) and I can see from the Spread vs Level plot that there may be a pattern.

What are my options from here?

I have tried log-transforming my data, and that increased the P value of the Levene's Test to P=0.02, but there still appears to be a pattern in the Spread vs Level plot.

I know that I could use a post hoc test that does not assume equal variances (Tamhane's T2 or Dunnett's T3), or I could use a Kruskal-Wallis H test, but both of these are only possible with one factor, not two.

I would really appreciate any help with this!

Best Answer

For count data you certainly expect heteroskedasticity (and likely some skewness), and there are analyses designed for several kinds of count response (namely the other kind of GLM, the generalized linear model).

With count data, a log-transform will "overcompensate" for the relationship between mean and variance, leaving you with the opposite pattern to the one you started with (the larger means will now be the ones with smaller spread).
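A first-order (delta-method) approximation makes the overcompensation concrete. This is only a sketch, and it assumes purely Poisson counts, so that the variance equals the mean $\mu$; real cell counts may well be more variable than that. For a smooth transformation $g$:

$$\operatorname{Var}[g(Y)] \approx g'(\mu)^2 \operatorname{Var}(Y) = g'(\mu)^2\,\mu$$

$$g(y) = \log y: \quad \operatorname{Var}[\log Y] \approx \frac{1}{\mu^2}\,\mu = \frac{1}{\mu}$$

$$g(y) = \sqrt{y}: \quad \operatorname{Var}[\sqrt{Y}] \approx \frac{1}{4\mu}\,\mu = \frac{1}{4}$$

Under that assumption the spread of the logged counts shrinks as the mean grows, which is the reversed pattern described above, while the square root leaves an approximately constant variance.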

A fairly typical analysis for an open-ended count would tend to involve a Poisson or negative binomial generalized linear model for the count, which should explain much of the observed heteroskedasticity.
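Whether a Poisson or a negative binomial model is the more plausible choice depends largely on how much extra variation the counts show. The sketch below is a rough check, not a full analysis: it assumes only the fixed Treatment factor (the random replicate factor is ignored), it uses made-up counts, and the function name `pearson_dispersion` is purely illustrative. With a single categorical predictor the fitted Poisson mean of each observation is its group mean, so the Pearson dispersion statistic can be computed without any iterative fitting.

```python
from collections import defaultdict


def pearson_dispersion(counts, groups):
    """Pearson chi-square divided by residual df for a one-factor Poisson model.

    With one categorical predictor, the Poisson fitted value for each
    observation is its group mean. Values near 1 are consistent with Poisson
    variation; values well above 1 suggest overdispersion.
    Assumes every group has a positive mean (otherwise division by zero).
    """
    by_group = defaultdict(list)
    for y, g in zip(counts, groups):
        by_group[g].append(y)

    chi_sq = 0.0
    for ys in by_group.values():
        mu = sum(ys) / len(ys)  # fitted mean for this group
        chi_sq += sum((y - mu) ** 2 / mu for y in ys)

    resid_df = len(counts) - len(by_group)  # observations minus fitted means
    return chi_sq / resid_df


# Made-up cell counts for four hypothetical treatment groups (illustration only).
example_counts = [12, 15, 9, 14, 30, 41, 25, 36, 55, 70, 48, 62, 90, 120, 75, 105]
example_groups = ["A"] * 4 + ["B"] * 4 + ["C"] * 4 + ["D"] * 4

print(round(pearson_dispersion(example_counts, example_groups), 2))
```

A ratio noticeably above 1 would point toward a negative binomial (or quasi-Poisson) model rather than a plain Poisson one, although with a random factor in the design some of that excess may simply be replicate-to-replicate variation.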

If you must use a general linear model with a transformation, the usual one for a Poisson count would be a square root, but it's not really as good as a model actually designed for counts. Given that some counts are as low as 10, you might even consider $\sqrt{y+\frac{3}{8}}$ or a Freeman-Tukey transformation.
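A small simulation can illustrate how these transformations behave. The sketch below assumes exactly Poisson counts (real data may be overdispersed), uses arbitrary illustrative means of 20, 40 and 80, and relies on hypothetical helper names (`poisson_draw`, `spread_by_transform`). It compares the standard deviation of the raw, logged and square-rooted counts across the three means.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def spread_by_transform(means, n=5000, seed=1):
    """Return {transform name: [sample SD at each mean]} for simulated counts."""
    rng = random.Random(seed)
    transforms = {
        "raw": lambda y: y,
        # Zeros are essentially impossible at the means used here; a real
        # analysis would need an explicit rule for zero counts before logging.
        "log": lambda y: math.log(y),
        "sqrt": lambda y: math.sqrt(y),
        "sqrt(y+3/8)": lambda y: math.sqrt(y + 3 / 8),
    }
    result = {name: [] for name in transforms}
    for lam in means:
        sample = [poisson_draw(rng, lam) for _ in range(n)]
        for name, f in transforms.items():
            result[name].append(statistics.stdev([f(y) for y in sample]))
    return result


if __name__ == "__main__":
    for name, sds in spread_by_transform((20, 40, 80)).items():
        print(name, [round(s, 3) for s in sds])
```

Under these assumptions the raw standard deviations should rise with the mean, the logged ones should fall, and both square-root versions should stay close to 0.5, which is the sense in which the square root is the usual variance-stabilizing choice for Poisson counts.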

(See here for some discussion of the use of transformations with count data. There's a bit of information here that may also be helpful.)

There are lots of posts on site about the use of Poisson regression models, and other count-models, including negative binomial models.

I just realized I didn't talk about the random effect term. If you have a random effect in your model, that would suggest you might use generalized linear mixed models (GLMM). Again there are a number of posts on site about those. [I don't know whether SPSS does those but transformation may still give an adequate description of the data.]

Related Solutions

Solved – Significant output in Levene’s test for equality of variances in MANOVA; what to do

First, make sure you look at the boxplots of the residuals instead of just using Levene's tests. The significant result could be due to outliers, a bimodal distribution, or skewness that you may need to address.

To get equal variances, try log-transforming (ln) your dependent variables before you run MANOVA.

Also, there are four test statistics that can be used in MANOVA. Hotelling's trace and Pillai's criterion are the least affected by violations of assumptions, but Wilks' lambda is the most commonly used.

Solved – Significant Result in Levene’s Test

None of this follows ineluctably from the evidence you give.

"I ran Levene's test on my data and got a p-value of 0.000, meaning that variances are very heterogeneous."

Possibly so, possibly not. The result is highly significant, but that may just mean that you have a large enough sample size to allow firm rejection of the null. It could be that the difference in variances is not fatal to ANOVA.

"I transformed the data but no method can make them homogeneous."

Possibly so, possibly not. We can't tell without looking at the data and hearing what you tried. Perhaps you missed out a transformation that would help. (I've seen people try transformations that make their problem worse; that need not be you, but you don't give enough detail for us to be sure.)

"So that's it, ANOVA would be inappropriate to use. I was thinking that my data could be nonparametric so Kruskal-Wallis would be the best test to use. However, when I tried testing the data for Levene's test for nonparametric data, I still got a significant result."

Data are not parametric or non-parametric; only techniques are. That's a misuse of terminology. See notably @Glen_b's answer here. More crucially, I don't know what Levene's test for nonparametric data means. What makes you think that Kruskal-Wallis requires any such prior test?

I'd recommend that you back up and show us your data, or at least informative graphs, and tell us what interests you about them.
