I need to perform a two-way ANOVA on my data, which come from a non-normal population. Apparently there is no two- or three-factor test for non-normal populations. I realize I need to transform my data, but I'm unsure which transformation is the most appropriate. What are the criteria for choosing one from the list of possible transformations?
Solved – non-normal data for two-way ANOVA, which transformation to choose
anova
Related Solutions
When choosing a test you have to consider two important things. A: is the test reliable when the ANOVA assumptions have been violated? The question is whether the test performs well when the group sizes are different, when the population variances are very different, or when the data are not normally distributed. B: does the test control the Type I/Type II error rates? Statistical power and the Type I error rate are closely related: e.g. you can opt for a more conservative test, aiming for a small probability of a Type I error, but you will lose statistical power. It is a trade-off.
Furthermore, the Bonferroni and Tukey tests are conservative: tight control over the Type I error rate but low statistical power. Games-Howell is powerful but not appropriate for small samples; it is accurate when sample sizes are unequal. For all of them you should be careful about the ANOVA assumptions.
Moreover, you said: "My sample size is only 3 to 4 individuals per experimental group." I think this is not enough when it comes to testing the ANOVA assumptions.
This is a detailed book on the topic. Andy Field is a great teacher, and here he has a nice video on post-hoc tests. There and there are also documents relevant to your question.
Regarding your question in the comment: I could say use this test or that one, but the main idea is that you have to know them well, including the differences between them and the trade-offs. After that you have to decide on one, two or more, and you have to be able to motivate and explain your decision in relation to your research and data, not to the test 'per se'. Moreover, 'to assume' is usually not OK in statistics, so you have to test normality and all the other ANOVA assumptions. Further, IMHO ANOVA is OK but the group size is not. Given your requirements (significance level, power, number of groups, etc.) you can compute the needed sample size per group (using R, or many other free resources on the web). I would like to avoid giving you a 'cooked dish' because you won't gain anything from it, but so as not to make your life harder: if I were you I would use ANOVA with 30 individuals per group (for a 2×3 design you need n ≈ 180 individuals), and I would use Tukey, REGWQ, and Bonferroni.
If you're interested in comparing means, once you transform you end up with a comparison of things that are not means. If the right assumptions hold you can still test for a difference, but the alternative won't be location-shift.
I didn't want the details to detract from the general point.
On the other - and more important - hand, if you omit essential details you'll be more likely to end up with less useful - or even potentially misleading - answers that you won't even realize aren't the answers you need.
By leaving out the fact that you were dealing with count data, you were risking exactly that. While leaving out unnecessary detail is probably useful, knowing it's count data is pretty much central to the problem.
There are techniques for comparing means that are suitable for count data. With some more information about the kind of analysis/information you were after (even if it's what you would have done if the data were normal), we may be able to guide you better.
Transformation is less useful than doing something suited to your actual data.
Best Answer
A transformation that changes the distributional shape leaves you no longer comparing means. If you really want to compare means you may want to avoid transformation (there can be some particular exceptions where, at least with some accompanying assumptions, you can compute or approximate the means on the original scale as well).
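To make the point about means concrete, here is a minimal sketch using a log transform. The two groups and their values are made up purely for illustration. Averaging on the log scale and back-transforming gives the geometric mean, so two groups with identical arithmetic means can still differ on the transformed scale.

```python
import math
import statistics

# Hypothetical right-skewed measurements (illustrative values only).
group_a = [1.0, 2.0, 4.0, 8.0, 16.0]
group_b = [3.0, 4.0, 5.0, 6.0, 13.0]


def back_transformed_log_mean(values):
    """Average on the log scale, then map back with exp (the geometric mean)."""
    return math.exp(statistics.fmean(math.log(v) for v in values))


for name, values in (("A", group_a), ("B", group_b)):
    arithmetic = statistics.fmean(values)
    geometric = back_transformed_log_mean(values)
    print(f"group {name}: arithmetic mean {arithmetic:.2f}, "
          f"back-transformed log mean {geometric:.2f}")
```

Both groups have the same arithmetic mean, yet an analysis on the log scale would be comparing the second column, which differs between them.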
If you don't need an estimate of the difference in means on the original scale (i.e. if effect sizes aren't critical to your analysis), then full-factorial models (i.e. with all interactions present) may work well enough with transformation.
If you are happy with more general location-comparisons than just means, there are other alternatives than transformation.
If you do want to compare means there are other alternatives than transformation. I'm not saying 'never use transformation'... but 'consider alternatives'.
The claim that there is no two- or three-factor test for non-normal populations is untrue. This could be done with GLMs, for example, or via resampling.
Non-normality may not be the biggest issue you have (heteroskedasticity tends to have a bigger impact, one that doesn't diminish so nicely with sample size).
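As one illustration of the resampling route, here is a rough sketch of a restricted permutation test for the effect of one factor in a two-factor layout. It shuffles the factor-A labels within each level of factor B, which is justified under the null hypothesis that A has no effect at any level of B. The data, the function names and the choice of test statistic are all assumptions made for the example, not something taken from the question.

```python
import random
import statistics


def main_effect_stat(rows):
    """Absolute difference between the means of the two factor-A levels."""
    a0 = [y for a, _, y in rows if a == 0]
    a1 = [y for a, _, y in rows if a == 1]
    return abs(statistics.fmean(a1) - statistics.fmean(a0))


def permutation_p_value(rows, n_perm=2000, seed=0):
    """Monte Carlo p-value, permuting A labels within each level of B."""
    rng = random.Random(seed)
    observed = main_effect_stat(rows)
    b_levels = sorted({b for _, b, _ in rows})
    hits = 0
    for _ in range(n_perm):
        permuted = []
        for level in b_levels:
            stratum = [(a, y) for a, b, y in rows if b == level]
            labels = [a for a, _ in stratum]
            rng.shuffle(labels)  # reassign A labels inside this B level only
            permuted.extend((a, level, y) for a, (_, y) in zip(labels, stratum))
        if main_effect_stat(permuted) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical counts for a 2x3 design with 4 observations per cell.
cells = {
    (0, 0): [2, 3, 1, 4], (0, 1): [5, 3, 6, 4], (0, 2): [9, 7, 8, 11],
    (1, 0): [6, 8, 5, 9], (1, 1): [10, 12, 9, 14], (1, 2): [15, 19, 17, 22],
}
rows = [(a, b, y) for (a, b), ys in cells.items() for y in ys]
print("permutation p-value for factor A:", permutation_p_value(rows))
```

Testing an interaction by permutation is more delicate (it usually involves permuting residuals from a reduced model), so treat this only as a starting point.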
A nonlinear transformation will change many things. In your case, the important ones are distributional shape, variance of the transformed variables, and what means on the transformed scale correspond to on the original scale and vice versa. (In a regression situation there's also the impact on linearity of relationships)
You might choose a transformation that takes you to nearly constant variance. You might choose one that takes you to near symmetry. You might choose one that does either of those things less well, but is more interpretable.
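As a small illustration of the constant-variance criterion, the sketch below simulates Poisson counts (an assumption made here; it need not match your data) at three different means and compares the variance before and after a square-root transform. The simulation settings and the helper `poisson_draw` are made up for the example.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


rng = random.Random(1)
summary = {}
for lam in (5, 20, 80):
    counts = [poisson_draw(rng, lam) for _ in range(5000)]
    raw_var = statistics.variance(counts)
    sqrt_var = statistics.variance([math.sqrt(c) for c in counts])
    summary[lam] = (raw_var, sqrt_var)
    print(f"mean {lam:>2}: raw variance {raw_var:6.2f}, "
          f"sqrt-scale variance {sqrt_var:.3f}")
```

On the raw scale the variance grows with the mean; on the square-root scale it stays roughly constant. The price is that group averages on the square-root scale are no longer the original means.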
If you're very lucky, you might be in a situation that gets you more than one of those at once.
But again, my advice is to first consider alternatives. As a first step, you might want to investigate what could be done with GLMs.
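To give a flavour of the GLM route for counts, here is a hand-rolled sketch of a likelihood-ratio test for the interaction in a Poisson model with log link. It relies on several assumptions that I'm adding for the example: the design is balanced, it is exactly 2x3 (so the reference chi-square has 2 degrees of freedom and its tail probability is exp(-x/2)), and the counts are not overdispersed. Under those assumptions the additive model has closed-form fitted cell totals and the test reduces to a G-test on the table of cell totals. In practice you would fit this with a GLM routine in your statistics package rather than by hand.

```python
import math


def interaction_lr_test(cells):
    """LR test of the A:B interaction in a Poisson log-link model.

    `cells` maps (a_level, b_level) to a list of counts. Only valid for a
    balanced 2x3 (or 3x2) layout, as noted in the checks below.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    if len({len(ys) for ys in cells.values()}) != 1:
        raise ValueError("closed-form fit assumes equal replicates per cell")
    if (len(a_levels) - 1) * (len(b_levels) - 1) != 2:
        raise ValueError("p-value formula below assumes 2 degrees of freedom")

    totals = {key: sum(ys) for key, ys in cells.items()}
    row = {a: sum(totals[a, b] for b in b_levels) for a in a_levels}
    col = {b: sum(totals[a, b] for a in a_levels) for b in b_levels}
    grand = sum(totals.values())

    g2 = 0.0
    for (a, b), observed in totals.items():
        fitted = row[a] * col[b] / grand  # additive (no-interaction) fit
        if observed > 0:  # a zero total contributes nothing to the sum
            g2 += 2.0 * observed * math.log(observed / fitted)
    p_value = math.exp(-g2 / 2.0)  # chi-square survival function, df = 2
    return g2, p_value


# Hypothetical counts for a 2x3 design with 4 observations per cell.
counts = {
    (0, 0): [2, 3, 1, 4], (0, 1): [5, 3, 6, 4], (0, 2): [9, 7, 8, 11],
    (1, 0): [6, 8, 5, 9], (1, 1): [10, 12, 9, 14], (1, 2): [15, 19, 17, 22],
}
g2, p = interaction_lr_test(counts)
print(f"G^2 = {g2:.3f}, p = {p:.3f}")
```

Note that the model works with the log of the mean rather than the mean of the logs, which is why the fitted values stay on the original scale.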
What are the characteristics of your data? What makes you say they're non-normal? Do you have counts? Are the data highly skew*?
* note that it's not the unconditional distribution of the response that's crucial, but the conditional distribution.
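A quick simulated sketch of that footnote, with made-up group names, sizes and means: the pooled response looks clearly skewed, even though within each group the data are normal. What matters is the second quantity, the skewness of the deviations from the group means.

```python
import random
import statistics


def skewness(values):
    """Moment-based sample skewness."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(42)
# Hypothetical groups: normal within each group, different means and sizes.
groups = {
    "control": [rng.gauss(0.0, 1.0) for _ in range(300)],
    "treated": [rng.gauss(6.0, 1.0) for _ in range(100)],
}

pooled = []
residuals = []
for ys in groups.values():
    group_mean = statistics.fmean(ys)
    pooled.extend(ys)
    residuals.extend(y - group_mean for y in ys)

print(f"skewness of pooled response:     {skewness(pooled):.2f}")
print(f"skewness of within-group resids: {skewness(residuals):.2f}")
```

So a histogram of the raw response on its own is not a good reason to declare the data non-normal; look at the residuals from the fitted model instead.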