Solved – Calculating p-value for a two-way ANOVA

anova p-value

I would like to compute the p-value of my two-way ANOVA. The score that I am using to detect significant samples is the eta score which is computed as SS(between)/SS(total).

On many sites I saw that F=Var(between)/Var(within).
I want to know whether I can treat the eta score as the F-statistic value and then compute the p-value from it.

What if I want to compute the p-value for other scores such as eta-partial, or omega?
Is it meaningful if I calculate the p-value for them? Or is it only meaningful for the ratio Var(between)/Var(within)? What difference does it make if I use SS(between)/SS(within) instead of Var?

The formula that I am using to calculate p-value is:

pvalue=-log10(betai(0.5 * df2, 0.5* df1, df2 / (df2 + df1 * eta)));  

based on the book Numerical Recipes in C, where df1 and df2 are degrees of freedom.

Thanks for your help.

Best Answer

The eta-squared ($\eta^2$) value you are describing is a measure of effect size in the observed data (i.e., your sample): it quantifies how much of the total variance can be explained by the factor considered in the analysis (which is in fact what you wrote, BSS/TSS). With more than one factor, you can also compute a partial $\eta^2$, which reflects the percentage of variance explained by one factor when holding the remaining ones constant.
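As a small illustration of the two effect-size measures, here is a minimal Python sketch. The sums of squares are made-up numbers for a hypothetical balanced two-way design (factors A and B, their interaction, and the residual), and the function names are my own, not taken from any library.

```python
def eta_squared(ss_effect, ss_total):
    """Share of the total sum of squares attributed to one effect."""
    return ss_effect / ss_total


def partial_eta_squared(ss_effect, ss_error):
    """Share of (effect + residual) variation attributed to one effect."""
    return ss_effect / (ss_effect + ss_error)


# Hypothetical sums of squares from a balanced two-way ANOVA table.
ss = {"A": 30.0, "B": 20.0, "A:B": 10.0, "error": 40.0}
ss_total = sum(ss.values())  # in a balanced design the SS terms add up to TSS

eta_a = eta_squared(ss["A"], ss_total)
partial_eta_a = partial_eta_squared(ss["A"], ss["error"])

print(f"eta^2(A)         = {eta_a:.4f}")
print(f"partial eta^2(A) = {partial_eta_a:.4f}")
```

Both quantities describe the sample; neither is a test statistic, so plugging either one into the F-distribution formula in place of F does not give a valid p-value.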

The F-ratio, i.e. the between-groups mean square divided by the within-groups mean square (BSS/df1 over WSS/df2, not the raw ratio BSS/WSS), is the right test statistic to use if you want to test the null hypothesis ($H_0$) that there is no effect of your factor (all group means are equal), that is, that your factor of interest doesn't account for a sufficient amount of variance compared to the residual (unexplained) variance. In other words, we test whether the added explained variance (BSS=TSS-RSS) is large enough to be considered a "significant quantity". Under $H_0$, each of these two sums of squares follows (up to a common scale factor) a $\chi^2$ distribution, and the distribution of their ratio, once each is scaled by its degrees of freedom, is known as the Fisher-Snedecor (F) distribution. That scaling is why we don't use the SSs directly, which answers one of your questions.
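To make the link with the formula in the question concrete, here is a minimal pure-Python sketch of the F-test p-value. It assumes the Numerical Recipes approach (regularized incomplete beta function evaluated by a continued fraction); `betai` mirrors the name used in the question, while `f_pvalue` and `f_from_partial_eta_squared` are hypothetical helper names of my own. The last helper relies on the identity F = (η²p/df1) / ((1 − η²p)/df2), which holds for partial η² computed against the residual SS; it does not hold for plain η² in a two-way design, because TSS also contains the other factors.

```python
import math


def _betacf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            return h
    raise RuntimeError("continued fraction did not converge")


def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + a * math.log(x) + b * math.log1p(-x))
    bt = math.exp(ln_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def f_pvalue(f_stat, df1, df2):
    """Upper-tail probability P(F >= f_stat) for an F(df1, df2) variable."""
    if f_stat < 0:
        raise ValueError("F statistic must be non-negative")
    return betai(0.5 * df2, 0.5 * df1, df2 / (df2 + df1 * f_stat))


def f_from_partial_eta_squared(eta_p2, df1, df2):
    """Recover F from partial eta squared (not valid for plain eta squared)."""
    return (eta_p2 / df1) / ((1.0 - eta_p2) / df2)


# Hypothetical effect: SS_effect = 30 on 2 df, SS_error = 40 on 10 df.
f_stat = (30.0 / 2) / (40.0 / 10)  # ratio of mean squares, not of raw SS
p = f_pvalue(f_stat, 2, 10)
print(f"F = {f_stat:.3f}, p = {p:.5f}, -log10(p) = {-math.log10(p):.3f}")
```

The only change relative to the formula in the question is that the last argument uses the F-ratio rather than eta. If only partial η² and the degrees of freedom are at hand, `f_from_partial_eta_squared` gives the same F, so the p-value then belongs to the F-test and not to the effect size itself.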

I don't know which software you are using, but

  • If you have R, everything you need for basic modeling is given in the aov() base function ($\eta^2$ might be computed with etasq from the heplots package; and there's a lot more to see for diagnostics and plotting in other packages).
  • If you're more versed in C programming, you may have a look at the apophenia library, which features a nice set of statistical functions with bindings for MySQL and Python.