Solved – 2-way ANOVA multiple comparisons question

anova, multiple-comparisons, p-value, post-hoc

My question is about a 2-way ANOVA for an experiment. For simplicity let's say I'm using 3 methods to stimulate cells at 3 different time points, and I want to look at a resulting output variable. Different cells were stimulated at each time point, so independence of observations is maintained. Each time point has 9 – 10 replicates. The data look like this: https://i.imgur.com/oGEagOY.png
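For reference, here is a minimal sketch (simulated numbers, not my actual data) of how a design like this could be laid out in long format and fed to a two-way ANOVA in Python with statsmodels; the column names `output`, `method`, and `day` are hypothetical placeholders.

```python
# Minimal sketch (simulated data, not the real measurements): long-format layout
# and a two-way ANOVA with statsmodels. Column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)
rows = []
for method in ["M1", "M2", "M3"]:
    for day in ["7", "10", "14"]:
        for _ in range(10):                       # ~9-10 replicates per cell
            rows.append({"method": method, "day": day,
                         "output": rng.normal(loc=1.0, scale=0.3)})
df = pd.DataFrame(rows)

two_way = ols("output ~ C(method) * C(day)", data=df).fit()
print(sm.stats.anova_lm(two_way, typ=2))          # rows: method, day, method:day interaction
```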

I ran a 2-way ANOVA and found both stimulation method and time to be significant. GraphPad Prism's multiple-comparisons options are confusing me, though. I could compare every value to every other value, but that adds many irrelevant comparisons whose multiplicity correction drags down the adjusted significance levels. The comparisons I'm interested in are the following (the code sketch after the list illustrates each one):

1) The difference in mean output value between each time point. Prism calls this the main row effect.

2) The difference in mean output value between each stimulation method. Prism calls this the main column effect.

3) The difference in output values across stimulation methods within each time point (e.g. comparing day 7 method 1 vs method 2 vs method 3). Prism calls this the simple effect within columns.

4) The difference in output values across time points within each stimulation method (e.g. comparing stimulation method 1 day 7 vs day 10 vs day 14). Prism calls this the simple effect within rows.
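To make the four options concrete, here is how each family groups the simulated data from the sketch above (still the hypothetical `df`, not my real measurements); each Prism option then runs its pairwise tests, with its own multiplicity correction, within one of these groupings.

```python
# Continuing the sketch above (same simulated df): the four families simply
# group the same cell means differently before making pairwise comparisons.

print(df.groupby("day")["output"].mean())                        # 1) main row effect
print(df.groupby("method")["output"].mean())                     # 2) main column effect
print(df[df["day"] == "7"].groupby("method")["output"].mean())   # 3) simple effect within one time point
print(df[df["method"] == "M1"].groupby("day")["output"].mean())  # 4) simple effect within one method
```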

My question is: can I run four multiple-comparisons analyses, one with each of these settings, and then use the p-values that are returned?

I would think certainly not, since running the families separately applies no correction across them and inflates the overall error rate. So I tried a similar 2-way ANOVA in both SPSS and GraphPad Prism, using a dataset with no replicates, where I only wanted to compare the columns to each other and the rows to each other. In Prism I ran the main row effect and main column effect multiple comparisons separately. To my surprise, the Prism output matched the SPSS output. But in a 2-way ANOVA post hoc, shouldn't every row mean be compared not only to the other row means but to the column means as well?

Sorry for the length. This is my first question so just let me know if I violated any rules and I'll edit accordingly.

Best Answer

Extended Comment: It would be helpful to know the P-values for Time, Method, and the Time*Method interaction.

Look at Interaction first: It is possible that certain patterns of interaction affect how differences among Times and differences among Methods should be assessed.

A typical way to judge the impact of (significant) interaction is to assess the significance of 'orthogonal contrasts'. Considering that you have nine levels of Method (ignoring Method 10, which has only partial data) and three levels of Time, there are $(9-1)(3-1) = 16$ mutually orthogonal interaction contrasts. Ordinarily, that would be more than enough to explore interesting differences.
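For example, one such interaction contrast, written in terms of cell means $\mu_{m,t}$ for Method $m$ at Time $t$ (the particular indices below are chosen only for illustration), asks whether the Method 1 vs. Method 2 difference changes between two Times:
$$\psi = (\mu_{1,1} - \mu_{2,1}) - (\mu_{1,2} - \mu_{2,2}).$$
It is estimated by the corresponding cell sample means, and $H_0\!: \psi = 0$ says that this particular piece of the interaction is absent.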

You may have had some of these contrasts in mind from the start. (For example, you may have expected Methods 1, 2, and 3 to work better if administered early, and other Methods to work better if administered later on.) If you had this in mind before seeing the data, you can look at several of them according to the standards for judging 'pre-chosen' contrasts. Any additional contrasts suggested by the data should be assessed using a method for 'ad hoc' contrasts (e.g., 'Scheffé's method'). Those methods should keep you from 'false discovery' of effects that are mere artifacts of your particular data.
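As a rough sketch of the 'ad hoc' route, Scheffé's method declares a data-suggested contrast significant only if $|\hat\psi|/SE(\hat\psi)$ exceeds $\sqrt{\nu_1\,F_{\alpha;\,\nu_1,\nu_2}}$, where $\nu_1$ is the dimension of the contrast family and $\nu_2$ the error degrees of freedom. The snippet below just computes that bound; the cell counts in it are assumptions for illustration, not your actual design.

```python
# Scheffé-type critical value for judging data-suggested ('ad hoc') contrasts.
# Cell counts below are assumptions for illustration, not the actual design.
from scipy.stats import f

a, b, n = 9, 3, 10                      # Method levels, Time levels, replicates per cell
df_family = (a - 1) * (b - 1)           # 16 interaction degrees of freedom
df_error = a * b * (n - 1)              # error df from the full two-way model
alpha = 0.05

crit = (df_family * f.ppf(1 - alpha, df_family, df_error)) ** 0.5
print(crit)   # a contrast is significant if |psi_hat| / SE(psi_hat) exceeds this bound
```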

Then look at comparisons of levels of each main effect: With meaningfully large interaction effects in mind, you can turn to assessing differences among Times and differences among Methods. For each main effect you can control the 'family error rate' for comparisons among its levels by using a multiple-comparison method such as 'Tukey's HSD'.
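As a rough illustration (reusing the simulated data frame from the sketch in the question), statsmodels' `pairwise_tukeyhsd` can be run on each main effect's marginal groups. Note that it pools the error over the other factor, so it is only a first approximation to the ANOVA-based post hoc tests that Prism or SPSS would report.

```python
# Tukey HSD on each main effect's marginal groups (simulated df from the
# question's sketch). pairwise_tukeyhsd ignores the other factor, so its
# error term is cruder than the two-way ANOVA mean-square error.
from statsmodels.stats.multicomp import pairwise_tukeyhsd

print(pairwise_tukeyhsd(df["output"], df["method"], alpha=0.05).summary())
print(pairwise_tukeyhsd(df["output"], df["day"], alpha=0.05).summary())
```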

Notes: (1) I take the data in your link to be averages of "9 or 10" replications at each 'cell' (combination of levels of Time and Method). Without knowing the variability within cells, it is not possible to judge the level of significance of Interaction. And because the detective work unraveling the meaning of results from a two-factor ANOVA must begin by assessing Interaction, I can have no informed hunch where the suggestions above may lead.

However, in my experience, researchers often tend to design experiments in such a way that interaction effects are not statistically significant or (even if significant) not large enough to be of practical importance. If that is true with your experiment, you might be able to go directly to assessing the significance and importance of the main effects.

(2) For more detail, you can search this site or the Internet for words or phrases I have put in 'single quotes'.