A contingency table should contain all the mutually exclusive categories on both axes. Inshore/Midchannel/Offshore looks fine; however, unless "less than 100% mortality" means "100% survival" in this biological setting, you may need to construct tables that account for all the cases observed, or explain why you restrict your analysis to the extreme ends of the sample.
Since 100% survival means 0% mortality, you could have a table with columns mortality = 100% / 0% < mortality < 100% / mortality = 0%. In that case you would no longer be comparing percentages, but comparing an ordinal mortality measure across the three site-type categories. (What about using the original percentage values instead of categories?) A version of the Kruskal-Wallis test that takes ties appropriately into account may work here (perhaps a permutation test).
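As a minimal sketch of this idea, the test below runs Kruskal-Wallis on raw mortality percentages for the three site types; the data are invented for illustration, and scipy's implementation applies a tie correction automatically:

```python
from scipy.stats import kruskal

# Hypothetical mortality percentages per site (not the poster's data)
inshore    = [100, 100, 100, 100, 100]
midchannel = [100, 90, 100, 80, 100]
offshore   = [0, 10, 0, 0, 20]

# Kruskal-Wallis H test; scipy corrects the statistic for ties
H, p = kruskal(inshore, midchannel, offshore)
print(f"H = {H:.3f}, p = {p:.4f}")
```

Using the raw percentages this way avoids discarding the information in the intermediate-mortality sites.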
There are established post hoc tests for the Kruskal-Wallis test: 1, 2, 3. (A resampling approach may help to tackle ties.)
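A resampling version of one pairwise post hoc comparison might look like the sketch below. The data are hypothetical, the rank-based statistic is one reasonable choice among several, and in practice the pairwise p values would still need a multiplicity adjustment (e.g. Bonferroni):

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)

# Hypothetical mortality percentages for two of the site types
inshore  = np.array([100, 100, 100, 100, 100])
offshore = np.array([0, 10, 0, 0, 20])

def perm_pvalue(x, y, n_perm=10000):
    """Two-sided permutation test on the difference in mean ranks.

    Ranking the pooled data (with average ranks for ties) makes the
    statistic well defined despite the heavy ties in percentage data.
    """
    pooled = np.concatenate([x, y])
    ranks = rankdata(pooled)          # average ranks handle ties
    nx = len(x)
    observed = abs(ranks[:nx].mean() - ranks[nx:].mean())
    count = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)            # relabel groups at random
        if abs(ranks[:nx].mean() - ranks[nx:].mean()) >= observed - 1e-12:
            count += 1
    return count / n_perm

pval = perm_pvalue(inshore, offshore)
print(pval)
```

Because the permutation distribution is built from the tied ranks themselves, no separate tie correction is needed.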
Logistic regression and binomial regression may be even better, as they give you not only p values but also useful estimates and confidence intervals for the effect sizes. However, to set up those models, more details would be needed concerning the 0% < mortality < 100% sites.
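To sketch why a model-based approach is attractive: even without fitting a full regression, the log odds ratio for mortality between two site types comes with a standard error and a Wald confidence interval. The counts below are invented for illustration:

```python
import math

# Hypothetical 2x2 table: rows = site type, columns = (died, survived)
died_a, surv_a = 45, 5    # e.g. inshore
died_b, surv_b = 10, 40   # e.g. offshore

# Log odds ratio and its Wald standard error (sum of reciprocal counts)
log_or = math.log((died_a * surv_b) / (surv_a * died_b))
se = math.sqrt(1/died_a + 1/surv_a + 1/died_b + 1/surv_b)

lo, hi = log_or - 1.96 * se, log_or + 1.96 * se
print(f"log OR = {log_or:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

A logistic regression with site type as the predictor reproduces exactly this kind of estimate, and generalizes to the intermediate-mortality sites once their death/survival counts are available.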
Assuming equal $n$s [but see note 2 below] for each treatment in a one-way layout, and that the pooled SD from all the groups is used in the $t$ tests (as is done in usual post hoc comparisons), the maximum possible $p$ value for a $t$ test is $2\Phi(-\sqrt{2}) \approx .1573$ (here, $\Phi$ denotes the $N(0,1)$ cdf). Thus, no $p_t$ can be as high as $0.5$. Interestingly (and rather bizarrely), the $.1573$ bound holds not just for $p_F=.05$, but for any significance level we require for $F$.
The justification is as follows: For a given range of sample means, $\max_{i,j}|\bar y_i - \bar y_j| = 2a$, the largest possible $F$ statistic is achieved when half the $\bar y_i$ are at one extreme and the other half are at the other. This represents the case where $F$ looks the most significant given that two means differ by at most $2a$.
So, without loss of generality, suppose that $\bar y_.=0$ so that $\bar y_i=\pm a$ in this boundary case. And again without loss of generality, suppose that $MS_E=1$, as we can always rescale the data to this value. Now considering $k$ means (where $k$ is even for simplicity [but see note 1 below]), we have $F=\frac{\sum n\bar y^2/(k-1)}{MS_E}= \frac{kna^2}{k-1}$. Setting $p_F=\alpha$ so that $F=F_\alpha=F_{\alpha,k-1,k(n-1)}$, we obtain $a =\sqrt{\frac{(k-1)F_\alpha}{kn}}$. When all the $\bar y_i$ are $\pm a$ (and still $MS_E=1$), each nonzero $t$ statistic is thus $t=\frac{2a}{1\cdot\sqrt{2/n}} = \sqrt{\frac{2(k-1)F_\alpha}{k}}$. This is the smallest maximum $t$ value possible when $F=F_\alpha$.
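The algebra above can be checked numerically. The sketch below picks $k=4$, $n=2$ (an arbitrary choice for illustration), builds a balanced dataset with group means at $\pm a$ and $MS_E$ exactly $1$, and confirms that the one-way ANOVA lands exactly on $p_F=\alpha$:

```python
import math
from scipy.stats import f, f_oneway

alpha, k, n = 0.05, 4, 2
F_alpha = f.ppf(1 - alpha, k - 1, k * (n - 1))
a = math.sqrt((k - 1) * F_alpha / (k * n))   # the boundary-case half-range

# With n = 2, the pair (m + d, m - d) has sample variance 2*d**2,
# so d = 1/sqrt(2) gives each group variance 1 and hence MS_E = 1.
d = 1 / math.sqrt(2)
groups = [[m + d, m - d] for m in (a, a, -a, -a)]  # half at +a, half at -a

F_stat, p = f_oneway(*groups)
print(f"F = {F_stat:.4f} (critical value {F_alpha:.4f}), p = {p:.4f}")
```

The observed $F$ coincides with the critical value $F_{\alpha,k-1,k(n-1)}$, as the derivation predicts.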
So you can just try different cases of $k$ and $n$, compute $t$, and its associated $p_t$. But notice that for given $k$, $F_\alpha$ is decreasing in $n$ [but see note 3 below]; moreover, as $n\rightarrow\infty$, $(k-1)F_{\alpha,k-1,k(n-1)} \rightarrow \chi^2_{\alpha,k-1}$; so $t \ge t_{min} =\sqrt{2\chi^2_{\alpha,k-1}/k}$. Note that $\chi^2/k=\frac{k-1}{k} \chi^2/(k-1)$ has mean $\frac{k-1}{k}$ and SD $\frac{k-1}{k}\sqrt{\frac{2}{k-1}}$. So $\lim_{k\rightarrow\infty}t_{min} = \sqrt{2}$, regardless of $\alpha$, and the result I stated in the first paragraph above is obtained from asymptotic normality.
It takes a long time to reach that limit, though. Here are the results (computed using R) for various values of $k$, using $\alpha=.05$:
| $k$ | $t_{min}$ | max $p_t$ |
|------:|------:|------:|
| 2 | 1.960 | .0500 |
| 4 | 1.977 | .0481 |
| 10 | 1.840 | .0658 |
| 100 | 1.570 | .1164 |
| 1000 | 1.465 | .1428 |
| 10000 | 1.431 | .1526 |

(By $t_{min}$ and max $p_t$ I really mean $\min(\max|t|)$ and $\max(\min p_t)$. Note that the $k=4$ entry is below $.05$!)
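The table was computed in R; an equivalent sketch in Python, using the limiting formula $t_{min}=\sqrt{2\chi^2_{\alpha,k-1}/k}$ together with the $N(0,1)$ tail for the p value, reproduces the same numbers:

```python
import math
from scipy.stats import chi2, norm

alpha = 0.05
for k in (2, 4, 10, 100, 1000, 10000):
    # limiting minimum |t| as n -> infinity, for k groups
    t_min = math.sqrt(2 * chi2.ppf(1 - alpha, k - 1) / k)
    max_p = 2 * norm.sf(t_min)   # two-sided p from the N(0,1) tail
    print(f"k = {k:6d}  t_min = {t_min:.3f}  max p_t = {max_p:.4f}")
```

As $k$ grows, $t_{min}$ creeps toward $\sqrt 2 \approx 1.414$ and max $p_t$ toward $2\Phi(-\sqrt 2)\approx .1573$, but very slowly.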
A few loose ends...
- When $k$ is odd (note 1): The maximum $F$ statistic still occurs when the $\bar y_i$ are all $\pm a$; however, there will be one more at one end of the range than at the other, making the mean $\pm a/k$, and you can show that the factor $k$ in the $F$ statistic is replaced by $k-\frac 1k$. The same replacement occurs in the denominator of the expression for $t$, making $t$ slightly larger and hence decreasing $p_t$.
- Unequal $n$s (note 2): The maximum $F$ is still achieved with the $\bar y_i = \pm a$, with the signs arranged to balance the sample sizes as nearly equally as possible. Then the $F$ statistic for the same total sample size $N = \sum n_i$ will be the same or smaller than it is for balanced data. Moreover, the maximum $t$ statistic will be larger because it will be the one with the largest $n_i$. So we can't obtain larger $p_t$ values by looking at unbalanced cases.
- A slight correction (note 3): I was so focused on trying to find the minimum $t$ that I overlooked the fact that we are trying to maximize $p_t$, and it is less obvious that a larger $t$ with fewer df won't be less significant than a smaller one with more df. However, I verified that this is the case by computing the values for $n=2,3,4,\ldots$ until the df are high enough to make little difference. For the case $\alpha=.05, k\ge 3$ I did not see any cases where the $p_t$ values did not increase with $n$. Note that $df=k(n-1)$, so the possible df are $k,2k,3k,\ldots$, which get large fast when $k$ is large. So I'm still on safe ground with the claim above. I also tested $\alpha=.25$, and the only case I observed where the $.1573$ threshold was exceeded was $k=3,n=2$.
Best Answer
This is not really different from the standard case of multiple groups with a continuous variable. If you have an a priori hypothesis that $B < E$, and don't care about the relationships amongst the other groupings, you can simply run a t-test on those two. In your case, the response variable is binary, so you could simply run a $2\times2$ chi-squared test. If you didn't have such an a priori hypothesis, you don't want to look at the data, notice that $B$ and $E$ are the furthest apart, and just test those. In general, it is best to see if there are any differences amongst the groups with a single (ANOVA or chi-squared) test first. That approach provides some protection against type I error inflation. If you don't have an a priori hypothesis and you want to skip the omnibus test, you definitely need some strategy to hold familywise type I error rates at an acceptable level (e.g., Bonferroni would be one possibility).
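For the planned $B$ vs $E$ comparison, the $2\times2$ chi-squared test is a one-liner in scipy; the counts below are invented for illustration:

```python
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = groups B and E, columns = (success, failure)
table = [[30, 10],
         [15, 25]]

# For 2x2 tables scipy applies the Yates continuity correction by default
chi2_stat, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2_stat:.3f}, df = {dof}, p = {p:.4f}")
```

The `expected` array returned alongside the statistic is worth inspecting: the usual chi-squared approximation is doubtful when expected cell counts fall below about 5.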
Given more information about your situation, what you want is some form of Dunnett's test. The traditional version of Dunnett's test is for a continuous response variable. In your case, you need to do this with a logistic regression model. This pdf provides information on how to do that. Note that it is written for SAS, but you may be able to adapt the procedures for your chosen software.
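Dunnett's test proper relies on the multivariate t distribution, which needs specialized software. Absent that, a conservative stand-in (my substitution, not what the linked pdf describes) compares each group to the control with a two-proportion z test and a Bonferroni adjustment; all counts below are hypothetical:

```python
import math
from scipy.stats import norm

# Hypothetical (deaths, total) per group; "control" is the reference group
control = (10, 50)
treatments = {"B": (25, 50), "C": (18, 50), "D": (30, 50)}

def two_prop_z(x1, n1, x2, n2):
    """Two-sided two-proportion z test using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * norm.sf(abs(z))

m = len(treatments)   # number of comparisons for the Bonferroni adjustment
for name, (x, n) in treatments.items():
    p_raw = two_prop_z(x, n, *control)
    p_adj = min(1.0, m * p_raw)
    print(f"{name} vs control: p = {p_raw:.4f}, adjusted = {p_adj:.4f}")
```

Bonferroni is more conservative than Dunnett's adjustment, so any comparison significant here would also be significant under the proper procedure.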