Solved – Significant Fisher’s exact test, post hoc analysis for subgroup comparisons

contingency tables, fishers-exact-test, post-hoc

I'm analysing 2 groups of patients with 2 different DISEASE_STAGES: MILD disease and MODERATE disease, as defined by a complex clinical diagnosis. The sample size is relatively small: a total of 80 patients characterised by SMOKING_STATUS with 3 levels: active smoker, ex-smoker and never smoked.

I've performed a Fisher exact test because one cell has a frequency of 1:

fisher.test(matrix(c(1, 5, 14, 3, 33, 22), nrow=2, ncol=3, byrow=TRUE))

Fisher's Exact Test for Count Data
p-value = 0.03039
alternative hypothesis: two.sided

I reject the null hypothesis that the disease is not affected by smoking status.

My question: Is it possible to perform a post-hoc analysis with pairwise comparisons of the proportions after a Fisher exact test, and if so, how? How should I correct the p-values to account for multiple testing (what significance level should I accept for these subgroup comparisons)?

Best Answer

After giving it some thought, I think the best approach is to combine the categories of active smoker and ex smoker into "exposed to smoke" unless there is a good clinical reason to suspect that actively smoking is different than having smoked. Combining the categories alleviates the troubles of multiple comparisons as well as the category with small sample size.
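As a minimal sketch of the collapsing step, the counts from the question can be merged and retested as below. The column order (active smoker, ex-smoker, never smoked) is an assumption taken from the order in which the question lists the levels; the question does not label the rows or columns, so the names used here are illustrative only.

```r
# Counts from the question. ASSUMPTION: columns are, in order,
# active smoker, ex-smoker, never smoked; rows are the two disease stages.
tab <- matrix(c(1, 5, 14,
                3, 33, 22),
              nrow = 2, ncol = 3, byrow = TRUE)

# Merge the first two columns into a single "exposed to smoke" category.
collapsed <- cbind(exposed = tab[, 1] + tab[, 2],
                   never   = tab[, 3])

# A single 2x2 Fisher exact test, so no multiple-comparison correction is needed.
res <- fisher.test(collapsed)
print(collapsed)
print(res$p.value)
```

With one 2x2 table there is only one test, which is why this route sidesteps the multiple-testing question.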

You could look to see what other people have done with respect to this problem. A quick Google search reveals a paper in PLOS ONE about post hoc analysis and Fisher tests. I've not read that paper, so I can't comment on its relevance. In any case, I think a reviewer would look at that first category and take issue with the fact that you are making comparisons with so few observations.
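If the three categories are kept separate, one commonly used approach (not necessarily the one in that paper) is to run a Fisher exact test on each pair of columns and adjust the resulting p-values. The sketch below uses Holm's method through `p.adjust`; the choice of Holm and the placeholder column names are assumptions made for illustration, not something the question or this answer prescribes.

```r
# Counts from the question; row and column labels are placeholders because
# the question does not say which row or column is which.
tab <- matrix(c(1, 5, 14,
                3, 33, 22),
              nrow = 2, ncol = 3, byrow = TRUE,
              dimnames = list(stage   = c("stage1", "stage2"),
                              smoking = c("col1", "col2", "col3")))

# All pairs of smoking-status columns.
pairs <- combn(colnames(tab), 2, simplify = FALSE)

# One 2x2 Fisher exact test per pair of columns.
p_raw <- sapply(pairs, function(cols) fisher.test(tab[, cols])$p.value)
names(p_raw) <- sapply(pairs, paste, collapse = " vs ")

# Holm adjustment controls the family-wise error rate across the three tests.
p_holm <- p.adjust(p_raw, method = "holm")
print(data.frame(p_raw, p_holm))
```

The adjusted p-values can then be compared against the usual significance level; the caveat about the very small first category still applies to every pair that includes it.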

I would also suggest making friends with a biostatistician if you have not already done so.

Related Solutions

Solved – Fisher’s exact test in 3×2 contingency table

It sounds like you are asking a lot of different questions here.

My question is: how should I interpret the p value? I don't understand what is that referred to.

The null hypothesis for Fisher's Exact test is that the groups do not affect the outcome, i.e. that they are independent. Rejection of the null hypothesis indicates the outcome (a, b, or c) is dependent on group.

fisher.test(matrix(c(2, 12, 1, 5, 3, 1), 
            nrow=2, ncol=3, byrow=TRUE))
Fisher's Exact Test for Count Data

data:  dta
p-value = 0.05082
alternative hypothesis: two.sided

In this case your $p$ value is approximately 0.05082. I will let you decide whether to reject the null.

Having the p value, how can I say that one of the three forms is statistically significant more represented than the others (if true)?

This is a separate question and I'm not sure what you are trying to ask.

Solved – Fisher’s exact test vs kappa analysis

I know I answer the question two years later, but I hope some future readers may find the answer helpful.

Cohen's $\kappa$ tests whether data fall on the diagonal of a classification table more often than chance would predict, whereas Fisher's exact test evaluates the association between two categorical variables.

In some cases, Cohen's $\kappa$ might appear to converge to the Fisher exact test. A simple case shows why the Fisher test is not appropriate for rater agreement.

Imagine a $2 \times 2$ matrix like

$\begin{matrix} 10 & 20 \\ 20 & 10\end{matrix}$.

It is clear that the two variables are associated through the off-diagonal cells, but the raters do not agree more often than chance. In other words, the raters systematically disagree. From the matrix, we should expect the Fisher test to be significant and Cohen's $\kappa$ not to be. Carrying out the analysis confirms the expectation: $p = 0.01938$ for the Fisher test, and $\kappa = -0.333$, $z =-4743$ and $p = 0.999$ for the agreement test.

Here is another example where the two outcomes diverge, using the following matrix:

$\begin{matrix} 20 & 10 & 10 \\ 20 & 20 & 20 \\ 20 & 20 & 20 \end{matrix}$,

which gives $p = 0.4991$ and $\kappa = 0.0697$, $z =1.722$ and $p = 0.043$. So the raters likely agree, but there is no relation between categorical variables.
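The Fisher p-values and the $\kappa$ estimates for both matrices can be checked with a few lines of base R. This is a sketch only: `cohen_kappa` is a hypothetical helper written here from the textbook definition (observed versus chance agreement), and it does not reproduce the $z$ statistics or the p-values attached to $\kappa$, which depend on the standard-error formula of whatever package was used.

```r
# Hypothetical helper: unweighted Cohen's kappa from a square table of counts
# (rows = rater 1, columns = rater 2).
cohen_kappa <- function(tab) {
  p  <- tab / sum(tab)
  po <- sum(diag(p))                  # observed agreement
  pe <- sum(rowSums(p) * colSums(p))  # agreement expected by chance
  (po - pe) / (1 - pe)
}

m2 <- matrix(c(10, 20,
               20, 10), nrow = 2, byrow = TRUE)
m3 <- matrix(c(20, 10, 10,
               20, 20, 20,
               20, 20, 20), nrow = 3, byrow = TRUE)

# Fisher's exact test measures association; kappa measures agreement.
p2 <- fisher.test(m2)$p.value
k2 <- cohen_kappa(m2)
p3 <- fisher.test(m3)$p.value
k3 <- cohen_kappa(m3)

print(c(fisher_p = p2, kappa = k2))
print(c(fisher_p = p3, kappa = k3))
```

For the first matrix the observed agreement is $20/60$ against a chance agreement of $0.5$, which is where the negative $\kappa$ comes from.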

I don't have a more formal mathematical explanation on how they should or should not converge though.

Finally, given the current state of knowledge on Cohen's $\kappa$ in the methodological literature (see this for instance), you might want to avoid it as a measure of agreement. The coefficient has a lot of issues. Careful training of raters and strong agreement on each category (rather than overall agreement) is, I believe, the way to go.
