I have survey data where participants were asked to choose twice between yes and no.
For this example, let's say (although I am not quite sure whether this is a good example):
- Choice1: Do you want to be acknowledged by your coworkers?
- Choice2: Do you want to be responsible if something goes wrong?
We found considerable differences in the number of yes and no answers to the two questions.
pacman::p_load(tidyverse)
# MRE Data
> df
       Choice2
Choice1 No Yes
    No   6   1
    Yes 22   6
# dput
structure(c(6L, 22L, 1L, 6L), .Dim = c(2L, 2L), .Dimnames = list(
Choice1 = c("No", "Yes"), Choice2 = c("No", "Yes")), class = "table")
I am interested in whether these differences are significant, i.e. whether substantially more
participants chose yes for Choice1 than for Choice2.
I thought I could analyze this with a $\chi^2$ test.
However, due to the small sample (and the small expected cell counts), I got a warning from chisq.test,
so I conducted Fisher's exact test instead.
# Chi² Test of Independence
chi <- chisq.test(df)
chi
# Expected cell counts
chi$expected
# Due to the small expected cell counts and the warning "Chi-squared approximation
# may be incorrect", conduct Fisher's exact test instead
fisher.test(df)
Fisher's Exact Test for Count Data
data: df
p-value = 1
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.1437625 87.6035655
sample estimates:
odds ratio
1.615585
What strikes me about the result is a p-value of 1.
Looking at the proportions, 80% voted yes for Choice1, but only 20% for Choice2.
That looks like a substantial difference.
# Print proportions
df %>%
  rbind("Prop" = (prop.table(df) %>% colSums()) * 100) %>%
  cbind("Prop" = c((prop.table(df) %>% rowSums()) * 100, 100))
# Choice1: Yes = 80%, Choice2: Yes = 20% -- how is p = 1?
      No Yes Prop
No     6   1   20
Yes   22   6   80
Prop  80  20  100
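As a side note, the marginal proportions can also be read off more directly with base R's `margin.table()` and `prop.table()` (a sketch using the `dput` data from above; margin 1 sums over rows, i.e. Choice1, margin 2 over columns, i.e. Choice2):

```r
# Reconstruct the 2x2 table from the dput above
df <- structure(c(6L, 22L, 1L, 6L), .Dim = c(2L, 2L), .Dimnames = list(
  Choice1 = c("No", "Yes"), Choice2 = c("No", "Yes")), class = "table")

# Marginal proportions per question, in percent
prop.table(margin.table(df, 1)) * 100  # Choice1: No = 20, Yes = 80
prop.table(margin.table(df, 2)) * 100  # Choice2: No = 80, Yes = 20
```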
Now I am wondering whether I am even using the right test. I know that the $\chi^2$ test is a test of independence, so the H1 would be that Choice1 and Choice2 are dependent. However, I am rather interested in knowing whether the proportions of Choice1 and Choice2 differ meaningfully.
And how does it come about that I get p = 1?
Edit: Created the differ variable
> df
# A tibble: 35 x 3
Choice1 Choice2 differ
<fct> <fct> <dbl>
1 Yes No 1
2 Yes No 1
3 Yes No 1
4 No No 0
5 No No 0
6 Yes No 1
7 Yes Yes 0
8 Yes No 1
9 Yes Yes 0
10 Yes No 1
# ... with 25 more rows
> df %>% dput()
structure(list(Choice1 = structure(c(2L, 2L, 2L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L,
2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), .Label = c("No",
"Yes"), class = "factor"), Choice2 = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L,
1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L
), .Label = c("No", "Yes"), class = "factor"), differ = c(1,
1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1,
0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -35L))
Solution Edit:
In addition to the provided answer, I would like to draw attention to the very helpful question that @Scortchi linked in the comments (see here). The answer provided by Gung really improved my understanding and helped me navigate. The correct test for my question would be either the binomial test (as mentioned in the accepted answer) or McNemar's $\chi^2$ test. Please refer to the link for more details on the reasoning behind them.
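For completeness, McNemar's test can be run directly on the original 2x2 table (a sketch; note that it only uses the discordant cells, here 22 and 1, and applies a continuity correction by default):

```r
# Reconstruct the 2x2 table from the dput above
df <- structure(c(6L, 22L, 1L, 6L), .Dim = c(2L, 2L), .Dimnames = list(
  Choice1 = c("No", "Yes"), Choice2 = c("No", "Yes")), class = "table")

# McNemar's chi-squared test for paired nominal data;
# only the off-diagonal (discordant) counts enter the statistic
mcnemar.test(df)
```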
Best Answer
You have set up your data to test independence. If you want to compare the proportions, you need the rows labelled with the choices and the columns with the responses. However, if these are the same people measured twice, then only the people whose responses differ are informative, so you need just those two numbers and then do a binomial test.
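In the table above, the discordant counts are 22 (Yes on Choice1, No on Choice2) and 1 (No on Choice1, Yes on Choice2); a sketch of the binomial test on just those two numbers:

```r
# Of the 22 + 1 = 23 participants whose two answers differed, test whether
# the 22/1 split is compatible with a fair 50/50 chance of each direction
binom.test(x = 22, n = 23, p = 0.5)
```

This is the exact counterpart of McNemar's test: the concordant pairs (No/No and Yes/Yes) carry no information about which question attracts more yes answers.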