Solved – Comparing proportions between multiple groups – Fisher’s exact test

Tags: fishers-exact-test, hypothesis-testing, proportion, r, statistical-significance

I have a simple dataset on the reproductive success (RS) of a certain plant species. RS was defined as the proportion of flowers that produced fruit. We took measurements at 10 different sites over several seasons. I would like to test whether RS differs significantly between sites. An example of my dataset:

[Image: example of the dataset]

I used Fisher's exact test, following the same approach as the example in "Fisher's exact test in R – 2×4 table", as follows:

# One row per site; each pair of values is (fruit YES, fruit NO)
data <- matrix(c(6, 148, 0, 3, 0, 1, 0, 4, 2, 8,
                 0, 17, 8, 151, 11, 108, 1, 33, 0, 2),
               nrow = 10, byrow = TRUE)
rownames(data) <- as.character(1:10)
colnames(data) <- c("fruit YES", "fruit NO")
data
   fruit YES fruit NO
1          6      148
2          0        3
3          0        1
4          0        4
5          2        8
6          0       17
7          8      151
8         11      108
9          1       33
10         0        2
fisher.test(data)

    Fisher's Exact Test for Count Data

data:  data
p-value = 0.3329
alternative hypothesis: two.sided

The result shows that there is no significant difference between sites, but if you check site no. 5 in the data, the percentage of fruit is much higher than the rest. Did I use the right test? If I did – did I do it right?
Would you suggest any other method?
Additional question: I would also like to check whether the number of flowers and pH affect fruit production at each site. Which test or method should I use in this case: logistic regression? I'm very new to R, so a more detailed explanation would be greatly appreciated.

Best Answer

> The result shows that there is no significant difference between sites, but if you check site no. 5 in the data, the percentage of fruit is much higher than the rest.

True, but site 5 has only 10 flowers (2 "yes" and 8 "no"). With so few observations, its higher proportion is estimated very imprecisely, which is why the difference is not significant.
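To put a number on how little ten flowers can tell us, one option is to compute an exact confidence interval for site 5 and compare it with the pooled fruit rate at the other nine sites. This is only a sketch using base R's binom.test on the counts from the question; the object names (counts, site5, rest_rate) are my own.

```r
# Counts from the question: one row per site, columns = fruit yes / fruit no
counts <- matrix(c(6, 148, 0, 3, 0, 1, 0, 4, 2, 8,
                   0, 17, 8, 151, 11, 108, 1, 33, 0, 2),
                 nrow = 10, byrow = TRUE,
                 dimnames = list(as.character(1:10), c("yes", "no")))

# Exact (Clopper-Pearson) 95% confidence interval for site 5:
# 2 fruits out of 10 flowers
site5 <- binom.test(x = counts[5, "yes"], n = sum(counts[5, ]))
site5$conf.int

# Pooled fruit rate at the other nine sites: 26 fruits out of 493 flowers
rest_rate <- sum(counts[-5, "yes"]) / sum(counts[-5, ])
rest_rate
```

The interval for site 5 is wide (roughly 0.03 to 0.56) and contains the pooled rate of about 0.05, so the data cannot separate site 5 from the rest.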

> Did I use the right test? If I did - did I do it right? Would you suggest any other method?

Fisher's exact test is appropriate for your data, and I have no alternatives to suggest. I am not an expert in R, so I cannot tell whether it was applied correctly.
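One way to check both points in R, offered as a sketch rather than a verdict: look at the expected cell counts (small values are the usual reason to prefer Fisher's exact test over the chi-squared test), and rerun the test with a Monte Carlo p-value as a cross-check. The object names are illustrative; chisq.test and fisher.test are part of base R (package stats).

```r
# Counts from the question: one row per site, columns = fruit yes / fruit no
counts <- matrix(c(6, 148, 0, 3, 0, 1, 0, 4, 2, 8,
                   0, 17, 8, 151, 11, 108, 1, 33, 0, 2),
                 nrow = 10, byrow = TRUE,
                 dimnames = list(as.character(1:10), c("yes", "no")))

# Expected counts under independence of site and fruiting.
# chisq.test() warns here because many expected counts are small;
# the warning is suppressed since only the expected counts are needed.
expected_counts <- suppressWarnings(chisq.test(counts))$expected
sum(expected_counts < 5)   # number of cells with expected count below 5

# Exact test as in the question, plus a simulated p-value as a cross-check
exact_p <- fisher.test(counts)$p.value
set.seed(1)  # arbitrary seed, only to make the simulation reproducible
sim_p <- fisher.test(counts, simulate.p.value = TRUE, B = 10000)$p.value
c(exact = exact_p, simulated = sim_p)
```

Close agreement between the two p-values would suggest that the call in the question did what was intended.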

> Additional question: I would also like to check if the number of flowers and pH affect the production of fruits on each site. Which test/method should I use in this case - logistic regression? [...]

Yes, logistic regression is appropriate, because the fruit variable is dichotomous (yes/no). As independent variables, treat site as a nominal variable, using the first or the seventh category as the reference (these are the two largest sites, with 154 and 159 flowers), and treat pH as a continuous variable.
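A minimal sketch of that model in R, assuming the data are arranged as one row per site with fruit and no-fruit counts (taken from the question's table). pH values are not given in the question, so the pH term appears only as a comment, and the column name ph is hypothetical.

```r
# Counts from the question's table, one row per site
d <- data.frame(
  site      = factor(1:10),
  fruit_yes = c(6, 0, 0, 0, 2, 0, 8, 11, 1, 0),
  fruit_no  = c(148, 3, 1, 4, 8, 17, 151, 108, 33, 2)
)

# Use site 7 as the reference category (site 1 would work the same way)
d$site <- relevel(d$site, ref = "7")

# Logistic regression on grouped counts: the response is the pair
# (successes, failures) for each row.
# Sites with zero fruits have no finite estimate, so expect very large
# negative coefficients, huge standard errors, and possibly a warning.
fit <- glm(cbind(fruit_yes, fruit_no) ~ site, family = binomial, data = d)
summary(fit)

# With a real pH column (hypothetical name: ph) the formula would become
#   cbind(fruit_yes, fruit_no) ~ site + ph
# This only works if pH varies within sites (for example across seasons);
# with one pH value per site, pH is confounded with the site effect.
```

Each site coefficient is the log odds of fruiting at that site relative to site 7, which is why a well-sampled reference site makes the comparisons easier to read.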