Solved – Fleiss kappa in R giving strange results

Tags: cohens-kappa, r

I have an experiment in which 4 raters gave their responses to 4 stimuli, and I need to calculate Fleiss' kappa to check the agreement among the raters. However, I get strange results from the R function that implements the Fleiss analysis.

Participant1 <- c(16, 15, 16, 16)
Participant2 <- c(16, 16, 16, 16)
Participant3 <- c(16, 16, 16, 16)
Participant4 <- c(16, 16, 16, 15)
data <- data.frame(Participant1, Participant2, Participant3, Participant4)
data
library(irr)
kappam.fleiss(data)

The output is

> data
  Participant1 Participant2 Participant3 Participant4
1           16           16           16           16
2           15           16           16           16
3           16           16           16           16
4           16           16           16           15

> kappam.fleiss(data)
 Fleiss' Kappa for m Raters

 Subjects = 4 
  Raters = 4 
   Kappa = -0.143 

        z = -0.7 
  p-value = 0.484 

The kappa value is negative, with a non-significant p-value, despite near-perfect agreement between the raters. Why?
Personally, I do not really understand the answer to the similar question reported here: Strange values of Cohen's kappa

So, how is the Fleiss analysis useful? Its results do not seem to me to give any indication of how much the raters agreed.

How can I simply calculate the agreement between the four raters?

Best Answer

The problem is that there is almost no variation among the ratings, and the tiny bit of variation that does exist is not in agreement. Only two ratings are not 16, and they fall on different cases, so no two raters ever agree on a non-16 rating. Because kappa compares observed agreement to the agreement expected by chance, and the lopsided marginals (14 of 16 ratings are 16) make chance agreement very high, the observed agreement actually falls below it, and you get a negative kappa. That's a correct result. You may, however, want a different measure, such as raw percentage agreement.
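You can see this by computing Fleiss' kappa by hand for your data. The sketch below (assuming the standard Fleiss formula, with rows as subjects and columns as raters as in your data frame) reproduces the -0.143, and then uses `irr::agree()` for the raw percentage agreement:

```r
library(irr)

# Same data as in the question: rows = subjects (stimuli), columns = raters
ratings <- cbind(Participant1 = c(16, 15, 16, 16),
                 Participant2 = c(16, 16, 16, 16),
                 Participant3 = c(16, 16, 16, 16),
                 Participant4 = c(16, 16, 16, 15))

# Observed agreement P_o: per subject, the proportion of agreeing rater pairs.
# Subjects 1 and 3 agree fully (P_i = 1); subjects 2 and 4 each have one
# dissenter among 4 raters (P_i = 0.5).
P_o <- mean(c(1, 0.5, 1, 0.5))            # 0.75

# Chance agreement P_e from the marginal category proportions:
# 14 of the 16 ratings are "16", 2 are "15".
P_e <- (14/16)^2 + (2/16)^2               # 0.78125

kappa <- (P_o - P_e) / (1 - P_e)          # -0.1429: chance exceeds observed

# Raw percentage agreement, with no chance correction:
agree(ratings)                            # 50%: 2 of 4 subjects rated identically
```

Percentage agreement answers your literal question ("how often did the raters give the same rating?"), but it does not correct for the fact that with such a dominant category, raters would agree often even by guessing. Which measure is appropriate depends on what you want the statistic to mean.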