R Agreement Statistics – Calculating Interrater Reliability for Multiple Trials and Raters

Tags: agreement-statistics, r

I have data for a large number of Trials (only three shown here) and ratings by subjects A, B, C, D, and E (many more in the actual data). In each Trial, subjects were asked to determine whether event f or event n occurred:

df <- structure(list(Trial = 1:3, Trial_time = c("00:00:00.001", "00:00:00.002", 
"00:00:00.003"), A = c("f", "n", "n"), B = c("f", "n", "f"), 
    C = c("f", "f", "n"), D = c("f", "f", "n"), E = c("f", "f", 
    "n")), row.names = c(NA, -3L), class = c("tbl_df", "tbl", 
"data.frame"))

How can I establish an interrater reliability score for this kind of rating in R? Help is much appreciated!

Best Answer

You can use a chance-adjusted index of categorical reliability. There are many packages and functions that will do this. For instance, see Gwet's irrCAC package or my agreement package. You would treat trials as "objects" of measurement and raters as "sources" of measurement.
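As a minimal sketch of that approach, the rating columns can be pulled into a trials-by-raters matrix and passed to `irr::kappam.fleiss()` (Fleiss' kappa) or `irrCAC::gwet.ac1.raw()` (Gwet's AC1); this assumes the `irr` and `irrCAC` packages are installed, and uses the `df` from the question:

```r
# Keep only the rater columns; rows are trials ("objects"),
# columns are raters ("sources")
ratings <- as.data.frame(df[, c("A", "B", "C", "D", "E")])

# Fleiss' kappa for multiple raters (irr package)
library(irr)
kappam.fleiss(ratings)

# Gwet's AC1, which is more robust to skewed category
# prevalence (irrCAC package)
library(irrCAC)
gwet.ac1.raw(ratings)
```

With only three trials the estimates will be unstable; run this on the full data, and note that AC1 is often preferred when one category (here f or n) dominates.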