Solved – How to calculate percent agreement and kappa for each category

agreement-statistics, cohens-kappa, r, reliability

My aim is to look at the reliability of a coding scheme which employs 7 exhaustive and mutually exclusive categories to categorise 40 "subjects" (i.e. a typical reliability study). More than 2 judges are being used (at the moment I've got 4, but will increase).

I want to look closer at the data to see if particular categories are being employed more reliably than others.

I am using R to analyse the data, and the "irr" package allows "category-wise kappas" to be calculated, which reports the kappa for each individual category; however, the percent agreement would be more useful to me. I'm also not sure how the package calculates these numbers, which would be useful to know in any case (I know how kappa is calculated in general, just not how it is calculated for a specific category).
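For reference, this is roughly how I am getting the category-wise kappas (illustrative only; `ratings` stands for my 40 × 4 subjects-by-raters matrix of category labels):

```r
library(irr)

# 'ratings' is assumed to be a 40 x 4 matrix: one row per subject, one column per judge.
kappam.fleiss(ratings, detail = TRUE)  # Fleiss' kappa overall, plus a kappa for each category
```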

Best Answer

The extended formula (for multiple categories and raters) for observed agreement is as follows: $$ A = \frac{1}{n'}\sum_{i=1}^{n'}\sum_{k=1}^{q}\frac{r_{ik}(r_{ik}-1)}{r_i(r_i-1)} $$ where $n'$ is the number of items with two or more ratings, $q$ is the number of categories, $r_{ik}$ is the number of raters who assigned item $i$ to category $k$, and $r_i$ is the number of raters who assigned item $i$ to any category.
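As a minimal sketch (not from the irr package), this can be computed directly from an item-by-category count matrix, where entry $(i, k)$ is the number of raters who assigned item $i$ to category $k$. The `to_counts` helper and the `ratings` object are illustrative names, not part of any package:

```r
# Build the n x q count matrix from a subjects-by-raters matrix of category labels.
to_counts <- function(ratings) {
  categories <- sort(unique(as.vector(as.matrix(ratings))))
  t(apply(ratings, 1, function(row) table(factor(row, levels = categories))))
}

# Overall observed agreement A: average over items of the pairwise agreement rate.
overall_agreement <- function(counts) {
  r_i <- rowSums(counts)                 # number of raters per item
  keep <- r_i >= 2                       # items with two or more ratings (n')
  counts <- counts[keep, , drop = FALSE]
  r_i <- r_i[keep]
  mean(rowSums(counts * (counts - 1)) / (r_i * (r_i - 1)))
}
```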

If you want to look at the specific agreement for a given category, it is:

$$ A_k = \frac{\sum_{i=1}^{n'}r_{ik}(r_{ik}-1)}{\sum_{i=1}^{n'}r_{ik}(r_i-1)} $$

This score can be interpreted as the probability that a randomly chosen rater assigns an item to that category, given that another randomly chosen rater has also assigned the item to that category.
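Continuing the same sketch, the category-specific agreement $A_k$ can be computed for all categories at once, assuming the count matrix built above:

```r
# Specific agreement per category: A_k = sum_i r_ik (r_ik - 1) / sum_i r_ik (r_i - 1).
specific_agreement <- function(counts) {
  r_i <- rowSums(counts)
  keep <- r_i >= 2
  counts <- counts[keep, , drop = FALSE]
  r_i <- r_i[keep]
  colSums(counts * (counts - 1)) / colSums(counts * (r_i - 1))
}

# Example with a hypothetical 40 x 4 'ratings' matrix:
# counts <- to_counts(ratings)
# overall_agreement(counts)
# specific_agreement(counts)   # one proportion of specific agreement per category
```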
