Solved – Weighted Fleiss’ Kappa for Interval Data

agreement-statistics, cohens-kappa

I am looking for a variant of Fleiss' Kappa to deal with interval data, rather than strictly nominal/ordinal data. The context that I intend to use it in is as follows:

  • There are several (5-8) graders grading a total of 16 exams.
  • The exams are identical, and each contains 7 questions.
  • Each question is graded out of a maximum of 3 to 8 points, depending on the question.
  • Every exam is graded by every grader (though there are spots of missing data).

Please help me find where to look! I have seen an occasional internet utterance of a weighted Fleiss' Kappa, but never a reference for it. I have hope that it exists because a weighted Cohen's Kappa is used frequently. References to relevant R packages would also be appreciated (I have used the irr package before).

Best Answer

Here is the generalized formula for Fleiss' kappa:

$$r_{ik}^\star = \sum_{l=1}^q w_{kl}r_{il}$$

$$p_o = \frac{1}{n'}\sum_{i=1}^{n'}\sum_{k=1}^{q}\frac{r_{ik}(r_{ik}^\star-1)}{r_i(r_i-1)}$$

$$\pi_k = \frac{1}{n}\sum_{i=1}^n\frac{r_{ik}}{r_i}$$

$$p_c = \sum_{k=1}^q\sum_{l=1}^q w_{kl} \pi_k \pi_l$$

$$\widehat{\kappa} = \frac{p_o-p_c}{1-p_c}$$

where $q$ is the total number of categories,
$w_{kl}$ is the weight associated with two raters assigning an item to categories $k$ and $l$,
$r_{il}$ is the number of raters that assigned item $i$ to category $l$,
$n'$ is the number of items that were coded by two or more raters,
$r_{ik}$ is the number of raters that assigned item $i$ to category $k$,
$r_i$ is the number of raters that assigned item $i$ to any category,
and $n$ is the total number of items.
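
To make the formulas above concrete, here is a minimal pure-Python sketch. The function name `weighted_fleiss_kappa` and the input layout (one row of category counts per item, plus a weight matrix) are assumptions made for this illustration and are not taken from mReliability or any R code. It also assumes that items with no ratings at all are dropped, and that items with a single rating are skipped when averaging $p_o$.

```python
def weighted_fleiss_kappa(counts, weights):
    """Generalized (weighted) Fleiss' kappa.

    counts[i][k]  = r_ik, number of raters assigning item i to category k
    weights[k][l] = w_kl, agreement weight between categories k and l
    """
    q = len(weights)
    # Assumption: items that received no ratings are dropped entirely.
    rows = [row for row in counts if sum(row) > 0]
    n = len(rows)

    # Observed agreement p_o, averaged over the n' items with >= 2 ratings.
    terms = []
    for row in rows:
        r_i = sum(row)
        if r_i < 2:
            continue
        total = 0.0
        for k in range(q):
            # r*_ik = sum_l w_kl * r_il
            r_star = sum(weights[k][l] * row[l] for l in range(q))
            total += row[k] * (r_star - 1.0)
        terms.append(total / (r_i * (r_i - 1)))
    if not terms:
        raise ValueError("need at least one item rated by two or more raters")
    p_o = sum(terms) / len(terms)

    # pi_k = average share of ratings that fall in category k.
    pi = [sum(row[k] / sum(row) for row in rows) / n for k in range(q)]

    # Chance agreement p_c = sum over k, l of w_kl * pi_k * pi_l.
    p_c = sum(weights[k][l] * pi[k] * pi[l]
              for k in range(q) for l in range(q))

    # Note: undefined (division by zero) when p_c equals 1.
    return (p_o - p_c) / (1.0 - p_c)


# Example: 4 items, 3 raters, 2 categories, identity weights
# (identity weights give the ordinary unweighted case).
identity = [[1.0, 0.0], [0.0, 1.0]]
example_counts = [[3, 0], [0, 3], [3, 0], [0, 3]]
print(weighted_fleiss_kappa(example_counts, identity))
```

Because each row carries its own $r_i$, the number of raters may differ from item to item, which is how missing ratings enter the calculation.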

Here is the formula for calculating interval weights:

$$w_{kl} = \begin{cases}1-\frac{|x_k-x_l|}{x_{max}-x_{min}} & \text{if } k \neq l\\1 & \text{if }k=l\end{cases}$$

where $x_k$ is the numeric value of category $k$, and $x_{max}$ and $x_{min}$ are the largest and smallest possible category values. The weight of any two categories $k$ and $l$ is therefore equal to $1$ minus the distance between these categories divided by the maximum distance between any two possible categories.
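
As a small illustration of the interval weights, the sketch below builds the full weight matrix from a list of category values. The helper name `interval_weights` is hypothetical, and the example assumes that a question graded out of 3 has the possible scores 0, 1, 2 and 3.

```python
def interval_weights(values):
    """Return the matrix w[k][l] = 1 - |x_k - x_l| / (x_max - x_min)."""
    span = max(values) - min(values)
    if span == 0:
        raise ValueError("need at least two distinct category values")
    # The diagonal comes out as 1 automatically because |x_k - x_k| = 0.
    return [[1.0 - abs(xk - xl) / span for xl in values] for xk in values]


# Assumed example: possible scores on a question graded out of 3.
for row in interval_weights([0, 1, 2, 3]):
    print(row)
```

Adjacent scores receive a weight close to 1 and the two extreme scores receive a weight of 0, so near misses between graders count as partial agreement.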

I have more information on my mReliability website, including the mSCOTTPI function, which will calculate the Fleiss' kappa coefficient in MATLAB. Or, if you prefer R, you can use the fleiss.kappa.raw function from the agree.coeff3.raw.r script, albeit with less documentation.