[GIS] How are kappa and overall accuracy related with respect to thematic raster data

accuracy, r, raster, statistics

I recently calculated kappa (using confusionMatrix{caret} in R) between a thematic raster dataset at 1m spatial resolution and another at 30m spatial resolution. I was interested in quantifying the level of agreement between the two thematic raster datasets. Random points were sampled at the same locations in each raster and the class values were tabulated. The thematic classes range from 1 to 6 and correspond to increasing levels of canopy cover (1 = low, 6 = high). The following is the resulting error matrix:

            30m
1m      1     2     3     4     5     6
1    9906    21     0     0     1     3
2      13    36     0     0     0     0
3       0     0     0     0     0     0
4       0     0     0     0     0     0
5       0     0     0     0     0     0
6       0     0     0     0     0     0

Strangely, the overall accuracy and kappa seem to be at odds with each other:

Accuracy = 0.9962
Kappa    = 0.6649
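
For reference, the tabulated matrix can be passed straight to confusionMatrix as a table; below is a minimal sketch of that step (the object names counts and cm are placeholders, and the matrix values are the ones shown above):

    library(caret)

    # Error matrix from above: rows = 1m map, columns = 30m map. The orientation
    # does not matter for overall accuracy or kappa; both are unchanged if the
    # matrix is transposed.
    counts <- matrix(c(9906, 21, 0, 0, 1, 3,
                         13, 36, 0, 0, 0, 0,
                          0,  0, 0, 0, 0, 0,
                          0,  0, 0, 0, 0, 0,
                          0,  0, 0, 0, 0, 0,
                          0,  0, 0, 0, 0, 0),
                     nrow = 6, byrow = TRUE,
                     dimnames = list("1m" = 1:6, "30m" = 1:6))

    cm <- confusionMatrix(as.table(counts))
    cm$overall[c("Accuracy", "Kappa")]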

What might be causing this discrepancy between kappa and overall accuracy? My best guess is that the confusion matrix is heavily weighted toward the lower canopy cover classes (1 and 2), while the few errors that do occur are far off, involving the higher canopy cover classes (3–6).

Best Answer

Kappa does not quantify the level of agreement between two datasets; it represents the level of agreement between two datasets corrected for chance agreement.

The reason you see a large difference between kappa and overall accuracy is that one class (class 1) accounts for the large majority of your map, and that class is well described. Overall accuracy is therefore an optimistic index of the classifier's performance, even though it is the true "agreement" in your case. As a trivial example, a map that says "class 1" everywhere would be about 99% correct. Similarly, if 99% of the pixels were randomly assigned to "class 1", the resulting map would still show high agreement with your map. This is what kappa penalizes with the "c" term in the expression below (note that there are several variants of kappa; this is the most common one).

kappa = (OA - c) / (1 - c), where c is the overall probability of chance agreement
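
Spelled out in base R, the computation only takes a few lines. A minimal sketch (kappa_from_matrix is a hypothetical name, and m stands for any square matrix of counts laid out like the one above):

    # Overall accuracy and kappa from a square error matrix of counts.
    kappa_from_matrix <- function(m) {
      n  <- sum(m)
      oa <- sum(diag(m)) / n                    # observed agreement = overall accuracy
      c  <- sum(rowSums(m) * colSums(m)) / n^2  # chance agreement from the marginals
      (oa - c) / (1 - c)                        # kappa = (OA - c) / (1 - c)
    }

On your matrix, the dominance of class 1 pushes the chance agreement c to roughly 0.99, so the denominator (1 - c) is tiny and even a small shortfall in observed agreement translates into a much lower kappa.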

In your confusion matrix, you can see that classes 5 and 6 are always wrong and that class 2 is not very reliable. This has a large impact on your kappa index and explains the large difference: the classifier is no better than chance for these classes.

As a remark, standard OA and kappa DO NOT take the distance between classes into account, so the fact that classes 5 and 6 are far off does not affect your results for either of those indices. Therefore, I suggest that you take advantage of the fact that your classes refer to quantities. The correlation between the two maps could therefore make a good indicator: a confusion between 1 and 6 would then carry more weight than a confusion between 1 and 2. Another way is to look at each class individually (user's and producer's accuracies).
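
As a sketch of those two suggestions (the object names are hypothetical, the counts matrix is the one tabulated in the question, the 30m map is treated as the "reference" purely for labelling, and classes with no samples come out as NaN):

    # Per-class accuracies from the error matrix.
    producers <- diag(counts) / colSums(counts)  # producer's accuracy per 30m class
    users     <- diag(counts) / rowSums(counts)  # user's accuracy per 1m class

    # Exploit the ordinal classes: expand the counts back into paired class
    # labels and correlate them, so a 1-vs-6 disagreement costs more than 1-vs-2.
    idx   <- which(counts > 0, arr.ind = TRUE)
    pairs <- idx[rep(seq_len(nrow(idx)), counts[idx]), , drop = FALSE]
    cor(pairs[, "row"], pairs[, "col"], method = "spearman")

A rank correlation is used here because the classes are ordinal; a Pearson correlation on the class values, or a weighted kappa, would serve the same purpose.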

I do not agree that kappa should be considered more robust than OA, even though it is widely regarded as such. According to Pontius and Millones (2011), kappa has not provided the useful information it is supposed to bring.

EDIT: More recently, Olofsson, Foody, Herold, Stehman, Woodcock and Wulder (2014, Remote Sensing of Environment) also advocated against kappa. Considering the standing of those authors, I would follow their recommendations.

“The problems associated with kappa include but are not limited to: 1) the correction for hypothetical chance agreement produces a measure that is not descriptive of the accuracy a user of the map would encounter (kappa would underestimate the probability that a randomly selected pixel is correctly classified); 2) the correction for chance agreement used in the common formulation of kappa is based on an assumption of random chance that is not reasonable because it uses the map marginal proportions of area in the definition of chance agreement and these proportions are clearly not simply random; and 3) kappa is highly correlated with overall accuracy so reporting kappa is redundant with overall accuracy (Foody, 1992; Liu et al., 2007; Pontius & Millones, 2011; Stehman, 1997). Consistent with the recommendation in Strahler et al. (2006) the use of kappa is strongly discouraged as, despite its widespread use, it actually does not serve a useful role in accuracy assessment or area estimation.”
