As many on this forum know, I usually advocate for an R solution. However, in this case that would be reinventing the wheel, and in a much less robust way. There is a great piece of free software, the Map Comparison Kit (MCK), that implements many published and novel validation statistics for rasters. Of particular interest in this case are the Kappa, fuzzy Kappa and weighted Kappa.
Now, if you want to implement something in R, there are many approaches you can take, depending on the complexity of the validation statistic. In the univariate case you can easily pass a function to "focal" to calculate uncertainty within a defined neighborhood. Moving to a bivariate case, you would want to vectorize the problem and define a function that takes two independent datasets into account. I do not believe that "movingFun" or "focal" will take two rasters into account. You can, however, use "overlay", "getValuesBlock" or, ideally, "getValuesFocal", all of which operate on stack/brick objects.
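For instance, here is a minimal sketch of the univariate case, using "focal" to map the standard deviation in a 3x3 neighborhood as a simple per-window uncertainty measure (the simulated raster and the choice of sd as the function are just illustrative assumptions):
require(raster)
# Simulated raster
r <- raster(ncol=100, nrow=100)
r[] <- runif(ncell(r), 0, 1)
# Pass a univariate function to focal; here the standard deviation
# within each 3x3 window
r.sd <- focal(r, w=matrix(1, nrow=3, ncol=3), fun=sd, na.rm=TRUE)
plot(r.sd)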
Here is a worked example of calculating Kappa, using a 3x3 window, with "getValuesFocal". In the for loop, the lapply function reclassifies the simulated probabilities to [0,1] (1 where the value >= p, else 0). The parameter that adjusts the sensitivity is "p", and "ws" adjusts the size of the focal window that is extracted. I wrote this to be memory safe, so it writes a file ("Kappa.img") to disk in the defined working directory.
require(raster)
require(asbio)
setwd("D:/TEST")
ws <- 3 # window size
p <- 0.65 # probability threshold
# Create example data
pred <- raster(ncol=100, nrow=100)
pred[] <- runif(ncell(pred), 0, 1)
obs <- raster(pred)
obs[] <- runif(ncell(obs), 0, 1)
obs.pred <- stack(obs,pred)
names(obs.pred) <- c("obs","pred")
# Create new on-disk raster
s <- writeStart(obs.pred[[1]], "Kappa.img", overwrite=TRUE)
tr <- blockSize(obs.pred)
options(warn=-1) # suppress warnings while processing blocks
# Loop to read raster in blocks using getValuesFocal
for (i in 1:tr$n) {
  # Get focal values as a list of matrices (one per layer)
  v <- getValuesFocal(obs.pred, row=tr$row[i], nrows=tr$nrows[i],
                      ngb=ws, array=FALSE)
  # reclassify data to [0,1] using lapply
  v <- lapply(v, FUN=function(x) {
    if( all(is.na(x)) ) {
      return( x ) # all-NA block: return as-is so matrix dimensions are kept
    } else {
      return( ifelse(x >= p, 1, 0) )
    }
  })
  # Loop to calculate Kappa and assign to new raster using writeValues
  r <- rep(NA_real_, nrow(v[[1]]))
  for( j in 1:nrow(v[[1]]) ) {
    Obs <- v[[1]][j,]
    Pred <- v[[2]][j,]
    # drop cells that are NA in either window so the pairs stay aligned
    ok <- !is.na(Obs) & !is.na(Pred)
    Obs <- Obs[ok]
    Pred <- Pred[ok]
    if( length(Obs) >= 2 ) {
      r[j] <- Kappa(Pred, Obs)$khat
    }
  }
  writeValues(s, r, tr$row[i])
}
s <- writeStop(s)
k <- raster("Kappa.img")
plot(k)
The User's Accuracy is the reliability of the classes in the classified image. It is calculated as the fraction of correctly classified pixels with respect to all pixels classified as that class in the image. For instance, based on your sample data, the User's Accuracy for:
UA_Class1 is: 3 / (3 + 3) = 50%
UA_Class2 is: 7 / (7 + 1) = 87.5%
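These values are easy to verify in R. A minimal sketch, assuming the 2x2 error matrix implied by your sample data (rows are mapped classes, columns are reference classes):
# Error matrix implied by the sample data (an assumption on my part)
cm <- matrix(c(3, 3,
               1, 7), nrow=2, byrow=TRUE,
             dimnames=list(mapped=c("Class1","Class2"),
                           reference=c("Class1","Class2")))
# User's Accuracy: correct pixels divided by the row (mapped) totals
diag(cm) / rowSums(cm) * 100 # Class1 = 50, Class2 = 87.5
# Producer's Accuracy, for comparison: divide by the column totals
diag(cm) / colSums(cm) * 100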
Therefore, to answer your question, it would seem that the User's and Producer's accuracies from the asbio package are switched.
For a thorough explanation of the Error (Confusion) Matrix and related calculations, I refer you to this excellent technical note by D. Rossiter. I have included an excerpt that formally defines the UA.
It is expressed from the point of view of the mapper. Looking across the rows (classes as mapped), an error of commission is said to occur when the mapper incorrectly mapped this class at a reference (ground truth) site where it does not exist. That is, the mapper ‘committed’ the error of over-mapping a class. This leads to a lower user’s ‘accuracy’ Ci.
Best Answer
Kappa does not quantify the level of agreement between two datasets. It represents the level of agreement between two datasets corrected for chance.
The reason you have a large difference between kappa and overall accuracy is that one of the classes (class 1) accounts for the large majority of your map, and this class is well described. Overall accuracy is therefore an optimistic index of the classifier performance, even if it is the true "agreement" in your case. As a trivial example, if I give you a map that says "class 1" everywhere, it will be 99% correct. Similarly, if 99% of the pixels are randomly assigned to "class 1", the resulting map will still have a large agreement with your map. This is what kappa penalizes with its "c" in the expression below (note that there are different kappas; here is the most common).
kappa = (OA - c) / (1 - c), where c is the overall probability of chance agreement
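A minimal sketch of that formula in R, using a made-up confusion matrix (not your data), where the chance agreement c comes from the row and column marginals:
# Hypothetical confusion matrix: rows = mapped, columns = reference
cm <- matrix(c(50, 2, 3,
               4, 30, 1,
               2, 3, 5), nrow=3, byrow=TRUE)
n <- sum(cm)
OA <- sum(diag(cm)) / n # overall accuracy
c.chance <- sum(rowSums(cm) * colSums(cm)) / n^2 # chance agreement
kappa <- (OA - c.chance) / (1 - c.chance)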
On your confusion matrix, you can see that classes 5 and 6 are always wrong and that class 2 is not very reliable. This has a large impact on your kappa index and explains the large difference: the classifier is no better than chance for these classes.
As a remark, standard OA and kappa DO NOT take the distance between classes into account, so the fact that classes 5 and 6 are far off does not affect your results for either of those indices. Therefore, I suggest that you take advantage of the fact that your classes refer to quantities. The correlation between the two maps could then make a good indicator: a confusion between 1 and 6 would carry more weight than a confusion between 1 and 2. Another option is to look at each class individually (user's and producer's accuracies).
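A minimal sketch of the correlation idea, on simulated maps of ordinal class codes (purely illustrative; substitute your own rasters):
require(raster)
map.obs <- raster(ncol=100, nrow=100)
map.obs[] <- sample(1:6, ncell(map.obs), replace=TRUE)
map.pred <- raster(map.obs)
map.pred[] <- sample(1:6, ncell(map.pred), replace=TRUE)
# A confusion between classes 1 and 6 lowers the correlation more
# than a confusion between classes 1 and 2
cor(values(map.obs), values(map.pred), use="complete.obs")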
I do not agree that Kappa is widely considered to be more robust than OA. According to Pontius (2011), kappa has not provided the useful information that it is supposed to bring.
EDIT: More recently, Olofsson, Foody, Herold, Stehman, Woodcock and Wulder (2014, Remote Sensing of Environment) also advocated against kappa. Considering the standing of those authors, I would follow their recommendations.