[GIS] Accuracy assessment in R – Calculation of User Accuracy

Tags: accuracy, land-classification, r, remote-sensing

I want to assess the accuracy of the land cover classes produced by supervised classification of Landsat images. As reference data I use aerial photography.

I sampled the classified Landsat data and, at the same validation points, I identified the land cover class from the aerial photography. That is, each of my verification points has two attributes: one from Landsat (landsat) and a second from the aerial photography (reference).

I want to calculate an error matrix (contingency table) and an accuracy assessment (overall accuracy, user's and producer's accuracy) from the set of verification points.

I found the packages greenbrown and asbio for assessing classification accuracy. However, the final user's and producer's accuracy values are switched between the two packages.

Which package calculates the user's and producer's accuracy correctly?

Reproducible example

library(asbio)

# create dummy data
landsat <- c(1, 1, 1, 2, 2, 2, 3, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 4, 5, 5, 3, 3, 2, 2)
reference <- c(1, 2, 1, 2, 2, 2, 3, 4, 2, 2, 5, 1, 2, 2, 2, 1, 2, 3, 4, 5, 5, 3, 3, 2, 2)

# calculate Kappa statistics
asbio::Kappa(landsat, reference)  # Kappa(class1, reference)

# check out Kappa results
$ttl_agreement
[1] 76

$user_accuracy
1     2     3     4     5 
75.0  58.3 100.0 100.0 100.0 

$producer_accuracy
1     2     3     4     5 
50.0  87.5 100.0 100.0  60.0 

$khat
[1] 68.1

$table
       reference
class1  1 2 3 4 5
      1 3 3 0 0 0
      2 1 7 0 0 0
      3 0 0 4 0 0
      4 0 0 0 2 0
      5 0 2 0 0 3

# ----------------------------------------------------------------------   
# make the same calculation with the greenbrown package
# ----------------------------------------------------------------------

library(greenbrown)  
library(strucchange)
library(raster)
library(Kendall)
library(plyr)
library(bfast)
library(zoo)

# calculate the contingency table
tab <- table(landsat, reference)

# let's see the tab
tab

           reference
landsat   1 2 3 4 5
        1 3 3 0 0 0
        2 1 7 0 0 0
        3 0 0 4 0 0
        4 0 0 0 2 0
        5 0 2 0 0 3

# calculate the accuracy assessment
greenbrown::AccuracyAssessment(tab)

                  1        2   3   4   5 Sum UserAccuracy
1                 3  3.00000   0   0   0   6         50.0
2                 1  7.00000   0   0   0   8         87.5
3                 0  0.00000   4   0   0   4        100.0
4                 0  0.00000   0   2   0   2        100.0
5                 0  2.00000   0   0   3   5         60.0
Sum               4 12.00000   4   2   3  25           NA
ProducerAccuracy 75 58.33333 100 100 100  NA         76.0

The user's and producer's accuracies are switched between the two packages!
Which calculation of the user's and producer's accuracy is correct?

Best Answer

The User's Accuracy describes the reliability of the classes in the classified image. For each class it is calculated as the fraction of correctly classified pixels out of all pixels assigned to that class in the image, i.e. the diagonal cell of the error matrix divided by its row total (rows being the mapped classes). For instance, based on your sample data:

UA_Class1 = 3 / (3 + 3) = 50%
UA_Class2 = 7 / (7 + 1) = 87.5%
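
To double-check which package has it right, you can compute the accuracies by hand in base R. This is a minimal sketch that assumes the dummy landsat and reference vectors from your question and the same table orientation (rows = mapped Landsat class, columns = reference class):

# cross-tabulate: rows = mapped (landsat) class, columns = reference class
tab <- table(landsat, reference)

# user's accuracy: correct pixels / all pixels mapped as that class (row totals)
ua <- diag(tab) / rowSums(tab) * 100

# producer's accuracy: correct pixels / all reference pixels of that class (column totals)
pa <- diag(tab) / colSums(tab) * 100

# overall accuracy: correctly classified pixels / all pixels
oa <- sum(diag(tab)) / sum(tab) * 100

ua  #  50.0  87.5 100.0 100.0  60.0            -> matches greenbrown's UserAccuracy column
pa  #  75.0  58.3 100.0 100.0 100.0 (rounded)  -> matches greenbrown's ProducerAccuracy row
oa  #  76                                      -> matches both packages' overall agreement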

Therefore, to answer your question: it would seem that the User's and Producer's Accuracies returned by the asbio package are switched.

For a thorough explanation of the Error (Confusion) Matrix and related calculations, I refer you to this excellent technical note by D. Rossiter. I have included an excerpt that formally defines the UA.

It is expressed from the point of view of the mapper. Looking across the rows (classes as mapped), an error of commission is said to occur when the mapper incorrectly mapped this class at a reference (ground truth) site where it does not exist. That is, the mapper ‘committed’ the error of over-mapping a class. This leads to a lower user's ‘accuracy’ C_i.
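
In symbols (using a generic notation, not necessarily the exact symbols of the note), with $n_{ij}$ the number of validation points mapped as class $i$ and belonging to reference class $j$, the accuracies the two packages should report are

$$
UA_i = \frac{n_{ii}}{\sum_{j} n_{ij}}, \qquad
PA_j = \frac{n_{jj}}{\sum_{i} n_{ij}}, \qquad
OA = \frac{\sum_{i} n_{ii}}{\sum_{i,j} n_{ij}}.
$$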
