The simple way to do this in QGIS is to use the Raster Calculator (Raster -> Raster Calculator). You have a couple of options. The easiest to explain/understand is to make a unitary raster from your mask (all valid data set to 1, NoData left as NoData) and then multiply your clip layer by the unitary mask layer.
To ensure the extents match the mask layer, in the raster calculator window select the mask layer in the 'Raster bands' list on the left and then click the 'Current layer extent' button on the right.
You can create a unitary mask on the fly by using a conditional statement (see the link) something like this:
(maskLayer@1 >= 0) * clipLayer@1
This statement says: treat every cell in the mask layer that is not NoData as equal to 1 (NoData stays NoData). Note that it assumes all valid mask values are >= 0; if your mask can contain negative values, adjust the condition accordingly. Just be sure to remember to set the extent (see above)!
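For intuition, the same unitary-mask multiplication can be sketched with numpy outside QGIS (illustrative only; the toy arrays are made up, and NaN stands in for NoData):

```python
import numpy as np

# Toy mask and clip arrays; NaN stands in for NoData.
mask = np.array([[5.0, np.nan],
                 [0.0, 7.0]])
clip = np.array([[10.0, 20.0],
                 [30.0, 40.0]])

# Build the unitary mask: 1 where the mask is valid, NaN (NoData) elsewhere.
# This mirrors the raster calculator expression (maskLayer@1 >= 0) * clipLayer@1,
# because multiplying by NaN propagates NoData into the result.
unitary = np.where(np.isnan(mask), np.nan, 1.0)
result = unitary * clip
```

Cells where the mask is NoData come out as NoData in the result; everywhere else the clip values pass through unchanged.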
Raster Calculator and paletted data:
Any output from the raster calculator will contain only the values and will not carry over any information from a colour palette. You have a couple of options to 'get your palette back':
- Go to the Style tab of the original layer's properties and (making sure it is paletted) click the Save Style button at the bottom. You can then apply this same style to your new layer; if you have only clipped it as per the above instructions, it will appear the same as before. This is easy, but your raster is not paletted in any persistent way outside of the QGIS project.
- To make the presentation persistent, you can right-click the layer and choose 'Save As'. Check the 'Rendered Image' radio button at the top of the save dialog. This creates a new raster with the exact colour scheme of your paletted layer, BUT it will no longer be a single-band raster; it will be a four-band RGBA raster.
- If you MUST have an actual stand-alone single-band raster with a persistent palette (and not RGBA), then I don't know of a way to do this in QGIS, but you could take the output of the previous option, open it in GIMP (or Photoshop), and change the mode from RGB to Indexed Colour. You will then need to create a world file, as saving the image in GIMP/Photoshop discards the georeferencing.
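For reference, a world file is just six lines of plain text. A minimal sketch of writing one from a GDAL-style geotransform tuple (pure Python; the function name and the no-rotation assumption are mine):

```python
def write_world_file(path, geotransform):
    """Write a 6-line ESRI world file (.wld/.pgw) from a GDAL-style
    geotransform (origin_x, pixel_w, rot1, origin_y, rot2, pixel_h),
    where pixel_h is negative for north-up images.

    Note: the world file's last two lines reference the CENTER of the
    upper-left pixel, while the geotransform references its upper-left
    CORNER (rotation terms are assumed zero here).
    """
    origin_x, pixel_w, rot1, origin_y, rot2, pixel_h = geotransform
    lines = [pixel_w,                    # pixel size in x
             rot2,                       # rotation term
             rot1,                       # rotation term
             pixel_h,                    # pixel size in y (negative)
             origin_x + pixel_w / 2.0,   # x of upper-left pixel center
             origin_y + pixel_h / 2.0]   # y of upper-left pixel center
    with open(path, "w") as f:
        f.write("\n".join(str(v) for v in lines) + "\n")
```

Placing the world file next to the exported image with the same basename (e.g. clip.pgw beside clip.png) lets QGIS and GDAL re-georeference it.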
I have finally gotten around to improving this function. I found that for my purposes it was fastest to rasterize() the polygon first and use getValues() instead of extract(). The rasterizing isn't much faster than the original code for tabulating raster values in small polygons, but it shines for large polygon areas that require large rasters to be cropped and their values extracted. I also found getValues() was much faster than the extract() function.
I also figured out multi-core processing using foreach().
I hope this is useful for other people who want an R solution for extracting raster values from a polygon overlay. This is similar to the "Tabulate Intersection" of ArcGIS, which did not work well for me, returning empty outputs after hours of processing, like this user.
#initiate multicore cluster and load packages
library(foreach)
library(doParallel)
library(tcltk)
library(sp)
library(raster)
cores <- 7
cl <- makeCluster(cores, outfile="") #outfile="" makes workers print errors to the console
registerDoParallel(cl)
Here's the function:
multicore.tabulate.intersect <- function(cores, polygonlist, rasterlayer){
  foreach(i = 1:cores, .packages = c("raster", "tcltk", "foreach"), .combine = rbind) %dopar% {
    mypb <- tkProgressBar(title = "R progress bar", label = "", min = 0, max = length(polygonlist[[i]]), initial = 0, width = 300)
    foreach(j = 1:length(polygonlist[[i]]), .combine = rbind) %do% {
      final <- data.frame()
      tryCatch({ #not sure if this is necessary now that I'm using foreach, but it is useful for loops.
        single <- polygonlist[[i]][j,] #pull out individual polygon to be tabulated
        dir.create(file.path("c:/rtemp", i, j, single@data$OWNER), recursive = TRUE, showWarnings = FALSE) #creates unique filepath for temp directory (recursive = TRUE so the nested dirs are actually created)
        rasterOptions(tmpdir = file.path("c:/rtemp", i, j, single@data$OWNER)) #sets temp directory - this is important b/c it can fill up a hard drive if you're doing a lot of polygons
        clip1 <- crop(rasterlayer, extent(single)) #crop to extent of polygon
        clip2 <- rasterize(single, clip1, mask = TRUE) #crops to polygon edge & converts to raster
        ext <- getValues(clip2) #much faster than extract
        tab <- table(ext) #tabulates the values of the raster in the polygon
        mat <- as.data.frame(tab)
        final <- cbind(single@data$OWNER, mat) #combines it with the name of the polygon
        unlink(file.path("c:/rtemp", i, j, single@data$OWNER), recursive = TRUE, force = TRUE) #delete temporary files
        setTkProgressBar(mypb, j, title = "number complete", label = j)
      }, error = function(e){cat("ERROR :", conditionMessage(e), "\n")}) #tryCatch error so it doesn't kill the loop
      final #value handed to foreach's .combine (avoid return() inside a foreach body)
    }
    #close(mypb) #not sure why but closing the pb while operating causes it to return an empty final dataset... dunno why.
  }
}
So to use it, adjust single@data$OWNER to match the column name that identifies your polygons (I guess that could have been built into the function...) and call:
myoutput <- multicore.tabulate.intersect(cores, polygonlist, rasterlayer)
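Note that polygonlist is assumed to be a list with one subset of polygons per core, since the function indexes it as polygonlist[[i]][j,]. The chunking idea itself is language-agnostic; a sketch with a hypothetical helper (shown in Python only to illustrate the round-robin split):

```python
def chunk_round_robin(n_items, cores):
    """Hypothetical helper: assign item indices 0..n_items-1 to `cores`
    chunks round-robin, mirroring the one-chunk-per-core structure the
    R function above expects for polygonlist."""
    return [list(range(i, n_items, cores)) for i in range(cores)]

# e.g. 10 polygons over 3 cores:
# chunk_round_robin(10, 3) -> [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Round-robin assignment keeps the chunks within one item of each other in size, so no core sits idle much longer than the rest.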
Best Answer
I ran a test to determine how the speed and quality differs between the two methods, here are the results:
Input data
Performance
Three trials were performed and benchmarked. The Clip (Data Management) method is significantly faster than the Extract by Mask (Spatial Analyst) method.
Quality
Both extents were identical, as were the NoData values. However, a visual assessment showed that the Extract by Mask method slightly altered the pixel arrangement, likely the result of some type of resampling. The pixel arrangement from the Clip operation was identical to the original input image.