[GIS] Sampling random points from large raster file with replacement using R

rrandomraster

I have numerous large stacked raster file with multiple layers and I would like to select 250 random points and extract the data values from all layers for these points.

I had been using the raster package sampleRandom(raster,n, na.rm=TRUE, xy=TRUE) to select random points before extracting the data from my raster object but this works without replacement. I, however, would like to sample random points from the raster files with replacement as can be done with the r base package sample(x, size, replace = TRUE, prob = NULL).

I had considered just running a loop using sampleRandom so that each point has a chance of being sampled for each of the 250 runs;

samp_all <- matrix(, nrow = 250, ncol = 10)
for (i in 1:250)
{samp <- sampleRandom(layer1, 1, xy = TRUE, na.rm = TRUE)[, -3]
samp_all[i,] <- samp}
samp2 <- extract(stack_all, samp_all)

But this seems to run pretty slowly and I have quite a lot of files I would like to run this on.

Is there a simpler way to randomly sample points from raster layers with replacement that would allow me to extract data from my raster?

Best Answer

Try the following code. The idea is to create a set of [row, col] indicators based on the RasterStack dimensions (rows and cols). Than you can easily used these indicators on the stack to subset all values.

The function below sampleStack gets a RasterStack and n number of values to sample, and gives a data frame with [row, col] positions and values extracted by layer.

library(raster)
# Generate raster layers

r <- raster(matrix(rnorm(100, 0, 1), nrow = 10))
for (i in 1:5) {r <- stack(r, raster(matrix(rnorm(100, 0, 1), nrow = 10)))} # run 5 times

sampleStack <- function(r, n) {
  rowSample <- sample(1:r@nrows, size = n, replace = TRUE)
  colSample <- sample(1:r@ncols, size = n, replace = TRUE)
  pairs <- data.frame("rowInd" = rowSample, "colInd" = colSample)
  out <- as.data.frame(cbind(pairs, as.data.frame(t(apply(pairs, MARGIN = 1, function(x) {return(r[x[1],x[2]])})))))
  colnames(out)[3:ncol(out)] <- names(r)
  return(out)
}

# Example Run
sampleStack(r = r, n = 14)

   rowInd colInd   layer.1.1  layer.2.1   layer.1.2  layer.2.2    layer.1    layer.2
1       5      4 -0.09678111  1.6844843  0.82574090  0.5328165  0.66721846  0.1936958
2       4      8  0.64982724  0.5467126  1.59975344  0.2757094  0.94797866 -0.1798319
3       3      4 -1.35927393 -1.2774878  0.77616160  0.1429519  1.10396643  1.1444793
4       7      9  1.45380719 -0.6128730 -0.53011041 -0.3138787 -0.86586255  0.9056694
5       6      2 -0.49808353 -0.1272448 -1.96004940 -0.5663870 -0.07217682 -1.7568981
6       4     10  0.01607546 -0.5113896  1.19713933 -1.6322803 -1.04051134  0.7135125
7       2      4  1.10798593 -0.2610036  0.56009222  2.4618433 -0.44356484  1.0332427
8       3      1  1.70168812  1.7643488 -0.09976064 -0.2386893 -1.04266622  0.2019014
9      10      3 -1.82791351  1.5666126 -1.79275437  0.2946007  0.96467732 -0.6951626
10      6      4  0.54795051  0.1378088 -1.53793046 -0.5989934 -1.64424273 -0.3463153
11      1      2 -0.86287672 -0.2408750 -0.81438516 -2.0200205  1.16523355  0.4052408
12      2      2 -0.47916254 -0.6778470  0.79086436 -0.5692255  0.96205715 -0.5146865
13      5      4 -0.09678111  1.6844843  0.82574090  0.5328165  0.66721846  0.1936958
14      6      1 -1.04973148 -0.3973457 -0.24445969  0.4061588 -1.50143806  0.3896232

Related Solutions

[GIS] random sampling from spatial points in R

I'd suggest the spatstat package. Perhaps check out the quadratresample function. They also have several others to simulate random patterns that may fit your need (e.g. rstrat and rsyst). Random sampling should be pretty trivial to accomplish on your own (see the sample function in base R)

[GIS] Using sampleRandom() from large raster without NA values in R

Just pad your desired number of random samples and then sample back down to the correct n. This should account for the occasional NA that are produced and subsequently removed with the na.rm=TRUE argument.

    require(raster)
    # Create example data
    r1 <- raster(ncols=500, nrows=500, xmn=0)
      r1[] <- runif(ncell(r1))
    r2 <- raster(ncols=500, nrows=500, xmn=0)
      r2[] <- runif(ncell(r2))  
    r <- stack(r1,r2)

    # Sample size
    n=50

    # Random sample of raster  
    r.samp <- sampleRandom(r, size=(n+20), na.rm=TRUE, sp=FALSE, asRaster=FALSE) 
      dim( r.samp )[1]

   # Create a random sample of n size to subset r.samp
   #   (works with dataframe, matrix and sp objects)
   r.samp <- r.samp[sample( 1:dim(r.samp)[1], n),]
    dim ( r.samp )[1]

If you can read the raster into memory an approach in sp would be to use rgdal to create a SpatialGridDataFrame the coerce it into a SpatialPointsDataFrame so you can easily remove NA's and end up with a point object of your subsample. You can then sample subsequent rasters using this sp point object. The @data dataframe can be extract and coerced into a matrix for your purposes.

require(sp)
require(rgdal)
require(raster)

n=50 # Number of random samples

# Read raster data using rgdal, results in SpatialGridDataFrame 
r <- readGDAL(system.file("external/test.ag", package="sp")[1])
  class(r)
    spplot(r, "band1")

# Coerce into SpatialPointsDataFrame    
r <- as(r, "SpatialPointsDataFrame")      

# remove NA's   
r@data <- na.omit(r.pts@data)
  plot(r, pch=20)

# Create random sample. Object is a SpatialPointsDataFrame     
r.samp <- r[sample(1:dim(r)[1], n),]
  plot(r.samp, pch=20, col="red", add=TRUE)   
    class(r.samp)

#  Use r.samp sp object for additional sampling 
#    Add extra column and coerce to raster stack
r2 <- readGDAL(system.file("external/test.ag", package="sp")[1])
  r2@data <- data.frame(r2@data, band2=runif(dim(r2)[1]) ) 
    r2 <- stack(r2)

# Extract raster values using r.samp object
r.samp@data <- data.frame(r.samp@data, band2=extract(r2[[2]], r.samp))
  str(r.samp@data)

Best Answer

Related Solutions

[GIS] random sampling from spatial points in R

[GIS] Using sampleRandom() from large raster without NA values in R

Related Question