[GIS] Sampling random points from large raster file with replacement using R

rrandomraster

I have numerous large stacked raster file with multiple layers and I would like to select 250 random points and extract the data values from all layers for these points.

I had been using the raster package sampleRandom(raster,n, na.rm=TRUE, xy=TRUE) to select random points before extracting the data from my raster object but this works without replacement. I, however, would like to sample random points from the raster files with replacement as can be done with the r base package sample(x, size, replace = TRUE, prob = NULL).

I had considered just running a loop using sampleRandom so that each point has a chance of being sampled for each of the 250 runs;

samp_all <- matrix(, nrow = 250, ncol = 10)
for (i in 1:250)
{samp <- sampleRandom(layer1, 1, xy = TRUE, na.rm = TRUE)[, -3]
samp_all[i,] <- samp}
samp2 <- extract(stack_all, samp_all)

But this seems to run pretty slowly and I have quite a lot of files I would like to run this on.

Is there a simpler way to randomly sample points from raster layers with replacement that would allow me to extract data from my raster?

Best Answer

Try the following code. The idea is to create a set of [row, col] indicators based on the RasterStack dimensions (rows and cols). Than you can easily used these indicators on the stack to subset all values.

The function below sampleStack gets a RasterStack and n number of values to sample, and gives a data frame with [row, col] positions and values extracted by layer.

library(raster)
# Generate raster layers

r <- raster(matrix(rnorm(100, 0, 1), nrow = 10))
for (i in 1:5) {r <- stack(r, raster(matrix(rnorm(100, 0, 1), nrow = 10)))} # run 5 times

sampleStack <- function(r, n) {
  rowSample <- sample(1:r@nrows, size = n, replace = TRUE)
  colSample <- sample(1:r@ncols, size = n, replace = TRUE)
  pairs <- data.frame("rowInd" = rowSample, "colInd" = colSample)
  out <- as.data.frame(cbind(pairs, as.data.frame(t(apply(pairs, MARGIN = 1, function(x) {return(r[x[1],x[2]])})))))
  colnames(out)[3:ncol(out)] <- names(r)
  return(out)
}

# Example Run
sampleStack(r = r, n = 14)

   rowInd colInd   layer.1.1  layer.2.1   layer.1.2  layer.2.2    layer.1    layer.2
1       5      4 -0.09678111  1.6844843  0.82574090  0.5328165  0.66721846  0.1936958
2       4      8  0.64982724  0.5467126  1.59975344  0.2757094  0.94797866 -0.1798319
3       3      4 -1.35927393 -1.2774878  0.77616160  0.1429519  1.10396643  1.1444793
4       7      9  1.45380719 -0.6128730 -0.53011041 -0.3138787 -0.86586255  0.9056694
5       6      2 -0.49808353 -0.1272448 -1.96004940 -0.5663870 -0.07217682 -1.7568981
6       4     10  0.01607546 -0.5113896  1.19713933 -1.6322803 -1.04051134  0.7135125
7       2      4  1.10798593 -0.2610036  0.56009222  2.4618433 -0.44356484  1.0332427
8       3      1  1.70168812  1.7643488 -0.09976064 -0.2386893 -1.04266622  0.2019014
9      10      3 -1.82791351  1.5666126 -1.79275437  0.2946007  0.96467732 -0.6951626
10      6      4  0.54795051  0.1378088 -1.53793046 -0.5989934 -1.64424273 -0.3463153
11      1      2 -0.86287672 -0.2408750 -0.81438516 -2.0200205  1.16523355  0.4052408
12      2      2 -0.47916254 -0.6778470  0.79086436 -0.5692255  0.96205715 -0.5146865
13      5      4 -0.09678111  1.6844843  0.82574090  0.5328165  0.66721846  0.1936958
14      6      1 -1.04973148 -0.3973457 -0.24445969  0.4061588 -1.50143806  0.3896232