[GIS] How to get rid of edge effects while using focal in R to smooth a raster

r raster

I am trying to use R with the focal function from the raster package to smooth raster images.

I am working with radar images that have already been preprocessed. As part of that preprocessing, some calculations introduce unrealistic values, which are then deleted from the raster.
So basically, I am working with images whose values range from -2 (black) to 7 (white), and I want to smooth over the small data gaps.

To do so, I tried the focal function from the raster package with the following code:

raster1 <- raster("example_raster.tif")

mat <- matrix(1/25,ncol=5, nrow=5)
raster_res <- focal(raster1, mat, FUN="mean", na.rm=T) 

writeRaster(raster_res, filename="smooth_raster.tif", format="GTiff")

The problem is that the NAs seem to introduce some kind of edge effect. na.rm is set to TRUE to exclude the NAs from the calculation, but if you look at the eastern edge, or at how spotty the areas around former NA values appear, it looks like R still includes them in the moving-window calculation and lets them influence the surrounding values.
Here is what the output looks like:

To check whether this is a general problem with focal filters, I also used r.neighbors from the QGIS processing toolbox with a window size of 5 to smooth my data, and it does exactly what I expected. Unfortunately I can't post a third link because I am new, but the raster is very smooth and does not have the spotty appearance of the output R produces.

I already tried the pad and padValue arguments of focal but did not manage to make them work the way I want. Unfortunately, I don't know enough about GRASS to understand what exactly r.neighbors does differently from focal.
Since this will be part of a script that automatically processes several rasters, doing everything manually with QGIS is not really an option.

Best Answer

There are a couple of things wrong with this:

mat <- matrix(1/25,ncol=5, nrow=5)
raster_res <- focal(raster1, mat, FUN="mean", na.rm=T)

The argument that supplies the function is called fun, not FUN. You give each value a weight of 1/25 and then want to use "mean". With those weights you should use "sum" instead (which is the default and, because you wrote FUN instead of fun, is what was actually used anyway). However, since you have missing values and are using na.rm=TRUE, the divisor stays at 25 even when fewer cells contribute, which pulls the values next to NAs towards zero. You really need to give each value a weight of 1 and then use mean.
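The difference between the two weightings can be checked with plain arithmetic. The following is a minimal base R sketch, not a call into focal itself: the 5x5 window w and its values are made up for illustration, and it assumes that the weights are multiplied with the cell values before the function is applied.

```r
# A made-up 5x5 window with a constant value of 4, five cells missing
w <- matrix(4, nrow = 5, ncol = 5)
w[c(1, 7, 13, 19, 25)] <- NA

# Weights of 1/25 combined with a sum: the missing cells are dropped,
# but the divisor is still 25, so the result shrinks towards zero
biased <- sum(w * (1 / 25), na.rm = TRUE)

# Weights of 1 combined with a mean: the divisor is the number of
# non-NA cells, so the constant value is recovered
unbiased <- mean(w * 1, na.rm = TRUE)

print(c(biased = biased, unbiased = unbiased))
```

With 20 of 25 cells present, the first variant gives 20 * 4 / 25 = 3.2 instead of 4, which would explain the darker, spotty cells around gaps and along the raster edge.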

Also, as you want to fill in missing values, not change existing values, I would use NAonly = TRUE, together with pad=TRUE (to pad virtual rows and columns with NAs outside of the raster).
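What filling only the gaps with a padded window amounts to can be sketched in base R on a plain matrix. This is an illustration of the idea only and not how the raster package implements it; the function name fill_gaps, the half-width argument k and the test matrix are hypothetical.

```r
# Fill NA cells of a numeric matrix with the mean of the non-NA values in
# a (2k+1) x (2k+1) window; cells that already have a value are untouched.
fill_gaps <- function(m, k = 2) {
  nr <- nrow(m)
  nc <- ncol(m)
  # Surround the matrix with k rows/columns of NA (the role of padding)
  padded <- matrix(NA_real_, nrow = nr + 2 * k, ncol = nc + 2 * k)
  padded[(k + 1):(k + nr), (k + 1):(k + nc)] <- m
  out <- m
  gaps <- which(is.na(m), arr.ind = TRUE)
  for (g in seq_len(nrow(gaps))) {
    i <- gaps[g, 1]
    j <- gaps[g, 2]
    # Cell (i, j) sits at (i + k, j + k) in the padded matrix
    win <- padded[i:(i + 2 * k), j:(j + 2 * k)]
    if (any(!is.na(win))) out[i, j] <- mean(win, na.rm = TRUE)
  }
  out
}

# Made-up data with a few gaps, one of them in a corner
m <- matrix(as.numeric(1:100), nrow = 10)
m[c(1, 34, 35, 67, 91)] <- NA
filled <- fill_gaps(m)
```

Because the padding consists of NAs and the mean ignores them, a gap in a corner is simply estimated from the fewer real neighbours it has.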

Here is an example:

library(raster)

# example data
logo <- raster(system.file("external/rlogo.grd", package="raster")) 
set.seed(0)
i <- sample(ncell(logo), 200)
logo[i] <- NA
plot(logo)

m <- matrix(1, ncol=5, nrow=5)
r <- focal(logo, m, fun="mean", na.rm=TRUE, NAonly=TRUE, pad=TRUE) 
plot(r)

To see for which cells the values were estimated:

plot(as(reclassify(is.na(logo), cbind(0, NA)), 'SpatialPolygons'), add=TRUE)

Also, if you want to write the raster to a file, you can do that in one step by passing filename to focal instead of calling writeRaster afterwards:

r <- focal(logo, m, fun="mean", na.rm=TRUE, NAonly=TRUE, pad=TRUE, filename="smooth.tif", overwrite=TRUE)

Related Solutions

[GIS] Sampling random points from large raster file with replacement using R

Try the following code. The idea is to create a set of [row, col] indicators based on the RasterStack dimensions (rows and cols). Then you can easily use these indicators on the stack to subset all values.

The function sampleStack below takes a RasterStack and the number n of values to sample, and returns a data frame with the [row, col] positions and the values extracted from each layer.

library(raster)
# Generate raster layers

r <- raster(matrix(rnorm(100, 0, 1), nrow = 10))
for (i in 1:5) {r <- stack(r, raster(matrix(rnorm(100, 0, 1), nrow = 10)))} # run 5 times

sampleStack <- function(r, n) {
  rowSample <- sample(1:r@nrows, size = n, replace = TRUE)
  colSample <- sample(1:r@ncols, size = n, replace = TRUE)
  pairs <- data.frame("rowInd" = rowSample, "colInd" = colSample)
  out <- as.data.frame(cbind(pairs, as.data.frame(t(apply(pairs, MARGIN = 1, function(x) {return(r[x[1],x[2]])})))))
  colnames(out)[3:ncol(out)] <- names(r)
  return(out)
}

# Example Run
sampleStack(r = r, n = 14)

   rowInd colInd   layer.1.1  layer.2.1   layer.1.2  layer.2.2    layer.1    layer.2
1       5      4 -0.09678111  1.6844843  0.82574090  0.5328165  0.66721846  0.1936958
2       4      8  0.64982724  0.5467126  1.59975344  0.2757094  0.94797866 -0.1798319
3       3      4 -1.35927393 -1.2774878  0.77616160  0.1429519  1.10396643  1.1444793
4       7      9  1.45380719 -0.6128730 -0.53011041 -0.3138787 -0.86586255  0.9056694
5       6      2 -0.49808353 -0.1272448 -1.96004940 -0.5663870 -0.07217682 -1.7568981
6       4     10  0.01607546 -0.5113896  1.19713933 -1.6322803 -1.04051134  0.7135125
7       2      4  1.10798593 -0.2610036  0.56009222  2.4618433 -0.44356484  1.0332427
8       3      1  1.70168812  1.7643488 -0.09976064 -0.2386893 -1.04266622  0.2019014
9      10      3 -1.82791351  1.5666126 -1.79275437  0.2946007  0.96467732 -0.6951626
10      6      4  0.54795051  0.1378088 -1.53793046 -0.5989934 -1.64424273 -0.3463153
11      1      2 -0.86287672 -0.2408750 -0.81438516 -2.0200205  1.16523355  0.4052408
12      2      2 -0.47916254 -0.6778470  0.79086436 -0.5692255  0.96205715 -0.5146865
13      5      4 -0.09678111  1.6844843  0.82574090  0.5328165  0.66721846  0.1936958
14      6      1 -1.04973148 -0.3973457 -0.24445969  0.4061588 -1.50143806  0.3896232
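The same indexing idea can be tried without the raster package. The sketch below uses a plain 3-D array as a hypothetical stand-in for the stack (rows x cols x layers); sample_array and the layer names are made up for illustration.

```r
set.seed(42)
# Hypothetical stand-in for a 6-layer stack: a 10 x 10 x 6 array
a <- array(rnorm(600), dim = c(10, 10, 6))

sample_array <- function(a, n) {
  # Draw row and column positions independently, with replacement
  rows <- sample(nrow(a), size = n, replace = TRUE)
  cols <- sample(ncol(a), size = n, replace = TRUE)
  n_layers <- dim(a)[3]
  # For each layer, pull the values at all sampled (row, col) pairs
  vals <- vapply(seq_len(n_layers),
                 function(l) a[cbind(rows, cols, l)],
                 numeric(n))
  vals <- matrix(vals, nrow = n)
  colnames(vals) <- paste0("layer.", seq_len(n_layers))
  data.frame(rowInd = rows, colInd = cols, vals)
}

s <- sample_array(a, n = 14)
print(head(s, 3))
```

Because the positions are drawn with replacement, the same [row, col] pair can appear more than once, as rows 1 and 13 do in the output above.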

[GIS] Rasters in R – merge taking mean, excluding NAs. Overlay and Mosaic both giving funny results

I think the problem with:

water_mean_overlay <- do.call(overlay, c(water_rasters, fun = mean, na.rm = T))

is that na.rm=T is being passed to overlay, and the help for overlay doesn't show an na.rm argument, so it is probably getting gobbled into the ... argument and then I'm not sure what's happening to it. Maybe it gets treated as another raster of all "TRUE" values? Or it just gets lost. Anyway...

The fix is to write a function that wraps mean but passes na.rm=TRUE:

> mean_narm = function(x,...){mean(x,na.rm=TRUE)}
> water_mean_overlay <- do.call(overlay, c(water_rasters, fun = mean_narm))
> plot(water_mean_overlay,col=rainbow(4))

but beware that mean_narm(c(NA,NA)) is NaN, and that might not be expected, and you might want to set all NaN back to NA in your results raster.
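One way to avoid the NaN is a wrapper that returns NA when every input is missing. This is a base R sketch only: the name mean_or_na is hypothetical, and whether it can be dropped into overlay in place of mean_narm is an assumption that has not been tested here.

```r
# Mean that ignores NAs but returns NA (not NaN) when nothing is left
mean_or_na <- function(x, ...) {
  if (all(is.na(x))) return(NA_real_)
  mean(x, na.rm = TRUE)
}

all_missing <- mean(c(NA, NA), na.rm = TRUE)  # plain mean on all-NA input
guarded     <- mean_or_na(c(NA, NA))          # guarded version
partial     <- mean_or_na(c(1, NA, 3))        # NAs dropped as usual

print(is.nan(all_missing))
print(is.nan(guarded))
```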
