Raster – How to Rasterize Point Data into a Grid Ensuring One Point Per Cell

rrastervector-grid

I'm trying to create a raster whose grid cell takes at most one point. The data looks like this:

> head(ras)
       lon    lat         z1    z2
1 267573.9 2633781 213.29545     6
2 262224.4 2643701  69.78261    15
3 263742.7 2670841  51.21951     1
4 259328.4 2739781 301.98413    29
5 264109.8 2463763 141.72414    11
6 255094.8 2063428  88.90244    35

z1/z2 are measurements of two variables, which can be put into separate layers, i.e. layer 1 takes z1 values, and layer 2 takes z2 values. The two layers use the the same grid system. Empty cells can be assigned with 0 or NA.

Best Answer

as mentioned, find your smallest distance between points using dist (against a matrix of xy) and use this as the cell diagonal, to derive the resolution. This bit is important (see later)

# create some random data
df <- data.frame(x=sample(0:100,25),y=sample(0:100,25), z1 = sample(1:10,25,replace = T), z2 = sample(100:1000,25,replace = T))

# use 'dist' to create a matrix of Euclidean distances between points. 
# Must be done on an xy matrix/data.frame otherwise z will be used as height
dis <- dist(df[,c(1,2)])

# find your minimum distance between points, this is your cell diagonal
diag <- min(dis)

# convert the cell diagonal to resolution (trig)
res <- sqrt((diag^2)/2)

# convert data frame of points to points, easier to rasetrize as non-regular
pts <- df
coordinates(pts) <- ~x+y

# create a template raster:
# the extent covers the point extent
# the resolution is set by the min distance as derived from the dist matrix
r <- raster(ext=extent(pts),res=res)

# create a blank stack and loop through your attributes
st <- stack()
for(z in names(pts)){
  rasOut<-rasterize(pts, r, z)
 st <- stack(st,rasOut) 
}

The interesting thing about using the min distance between any 2 points as the cell diagonal, and not the resolution, is the effect it has on cell size. The cell diagonal is always 41.4% bigger than the side resolution so you shrink the cell size quite significantly (by exactly half in terms of area), which is quite a lot.

However if you use the min distance as the resolution, you may, on rare occasion, end up with 2 points in one cell. Consider this toy e.g.;

# data frame, just a few points 
df <- data.frame(x=c(-2.2,1,4,8,10),y=c(1,5,4,7,2), z = sample(1:10,5,replace = T))

# distance matrix
dis <- dist(df[,c(1,2)])

# set the cell resolution as the min distance between 2 points
res <- min(dis)

# convert to spatial points
pts <- df
coordinates(pts) <- ~x+y

# create blank raster
r <- raster(ext=extent(pts),res=res)

# rasterize
rasOut <- rasterize(pts, r, pts$z)

# plot and see the 2 points in 1 cell
plot(rasOut)
plot(pts,add=T)

enter image description here

Just as a quirk of the extent + resolution, the cell size is now not big enough to guarantee all points are in their own cell, due to the fact min distance has been used as the resolution.

I have to admit when i ran the 2nd example a number of times with a greater number of random points etc, i don't think i ever saw this theoretical outcome reproduced but the point is, it can occur. So it is better to use the 1st approach.

N.B. 1) I'm guessing your points cannot be coincident? otherwise you've got to decide on how to work these data points. 2) You might want to extend your extent so some points dont lie right on the edges but that's up to you

Related Question