Raster – How to Rasterize Point Data into a Grid Ensuring One Point Per Cell

r, raster, vector-grid

I'm trying to create a raster in which each grid cell contains at most one point. The data looks like this:

> head(ras)
       lon    lat         z1    z2
1 267573.9 2633781 213.29545     6
2 262224.4 2643701  69.78261    15
3 263742.7 2670841  51.21951     1
4 259328.4 2739781 301.98413    29
5 264109.8 2463763 141.72414    11
6 255094.8 2063428  88.90244    35

z1 and z2 are measurements of two variables, which can be put into separate layers, i.e. layer 1 takes the z1 values and layer 2 takes the z2 values. The two layers use the same grid system. Empty cells can be assigned 0 or NA.

Best Answer

As mentioned, find the smallest distance between any two points using dist (on a matrix of the xy coordinates only) and use it as the cell diagonal, from which you derive the resolution. Using it as the diagonal rather than as the resolution is important (see later).

# load raster (this also attaches sp, which provides coordinates<-)
library(raster)

# create some random data
df <- data.frame(x = sample(0:100, 25), y = sample(0:100, 25),
                 z1 = sample(1:10, 25, replace = TRUE),
                 z2 = sample(100:1000, 25, replace = TRUE))

# use 'dist' to create a matrix of Euclidean distances between points.
# Must be done on the xy columns only, otherwise the z columns are treated as extra dimensions
dis <- dist(df[, c("x", "y")])

# find the minimum distance between points, this is your cell diagonal
cell_diag <- min(dis)

# convert the cell diagonal to the cell side, i.e. the resolution (Pythagoras)
cell_res <- sqrt((cell_diag^2) / 2)

# convert the data frame to spatial points; irregularly spaced points are easier to rasterize this way
pts <- df
coordinates(pts) <- ~x+y

# create a template raster:
# the extent covers the point extent
# the resolution is set by the min distance as derived from the dist matrix
r <- raster(ext = extent(pts), res = cell_res)

# create a blank stack and loop through your attributes (z1, z2)
st <- stack()
for (z in names(pts)) {
  rasOut <- rasterize(pts, r, field = z)
  st <- stack(st, rasOut)
}

# label each layer with the attribute it holds
names(st) <- names(pts)

The interesting thing about using the min distance between any 2 points as the cell diagonal, and not as the resolution, is the effect it has on cell size. The diagonal of a square is always about 41.4% longer than its side (a factor of sqrt(2)), so treating the min distance as the diagonal shrinks the cells significantly: the cell area is exactly half of what it would be if the min distance were the side.
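The figures above follow from simple geometry and can be checked with a few lines of base R. This is only a sketch; `d` is a made-up minimum distance, not a value from the question.

```r
# Hypothetical minimum distance between any two points
d <- 3

# Side of a square whose diagonal is d (the same quantity as sqrt(d^2 / 2))
side <- d / sqrt(2)

# How much longer the diagonal is than the side
ratio <- d / side

# Cell area with d as the diagonal, relative to the area with d as the side
area_ratio <- side^2 / d^2

print(c(ratio = ratio, area_ratio = area_ratio))
```

Whatever value you pick for `d`, `ratio` comes out at about 1.414 and `area_ratio` at 0.5.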

However, if you use the min distance as the resolution, you may occasionally end up with 2 points in one cell. Consider this toy example:

# data frame, just a few points
df <- data.frame(x = c(-2.2, 1, 4, 8, 10), y = c(1, 5, 4, 7, 2),
                 z = sample(1:10, 5, replace = TRUE))

# distance matrix, on the xy columns only
dis <- dist(df[, c("x", "y")])

# set the cell resolution as the min distance between 2 points
cell_res <- min(dis)

# convert to spatial points
pts <- df
coordinates(pts) <- ~x+y

# create blank raster
r <- raster(ext = extent(pts), res = cell_res)

# rasterize the z attribute
rasOut <- rasterize(pts, r, field = "z")

# plot and see the 2 points in 1 cell
plot(rasOut)
plot(pts, add = TRUE)

Because of how the extent and the resolution interact, the cells are now not small enough to guarantee that every point gets its own cell. With the min distance as the cell side, two points can be up to sqrt(2) times the min distance apart and still fall in the same cell.
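To make this concrete without plotting, here is a base-R sketch that bins the toy points into square cells by hand. It assumes the grid is anchored at the top-left corner of the point extent, which is my understanding of how raster lays out its cells; `cell_id()` is a helper made up for illustration, not a raster function.

```r
# Toy coordinates from the example above
x <- c(-2.2, 1, 4, 8, 10)
y <- c(1, 5, 4, 7, 2)

# Smallest distance between any two points
d <- min(dist(cbind(x, y)))

# Hypothetical helper: label each point with the "col_row" of the square cell
# it falls in, for a grid of side `side` anchored at (min(x), max(y))
cell_id <- function(x, y, side) {
  col <- floor((x - min(x)) / side)
  row <- floor((max(y) - y) / side)
  paste(col, row, sep = "_")
}

ids_side <- cell_id(x, y, d)            # min distance used as the cell side
ids_diag <- cell_id(x, y, d / sqrt(2))  # min distance used as the cell diagonal

anyDuplicated(ids_side) > 0   # TRUE: points 2 and 3 land in the same cell
anyDuplicated(ids_diag) > 0   # FALSE: every point has its own cell
```

Points 2 and 3, (1, 5) and (4, 4), are the closest pair, yet with the min distance as the cell side they share a cell. With the min distance as the diagonal they are split.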

I have to admit that when I ran the 2nd example a number of times with a greater number of random points, I don't think I ever saw this outcome reproduced, but the point is that it can occur. So it is better to use the 1st approach.

N.B. 1) I'm guessing your points cannot be coincident? Otherwise you have to decide how to handle those data points. 2) You might want to enlarge the extent so that no points lie right on the edges, but that's up to you.
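On the first note: coincident points would make the minimum distance 0, and with it the resolution, so it is worth checking for them before calling dist. A minimal base-R sketch with made-up data:

```r
# Made-up points; rows 1 and 3 share the same coordinates
df <- data.frame(x = c(1, 5, 1, 9), y = c(2, 2, 2, 7), z1 = c(10, 20, 30, 40))

# Flag every row whose xy pair has already appeared earlier in the data
is_dup <- duplicated(df[, c("x", "y")])

# One possible choice: keep only the first point at each location
df_unique <- df[!is_dup, ]

# The minimum distance is now greater than 0
min(dist(df_unique[, c("x", "y")]))
```

Keeping the first point is only one option. Averaging the values at each location, or keeping the maximum, may suit your data better.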

Related Solutions

Raster – Transforming Geostationary Satellite Image to Lon/Lat Using GDAL

Your attempt is bound to fail. If you look at the image, you see the data arranged as a circle, with black triangles in the corners of the square, where the satellite's view misses the Earth and looks out into space. In your test data, you see only NODATA -32768 for those parts of the image.

The extent is between +/-75 and +/-78, but these values are only reached in the middle of the edges. So you cannot reproject those black triangles to Earth surface coordinates.

UPDATE

The metadata of the HDF file clears up some mysteries:

Altitude=42164 
Ancillary_Files=MSG+0000.3km.lat

So the satellite height is the same as mentioned in http://geotiff.maptools.org/proj_list/geos.html, and I assume they took the same ellipsoid (not exactly WGS84).

With the help of http://www.cgms-info.org/documents/pdf_cgms_03.pdf and http://publications.jrc.ec.europa.eu/repository/bitstream/JRC52438/combal_noel_msg_final.pdf, I found that the size of 3712px is not the real extent covered by the data. That size corresponds to a satellite scanning angle of about +/-8.915 degrees, but the angle that was actually used is smaller.

Proj.4 calculates the extent by multiplying the satellite's scanning angle by the height above ground (see http://proj4.org/projections/geos.html). So, after a bit of trial and error, an extent of +/-5568000 m (3712 * 3000 m / 2, or 8.915 * pi / 180 * 35785831 m) fits the 3712px used in the 3-km-resolution HDF.
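The two ways of arriving at that extent can be compared with a quick calculation (R is used here purely as a calculator; the numbers are the ones quoted above).

```r
# Half-width of the image: 3712 pixels of 3000 m each, divided by 2
half_width_px <- 3712 * 3000 / 2

# Scanning angle (in radians) multiplied by the satellite height above ground (m)
half_width_angle <- 8.915 * pi / 180 * 35785831

half_width_px       # 5568000
half_width_angle    # about 5568136

# Relative difference between the two estimates
(half_width_angle - half_width_px) / half_width_px
```

The two values differ by well under 0.01%, which is why the rounded +/-5568000 m is used in the command.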

So the correct translation commands are:

gdal_translate -a_srs "+proj=geos +h=35785831 +a=6378169 +b=6356583.8 +no_defs" -a_ullr -5568000 5568000 5568000 -5568000 HDF5:"SEV_AERUS-AEROSOL-D3_2006-01-01_V1-03.h5"://ANGS_06_16 temp.tif
gdalwarp -t_srs EPSG:4326 -wo SOURCE_EXTRA=100 temp.tif output.tif

And the result looks good.

As an alternative, you can take the lat and lon subdatasets from the file MSG+0000.3km.hdf at http://www.icare.univ-lille1.fr/archive/?dir=GEO/STATIC/

[GIS] Assign values to a subset of cells of a raster

Here's a way. First I'll create a fake data set.

library(raster)
r <- raster(matrix(1:30, 5, 6))

## this is the full data set in data frame form
dfull <- as.data.frame(r, xy = TRUE)

## this is the partial data set, only the points with a valid value
## row-order doesn't matter, but we keep it for illustration
set.seed(10)
dpart <- dfull[sort(sample(seq_len(nrow(dfull)), 22)), ]

Now we need raster's cell-abstraction tools. Here we can treat r as a raw specification of the original raster, and in fact we could create it from scratch if needed. But we already have it, so we use it.

rspec <- raster(r)  ## this drops the data, keeps the structure

The next step fills the raster with missing values. This is needed because raster doesn't truly have "sparse forms": a raster is either empty or full, and we cannot put values piecewise into an empty raster. It's all or nothing until it's no longer empty.

(Note that sparse forms are supported completely by this approach, but via a level of abstraction that is the responsibility of the user)

rspec[] <- NA_real_

Now we need an index into the "structure of the raster" for our points.

## the column names "x" and "y" come from as.data.frame(r, xy = TRUE) above,
## and might be different for a different input
cells <- cellFromXY(rspec, as.matrix(dpart[, c("x", "y")]))

Now, put the values in the data frame into the otherwise "filled with missing" raster.

rspec[cells] <- dpart$layer
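For readers new to cell indices, the idea can be mimicked in base R. The sketch below assumes raster's convention of numbering cells row by row starting from the top-left corner, and uses the same 5 x 6 grid on the unit square as `r` above. `cell_from_xy()` is an illustrative stand-in written for this example, not the raster function itself.

```r
# Grid geometry matching raster(matrix(1:30, 5, 6)): 5 rows, 6 columns on [0, 1] x [0, 1]
n_row <- 5
n_col <- 6

# Illustrative stand-in for cellFromXY (no handling of points outside or on the grid edge)
cell_from_xy <- function(x, y) {
  col <- floor(x * n_col)          # 0-based column, counted from the left
  row <- floor((1 - y) * n_row)    # 0-based row, counted from the top
  row * n_col + col + 1            # 1-based cell number, row-major
}

# Three known points (cell centres) with their values
px   <- c(0.5 / 6, 5.5 / 6, 2.5 / 6)
py   <- c(0.9, 0.1, 0.5)
vals <- c(11, 22, 33)

cells <- cell_from_xy(px, py)      # 1, 30, 15

# Start from an all-missing grid, then fill only the known cells
values <- rep(NA_real_, n_row * n_col)
values[cells] <- vals
```

The last two lines are the plain-vector equivalent of `rspec[] <- NA_real_` followed by `rspec[cells] <- dpart$layer`.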

All this is illustrated in full here: http://rpubs.com/cyclemumner/294656

It's a very powerful approach, but it's not widely understood and it's easy to get it wrong, so do use with caution and take time to practice and understand it.

Happy to help if it doesn't make sense.
