So I've created a file that reads into a data frame in the same way as yours:
> str(v2)
'data.frame': 360 obs. of 720 variables:
BUT a data.frame isn't really the right thing here. It's really meant for record-oriented data, where each row is a record and each column is a potentially different variable for that record (e.g. each row is a person, and the columns are name, age, height, etc.).
So you really only need to scan the data in as one long vector and feed it to a raster.
Step 1, define an empty raster of the right size and shape (note I'm assuming the raster covers the whole world, so the limits are not the cell centres):
> library(raster)
> m2 = raster(nrow=360, ncol=720, xmn=-180, xmx=180, ymn=-90, ymx=90)
Step 2, read numeric values into the raster data slot:
> m2[]=scan("d.txt",what=1)
Read 259200 items
And give it a projection if needed:
> projection(m2)="+init=epsg:4326"
> plot(m2)
If you want to check that the resolution and the cell centres are as expected, use these functions:
> res(m2)
[1] 0.5 0.5
> xFromCol(m2,1:10)
[1] -179.75 -179.25 -178.75 -178.25 -177.75 -177.25 -176.75 -176.25 -175.75
[10] -175.25
> yFromRow(m2,1:10)
[1] 89.75 89.25 88.75 88.25 87.75 87.25 86.75 86.25 85.75 85.25
which shows the resolution is half a degree and the cell centres (or at least the first 10) are at those specified coordinates.
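Incidentally, if you've already read the data into a data frame like v2 above, you don't need scan() at all; you can build the raster straight from the data frame. A minimal sketch (m3 is just my name for it, and it assumes row 1 of the file is the northernmost row of the grid):
m3 <- raster(as.matrix(v2),
             xmn = -180, xmx = 180, ymn = -90, ymx = 90,
             crs = "+init=epsg:4326")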
Depending on whether you want a simple mean or a spatially weighted mean, there are a couple of things you can do using the rgeos and raster packages.
library(rgeos)
proj4string(grid) <- proj4string(nuts) # I assumed these were the same projection?
First, find out which grid cells intersect your NUTS polygons:
grid_nuts <- gIntersects(grid,nuts,byid = TRUE)
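A quick sanity check on grid_nuts, using only the objects above:
dim(grid_nuts)     # one row per NUTS polygon, one column per grid cell
rowSums(grid_nuts) # how many grid cells touch each NUTS polygon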
Then you can use the apply() function to calculate the mean, min, and max of your value. Note, however, that for the mean this strategy gives equal weight to every grid cell that touches the NUTS polygon, no matter how small the overlap.
nuts@data$average_value <- apply(grid_nuts,1,function(x) mean(grid@data$value[x]))
nuts@data$min_value <- apply(grid_nuts,1,function(x) min(grid@data$value[x]))
nuts@data$max_value <- apply(grid_nuts,1,function(x) max(grid@data$value[x]))
That might work for you, but if you're bothered that a grid cell overlapping a NUTS polygon by only a couple of square kilometres counts the same as one lying entirely inside it, then you need a spatially weighted mean.
Using the intersect() function from the raster package, you get the intersection of grid and nuts, and it merges the data frames (which means grid cells shared between two NUTS polygons are split in two).
library(raster)
new_spdf <- intersect(grid,nuts)
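A quick way to see what intersect() did, sketched with the objects above:
length(grid)     # number of original grid cells
length(new_spdf) # larger: cells shared between NUTS polygons are now split
names(new_spdf)  # attribute columns of grid and nuts merged together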
Then, as before, find out which of the new grid cells overlap each NUTS polygon:
grid_nuts <- gIntersects(new_spdf,nuts,byid = TRUE)
There's probably a more elegant way to arrange this, but with this you can take all the values in a particular NUTS polygon, multiply each by the area of its corresponding new grid cell, sum them, and divide by the total area of the NUTS polygon:
EDIT: added the if statement to handle NUTS polygons that don't overlap any grid cell.
for(i in 1:length(nuts)){
  if(any(grid_nuts[i,])){
    # area-weighted mean: sum(value * area of intersected piece) / total polygon area
    nuts@data$average_spatial_value[i] <- sum(new_spdf@data$value[grid_nuts[i,]] *
                                              gArea(new_spdf[grid_nuts[i,],], byid = TRUE)) /
                                          gArea(nuts[i,])
  } else {
    nuts@data$average_spatial_value[i] <- NA
  }
}
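As an aside: if your grid values live in a RasterLayer rather than in polygons, raster::extract can do the area weighting in one call. A minimal sketch, assuming a hypothetical RasterLayer r holding the same values as grid (avg_weighted is just an illustrative column name):
nuts@data$avg_weighted <- extract(r, nuts, fun = mean, weights = TRUE, na.rm = TRUE)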
If you want to ignore values lower than 0, you need to mask those values off. You can use mask with a logical test.
I don't know why you want to calculate a mean by extracting values with points: when you extract by point, you just get the value at that point's specific coordinate. Only for polygons or lines is the summary function applied. But to work with your example, I'll create a dummy raster and points to apply extract to:
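A minimal sketch of that (the raster, the points, and the values are all invented here):
library(raster)
set.seed(1)
r <- raster(nrow = 10, ncol = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
r[] <- runif(ncell(r), -5, 5)  # dummy values, some of them below 0
pts <- SpatialPoints(cbind(runif(5, 0, 10), runif(5, 0, 10)))
extract(r, pts)  # the raw cell value at each point, negatives included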
To omit those values, apply a mask selecting all values greater than 0. The result is a raster of 1s and 0s, so use 0 as the mask value:
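Continuing the same sketch:
m <- r > 0                       # 1 where the value is > 0, 0 elsewhere
r2 <- mask(r, m, maskvalue = 0)  # cells where the mask is 0 become NA
extract(r2, pts)                 # points over negative cells now return NA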