[GIS] better way to convert Raster to Point data? Don’t need one point for every cell

convertpointrasterresolution

I need to convert raster data to point data so I can use the points for interpolation via kriging. My problem is that my resolution is 5 ft and I'm working on a whole city. The data I am converting to points is grouped by zone with No Data values between; most/all of the points in each zone have the same value. I need the high resolution because some zones are linear, just 2-3 cells wide and about 20 cells long, while others are huge behemoths. I need to capture the smaller zones but I don't need 2000 points in the larger zones; because the values are all the same, I might need 10 or 20. I experimented with lowering the resolution to 50 and 100 ft but I lose much of the smaller zones. My raster is floating data so converting to polygons to do any other vector stuff isn't possible, I don't think.

Am I doomed to have a billion points and manually remove ones from the big data chunks?

What kind of processing time am I looking at if I do 5 ft Raster to Point and have one point per cell and I interpolate those? It already took several hours to interpolate between 10 points via kriging over the whole area.

I would not like to sacrifice my data resolution for the small zones if I can help it. They are very important and represent the variability I need to capture.

Best Answer

Using GDAL & Python, you could split the city zones into separate rasters, and then convert to comma-separated xyz points using gdal2xyz.py. The skip option allows you to control your sampling frequency (i.e. how many raster cells are converted to points). For example, a skip factor of 3 means that you are converting one in every three pixels:

gdal2xyz.py -skip 3 -csv InputRasterName OutCSVName

You could then use pandas to remove the nodata points:

import pandas

InputCSV = 'xyz.csv'
OutputCSV = 'xyz_v2.csv'

# Read CSV
Data = pandas.read_csv(InputCSV, sep=',', header=None, index_col=None)

# Remove no data (e.g. zero) points in your z values 
NewData = Data[Data[2] != 0] 

# Export sorted data to CSV
NewData.to_csv(OutputCSV, header=['x','y','z'], index=False)

You can then import the CSV into QGIS or ArcMap and convert the data to an ESRI shapefile or any other vector format.

Something similar can be achieved in R by using the raster package:

# Open raster
r <- raster("ImageName")

# Convert raster cells > 0 to points
points <- rasterToPoints(r, fun=function(r){r>0}, spatial=FALSE)

# another example, raster cells = 1 to points
points1 <- rasterToPoints(r, fun=function(r){r==1}, spatial=FALSE)

# another example, raster cells = 2 to points
points2 <- rasterToPoints(r, fun=function(r){r==2}, spatial=FALSE)

# take a random sample of 100 points/pixels
sample <- points[sample(1:nrow(points), 100, replace=FALSE),]
sample1 <- points1[sample(1:nrow(points1), 100, replace=FALSE),]
sample2 <- points2[sample(1:nrow(points2), 100, replace=FALSE),]

# Export points to csv
write.table(sample, file="Points.csv", row.names=FALSE, na="", col.names=TRUE, sep=",")
write.table(sample1, file="Points_1.csv", row.names=FALSE, na="", col.names=TRUE, sep=",")
write.table(sample2, file="Points_2.csv", row.names=FALSE, na="", col.names=TRUE, sep=",")
Related Question