Python Raster Value – Updating Raster Value to Reflect Total Number of Points Within Cell Using Python

arcpyfeaturesgdalpythonraster

I have to do a task in Python/ArcPy/GDAL/whatever because I'm trying to automate this process for a hundred or so different raster/shapefile combos.

This is what I need to do:

I have an empty raster and a shapefile with thousands of points. I would like to update the value of each cell/pixel in the raster to reflect the number of times a unique point lands within that cell.

At first I was thinking something like this:

for cell in raster:
    for point in shapefile:
         if point intersects with (/is contained in) cell: cell_value + = 1

However, I don't know how to implement this because I don't see how to intersect a raster and a polygon. Converting the raster into polygons is out of the question because the rasters are very large with a lot of cells.

I'm not married to the idea of looping through both the individual cells and points because I feel like it's unnecessary, so if there is an ArcGIS tool(s) that I could manipulate, and/or another method in python I could use to reach my goal, I'm more than happy to try that out.

Best Answer

Using any of the tools you mentioned you can preprocess the points so that one feature (with a count) occurs in each raster cell. Converting that to a raster using the standard built-in method completes the task.

The advantage of this method is that it provides complete control over the conversion process, so you know exactly what is happening. Moreover, it easily generalizes: with only trivial modifications (at the summarize step) you can produce rasters whose values are any statistical summary of the points falling within each cell (of which the count is a simple example).

The preprocessing is relatively simple: you need to identify the cell in which any point (x,y) falls. The raster's extent is given by an origin (Ox, Oy) and corresponding cell sizes (h, k) (usually h=k). The (column, row) coordinates of (x,y)'s cell are

(i,j) = floor( (x-Ox)/h, (y-Oy)/k )

That requires two quick calculations to create new shapefile attributes i and j. Summarize the attribute table on the concatenation of i and j (which will identify the cells), retaining the total count (or any other summary) for each unique (i,j) pair, along with the values of i and j themselves. Convert the (i,j) attributes in the summary table back to coordinates of cell centers as

(x,y) = ( Ox + h*(i+1/2), Oy + k*(j+1/2) )

This produces a table with three attributes: x, y, and the summary value for the points corresponding to (x,y)'s cell. Proceed in any way you prefer to convert this into a raster.

Related Solutions

ArcGIS Desktop – Finding Nearest Raster Cell Value Based on Vector Point

A raster outline occupies a rectangle bounded on the lower left by the origin, having coordinates (say) (x0, y0), and on the upper right by (x1, y1): these are properties easily discovered of any raster. You can use this information to move any points outside the raster's extent onto the nearest point on the raster's boundary and extract the raster values at the new locations.

Tasks like this are often readily worked out by considering lower dimensions. A one-dimensional raster's extent is just an interval of numbers, say from x0 to x1 > x0, conventionally written [x0,x1]. Suppose you want to find the nearest point in this interval to a given number x. There are three cases: x < x0 (x is to the left), x0 <= x <= x1 (x is in the interval), and x1 < x (x is to the right). Clearly the nearest points are x0, x, and x1, respectively. A formula for this result is

x --> min(max(x,x0),x1),

as you can see by contemplating each of the three cases separately.

Because the extent of a raster is the Cartesian product of two intervals, [x0,x1] * [y0,y1], the same formula works in two dimensions. Just apply it to each interval. Thus the projected point's coordinates can be computed (in a Field Calculator, say) as

(x,y) --> ( min(max(x,x0),x1), min(max(y,y0),y1 ).

Here is this formula in action with 100 points randomly placed around and within a grid's extent:

The dashed lines visually associate points outside the extent to the locations where they will be moved.

Create a point layer from these computed coordinates and apply the Sample tool to extract the raster values. Join the results back to the original point layer.

As a practical matter, to avoid problems with floating point roundoff that might keep the new points from lying within the grid's extent, it would be wise to make x0 and y0 slightly bigger than the true origin (you might add half the cellsize to them) and, similarly, make x1 and y1 slightly smaller.

In summary, the workflow consists of calculating two fields (the new coordinates) in the original point layer's table, creating a point event layer from those new coordinates, running the Sample tool, and performing a database join (not a spatial join).

When you have many more vector points than there are cells in the raster, you should prefer the solution given by @celenius (who proposes converting the raster to points and using a spatial join). Normally, though, rasters have lots of cells--millions to billions--and the conversion to points is so time-consuming and disk-consuming that it should be approached with caution.

Alternatively, you could expedite Celenius' solution by using a raster calculation to make an integer grid with values only along its boundary. Converting that to a point layer is quick and easy because it will have few points. Let's call this the "boundary layer." Sample the grid at the boundary layer points, thereby copying the grid values into the boundary layer's attribute table. After running Sample with the original point layer, remove all points where a sample was successfully obtained. Spatially join the boundary layer to the remaining points to complete the sampling process.

(One way to make a grid with values only along its boundary is to apply a focal neighborhood operation that fails along the boundary: examples include finding slopes, hillshading, 3 x 3 neighborhood focal stats. This creates a grid with a one-cell-thick ring of NoData around its edge. Using IsNull and SetNull to detect NoData cells and convert between NoData and data cells produces a grid full of NoData values except around the boundary.)

[GIS] How to get X Y coordinates and cell value of each pixel in a raster using Python

Following @Dango's idea I created and tested (on small rasters with the same extent and cell size) the following code:

import arcpy, numpy

inRaster = r"C:\tmp\RastersArray.gdb\InRaster"
inRaster2 = r"C:\tmp\RastersArray.gdb\InRaster2"

##Get properties of the input raster
inRasterDesc = arcpy.Describe(inRaster)

#coordinates of the lower left corner
rasXmin = inRasterDesc.Extent.Xmin
rasYmin = inRasterDesc.Extent.Ymin

# Cell size, raster size
rasMeanCellHeight = inRasterDesc.MeanCellHeight
rasMeanCellWidth = inRasterDesc.MeanCellWidth
rasHeight = inRasterDesc.Height
rasWidth = inRasterDesc.Width

##Calculate coordinates basing on raster properties
#create numpy array of coordinates of cell centroids
def rasCentrX(rasHeight, rasWidth):
    coordX = rasXmin + (0.5*rasMeanCellWidth + rasWidth)
    return coordX
inRasterCoordX = numpy.fromfunction(rasCentrX, (rasHeight,rasWidth)) #numpy array of X coord

def rasCentrY(rasHeight, rasWidth):
    coordY = rasYmin + (0.5*rasMeanCellHeight + rasHeight)
    return coordY
inRasterCoordY = numpy.fromfunction(rasCentrY, (rasHeight,rasWidth)) #numpy array of Y coord

#combine arrays of coordinates (although array for Y is before X, dstack produces [X, Y] pairs)
inRasterCoordinates = numpy.dstack((inRasterCoordY,inRasterCoordX))


##Raster conversion to NumPy Array
#create NumPy array from input rasters 
inRasterArrayTopLeft = arcpy.RasterToNumPyArray(inRaster)
inRasterArrayTopLeft2 = arcpy.RasterToNumPyArray(inRaster2)

#flip array upside down - then lower left corner cells has the same index as cells in coordinates array
inRasterArray = numpy.flipud(inRasterArrayTopLeft)
inRasterArray2 = numpy.flipud(inRasterArrayTopLeft2)


# combine coordinates and value
inRasterFullArray = numpy.dstack((inRasterCoordinates, inRasterArray.T))

#add values from second raster
rasterValuesArray = numpy.dstack((inRasterFullArray, inRasterArray2.T))

Based on @hmfly code, you can have access to desired values:

(height, width, dim )=rasterValuesArray.shape
for row in range(0,height):
    for col in range(0,width):
        #now you have access to single array of values for one cell location

Unfortunately there's one 'but' - the code is right for NumPy arrays which can be handled by system memory. For my system (8GB), the largest array was about 9000,9000.

As my experience doesn't let me provide more help, you can consider some suggestions about dealing wiht large arrays: https://stackoverflow.com/questions/1053928/python-numpy-very-large-matrices

arcpy.RasterToNumPyArray method allows to specify the subset of raster converted to NumPy array (ArcGIS10 help page) what can be useful when chunking large dataset into submatrices.

Best Answer

Related Solutions

ArcGIS Desktop – Finding Nearest Raster Cell Value Based on Vector Point

[GIS] How to get X Y coordinates and cell value of each pixel in a raster using Python

Related Question