[GIS] ny efficient way to create a raster with pixel value representing a unique id calculated based on pixel lat/lon

cgdalpythonraster

I have 10m raster for each state in US. Current pixel value represents something. I want to have 'same' raster but pixel size representing a unique id calculated based on pixel lat and lon. I am wondering whether there is any efficient way to do it, ArcGIS existing function or python C# programming are both fine.

I cannot do anything with Raster Calculator in ArcGIS since the pixel value of current raster has nothing to do with unique id or lat/lon.

So far, I tried to create lat/lon pair and calculate such id with programming. But in order keep a connection with the raster, I also read current pixel value out. So I have one record for one pixel, formatted as lon, lat, pixelVal, uniqueId, and I write it into a csv file. Since for 10m raster, the number of pixels is very very large. I am wondering whether I can do it on raster level, instead of using this way. Thanks in advance!

Best Answer

There are many ways to assign unique identifiers to raster cells. Some are listed below. Please note that although unique identifiers can be efficiently generated, any algorithm that requires unique identifiers for cells in large rasters (with many millions of more of cells) is itself going to be inefficient; in many cases it likely could be replaced by an algorithm not requiring every cell to have an identifier. Note, too, there is a natural limit to the size of such a grid, because even 32-bit grids can support at most 2^32-1 (about four billion) unique values. Methods to overcome that limit (without resorting to 64-bit grids) are given at the end.

Using a grid of column identifiers I (with values running from 0 up to one less than the number of columns, m) and a similar grid of row identifiers J (with values from 0 through n-1), compute
```
n * "I" + "J"
```
A convenient alternative is to round n up to the next power of ten equal to or exceeding n. The row and column coordinates can then be read directly off the point identifiers. For instance, with n = 12796 and m = 8092, use the formula 100000 * I + J. (Because the largest possible value of 809112795 will not overflow a signed 32-bit integer, this will work.) For instance, if a point has the identifier 49160008 it must have (row, column) coordinates (491, 60008) found by splitting the identifier into its last five digits and the remaining prefix thereof.

Column and row identifier grids can be computed (in one single calculation each) using the methods described at How do I calculate a weighted centroid with respect to a raster, e.g. population weighted centroid.
For smallish grids, simply use Combine("I", "J"). This operation will create internal unique pixel identifiers.
For very small grids (with fewer than 2^16 cells or so), generate a grid of uniformly distributed integers between 0 and the largest value possible (2^31-1 usually). Query the attribute table for counts larger than 1. If they exist, just generate a new random grid, repeating until all values are unique.
Don't even bother. If you ever need to identify a pixel, extract the values from the two grids I and J. Now there is no practical limit to the size of the grid--it need only have fewer than four billion rows and four billion columns.
The roles played by I and J can be replaced by grids of X and Y coordinates, respectively, but (a) the calculation in (1) is more delicate because it will be difficult to avoid potential collisions of IDs and (2) when the cellsize is very small (such as a few meters or less) single-precision floats usually won't distinguish all neighboring cells, so double precision will be necessary.
X and Y coordinate grids can themselves be replaced by other grids. Good ones are (signed) distances from appropriate curves: often two such distances suffice to uniquely identify every point close to the intersection of those curves. This provides a way to use non-Cartesian coordinate systems (such as polar or hyperbolic coordinates).

Related Solutions

[GIS] Find index location of a lat/lon point on a raster grid in ArcPy

It's quite simple unless the raster is rotated. You just need to know the x,y coordinates of the point, the pixel size (width and height) and the coordinates of the upper left (xmin,ymax) of the raster.

import arcview,arcpy

def map_to_pixel(point_x, point_y, cellx, celly, xmin, ymax):
    col = int((point_x - xmin) / cellx)
    row = int((point_y - ymax) / -celly)
    return col,row

point_x=your_x_coordinate
point_y=your_y_coordinate

dsc=arcpy.Describe(your_raster)

cellx=dsc.MeanCellHeight
celly=dsc.MeanCellWidth

xmin=dsc.extent.XMin
ymax=dsc.extent.YMax

col,row=map_to_pixel(point_x, point_y, cellx, celly, xmin, ymax)

Pixels are numbered from the top left.

[GIS] How to extract value from a raster based on lat and lon

To get the appropriate cell coordinates from your latitude and longitude you need to know the coverage of the netCDF file (for example, the coordinate of the upper left cell and the size of the cells in the x and y directions). @Luke 's solution below works for a srtaight x/y raster with no cell rotation. If you don't have that, you could look at the affine library to transform latitude and longitude to cell coordinates. Given GDAL GeoTransform in the form:

Upper left cell x coordinate
x cell size in projected units
x directional rotation (usually 0)
Upper left cell y coordinate
y directional rotation (usually 0)
Negative y cell size in projected units

So for example the -237481.5, 425.0, 0.0, 237536.4, 0.0, -425.0 would translate as an upper left cell of the raster with coordinates -237481.5, 237536.4, and a 425.0 unit square with no rotation.

Using the affine library you can transform this into a transformation object like so:

from affine import Affine
aff = Affine.from_gdal(-237481.5, 425.0, 0.0, 237536.4, 0.0, -425.0)

Which can transform cell coordinates to projected coordinates:

xs = np.array([0, 1, 2])
ys = np.array([3, 4, 5])

x_coords, y_coords = aff * (xs, ys)

Or (in your case) from coordinates back to cell coordinates:

xs, ys = ~aff * (np.array(lons), np.array(lats))

These values are floats, so you'll need to transform them into integers to get cell coordinates you can use.

xs = np.round(xs).astype(np.int)
ys = np.round(ys).astype(np.int)

You can then use these as indexes to your netCDF4 array (use the latest version of netCDF4 - 1.2.1 as this means you don't need to sort or remove duplicates).

variable = fnc.variables['variable'][xs, ys] # be careful of your index order

This will return a square array due to the slightly different netCDF indexing, but you can get the actual values that you're after as the diagnoal:

values = variable.diagonal()

Best Answer

Related Solutions

[GIS] Find index location of a lat/lon point on a raster grid in ArcPy

[GIS] How to extract value from a raster based on lat and lon

Related Question