[GIS] Any efficient way to create a raster with pixel values representing a unique ID calculated from pixel lat/lon

c#, gdal, python, raster

I have a 10 m raster for each state in the US. The current pixel value represents something. I want to have the 'same' raster, but with the pixel value representing a unique ID calculated from the pixel's lat and lon. I am wondering whether there is an efficient way to do this; an existing ArcGIS function or Python/C# programming are both fine.

I cannot do anything with the Raster Calculator in ArcGIS, since the pixel values of the current raster have nothing to do with a unique ID or with lat/lon.

So far, I have tried creating lat/lon pairs and calculating the ID programmatically. To keep a connection with the raster, I also read out the current pixel value. So I have one record per pixel, formatted as lon, lat, pixelVal, uniqueId, and I write it to a CSV file. For a 10 m raster, however, the number of pixels is very large. I am wondering whether I can do this at the raster level instead. Thanks in advance!

Best Answer

There are many ways to assign unique identifiers to raster cells; some are listed below. Note that although unique identifiers can be generated efficiently, any algorithm that requires unique identifiers for the cells of a large raster (with many millions of cells or more) is itself going to be inefficient; in many cases it could likely be replaced by an algorithm that does not require every cell to have an identifier. Note, too, that there is a natural limit to the size of such a grid, because even 32-bit grids can support at most 2^32-1 (about four billion) unique values. Methods to overcome that limit (without resorting to 64-bit grids) are given at the end.
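The size limit can be checked with quick arithmetic. The following is only a sketch: the function names and the 1000 km square extent are illustrative assumptions, while the 12796 x 8092 grid is the example used later in this answer.

```python
# Illustrative sketch: does a grid have few enough cells for 32-bit IDs?
UINT32_DISTINCT = 2**32 - 1  # the limit quoted above (about four billion)

def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover an extent given in whole metres."""
    cols = -(-width_m // cell_size_m)   # ceiling division
    rows = -(-height_m // cell_size_m)
    return rows * cols

def fits_in_32_bits(n_cells):
    """True if every cell can receive a distinct 32-bit value."""
    return n_cells <= UINT32_DISTINCT

example_grid = 12796 * 8092                            # grid from the answer's example
hypothetical = cell_count(1_000_000, 1_000_000, 10)    # assumed 1000 km square at 10 m

print(example_grid, fits_in_32_bits(example_grid))     # 103545232 True
print(hypothetical, fits_in_32_bits(hypothetical))     # 10000000000 False
```

Under these assumptions, a 10 m raster covering a 1000 km square already needs more identifiers than a 32-bit grid can hold.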

  1. Using a grid of column identifiers I (with values running from 0 up to one less than the number of columns, m) and a similar grid of row identifiers J (with values from 0 through n-1), compute

    n * "I" + "J"
    

    A convenient alternative is to replace n with the smallest power of ten equal to or exceeding n. The row and column coordinates can then be read directly off the cell identifiers. For instance, with n = 12796 and m = 8092, use the formula 100000 * I + J. (Because the largest possible value, 809112795, does not overflow a signed 32-bit integer, this will work.) If a cell has the identifier 49106008, for example, it must have (column, row) coordinates (491, 6008), found by splitting the identifier into its last five digits (the row, 06008) and the remaining prefix (the column, 491).

    Column and row identifier grids can be computed (in a single calculation each) using the methods described at "How do I calculate a weighted centroid with respect to a raster, e.g. population weighted centroid?"

  2. For smallish grids, simply use Combine("I", "J"). This operation will create internal unique pixel identifiers.

  3. For very small grids (with fewer than 2^16 cells or so), generate a grid of uniformly distributed random integers between 0 and the largest value possible (usually 2^31-1). Query the attribute table for counts larger than 1. If any exist, just generate a new random grid, repeating until all values are unique.

  4. Don't even bother. If you ever need to identify a pixel, extract the values from the two grids I and J. Now there is no practical limit to the size of the grid--it need only have fewer than four billion rows and four billion columns.

  5. The roles played by I and J can be taken by grids of X and Y coordinates, respectively, but (a) the calculation in (1) is more delicate because it will be difficult to avoid potential collisions of IDs and (b) when the cellsize is very small (such as a few meters or less) single-precision floats usually won't distinguish all neighboring cells, so double precision will be necessary.

  6. X and Y coordinate grids can themselves be replaced by other grids. Good ones are (signed) distances from appropriate curves: often two such distances suffice to uniquely identify every point close to the intersection of those curves. This provides a way to use non-Cartesian coordinate systems (such as polar or hyperbolic coordinates).
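The arithmetic behind method (1), and the reverse lookup it allows, can be sketched in plain Python. This is a minimal illustration rather than an ArcGIS workflow: the function names are hypothetical, and the indices are ordinary integers standing in for cells of the I (column) and J (row) grids. The same expressions also work element-wise on NumPy integer arrays, in which case a 64-bit dtype is the safer assumption.

```python
def power_of_ten_at_least(n):
    """Smallest power of ten that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 10
    return p

def encode(col, row, multiplier):
    """Cell ID from zero-based column and row indices.

    Unique as long as 0 <= row < multiplier.
    """
    return multiplier * col + row

def decode(cell_id, multiplier):
    """Inverse of encode: returns (col, row)."""
    return divmod(cell_id, multiplier)

n_rows, n_cols = 12796, 8092          # n and m from the example in (1)
INT32_MAX = 2**31 - 1

compact = n_rows                              # multiplier for n * I + J
readable = power_of_ten_at_least(n_rows)      # multiplier for 100000 * I + J

largest_compact = encode(n_cols - 1, n_rows - 1, compact)
largest_readable = encode(n_cols - 1, n_rows - 1, readable)

print(readable)                               # 100000
print(largest_compact, largest_readable)      # 103545231 809112795
print(largest_readable <= INT32_MAX)          # True
print(encode(491, 6008, readable))            # 49106008
print(decode(49106008, readable))             # (491, 6008)
```

The compact multiplier packs the identifiers into the range 0 through n*m - 1, while the power-of-ten multiplier wastes some of the range in exchange for identifiers whose digits can be split by eye.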
