[GIS] How to disperse points within polygons using ArcGIS

arcgis-10.0 point

I have a huge number of points, all of which fall within polygons. I want to disperse the points randomly within their polygons. I tried the disperse point extension in ArcView, but it was not useful for me. The Disperse Markers tool in the ArcGIS toolbox was not useful either: it cannot create permanently dispersed points, only a dispersed view of them. Are there any tools or scripts to disperse points within polygons?


Best Answer

There are two potentially helpful, though old, ArcScripts:

Point Randomiser v1.3

Constrain Shift by Polygon. Use a polygon theme to constrain the random movement of points. This option causes points to be randomly shifted ONLY within the confines of the polygon within which they originally fall. Users can elect to use ALL or SELECTED polygons for restraining.

Point Dispersion Wizard

The Point Dispersion Wizard disperses coincident points radially, linearly, randomly within a specified radius and randomly within a containing polygon. In addition, each dispersion pattern comes with a set of user specified options. Dispersed points can be displayed as graphics in the existing theme, clipped to a new theme, or dispersed in a new theme. It was written for ArcView 3.1 using the Dialog Designer. Be sure to study the included READ_ME.txt file as it explains the usage of the Point Dispersion Wizard, especially as it pertains to the randomly within a containing polygon dispersion pattern which behaves somewhat differently than the other dispersion patterns. The random dispersion pattern options use a prime modulus multiplicative linear congruential generator to generate random variates from the uniform distribution on the interval [0,1] based on Marse and Roberts (1983).
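
If neither script runs on your system, the "randomly within a containing polygon" idea can be sketched in plain Python. This is only an illustration, not the code of either ArcScript: it assumes each polygon is a single ring of (x, y) vertices without holes, and the names point_in_polygon, random_point_in_polygon and disperse are made up for the example. Each point is replaced by a random location drawn from the bounding box of its polygon, retrying until the draw falls inside the polygon.

```python
import random

def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # Toggle for every edge crossed by a horizontal ray running right from (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_point_in_polygon(ring, rng, max_tries=10000):
    """Rejection sampling: draw from the bounding box until a draw falls inside."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    for _ in range(max_tries):
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, ring):
            return (x, y)
    raise RuntimeError("no interior point found; is the polygon degenerate?")

def disperse(points, polygons, seed=None):
    """Move each point to a random location inside the polygon that contains it."""
    rng = random.Random(seed)
    moved = []
    for x, y in points:
        for ring in polygons:
            if point_in_polygon(x, y, ring):
                moved.append(random_point_in_polygon(ring, rng))
                break
        else:
            moved.append((x, y))  # not inside any polygon: leave unchanged
    return moved

# Four coincident points inside a square, plus one point outside it.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
stacked = [(5, 5)] * 4 + [(20, 20)]
moved = disperse(stacked, [square], seed=1)
```

The result is a new list of coordinates, so the dispersed positions are permanent once written back to a point layer. Reading and writing the layers themselves is left out of the sketch.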

Related Solutions

[GIS] way to “disperse markers” across multiple point layers

I don't see any direct way, but it'd be easy enough to perform a Merge on the input data, mapping a field to retain information about the origin layer. In some cases, you may be able to use this merged layer directly and symbolize the different features within it. Perform disperse markers on the merged set, and use it as-is or split it on the unique values of the mapped field.

If you only care about making them non-overlapping, and not about their specific locations, you could displace each layer's positions by some systematic amount below the actual mapped resolution of your data, so that they are naturally clustered when mapped. For example, if you had layers A and B, you could make the two least significant figures of A independent of those of B:

layer    lat     lon    rule perturbed_lat perturbed_lon
    A   33.0  -120.0  +0.033        33.033      -119.967
    B   33.0  -120.0  +0.011        33.011      -119.989

Now if you perform disperse on one of these layers, the data will already be clustered into distinct areas of the map. You'd have to play with the offset values and figure out an appropriate offset. But this approach would give you a systematic way of preventing collisions between layers.
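
As a rough sketch of the offset idea in Python: the OFFSETS dictionary and the perturb function are illustrative names, the offset values are the ones from the "rule" column above, and the same offset is added to both coordinates of every point in a layer.

```python
# Hypothetical per-layer offsets in degrees, taken from the "rule" column above.
OFFSETS = {"A": 0.033, "B": 0.011}

def perturb(layer, lat, lon, offsets=OFFSETS, ndigits=3):
    """Shift a coordinate pair by the fixed offset assigned to its layer."""
    d = offsets[layer]
    return (round(lat + d, ndigits), round(lon + d, ndigits))

# The same location in two layers ends up at two distinct positions.
perturbed = {layer: perturb(layer, 33.0, -120.0) for layer in OFFSETS}
```

Because the offset depends only on the layer, subtracting it again recovers the original coordinates if they are needed later.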

[GIS] Identification of consecutive points within a specified buffer

Given a list of point locations (preferably in projected coordinates, so that distances are easy to compute), this problem can be solved with five simpler operations:

1. Compute point-point distances.
2. For each point i, i = 1, 2, ..., identify the indexes of those points at distances less than the buffer radius (such as 1500).
3. Restrict those indexes to be i or greater.
4. Retain only the first consecutive group of indexes having no break.
5. Output the count of that group.

In R, each of these corresponds to one operation. To apply this sequence to each point, it's convenient to encapsulate most of the work within a function we define, thus:

#
# forward(j, xy, r) counts how many contiguous rows in array xy, starting at index j,
#                   are within (Euclidean) distance r of the jth row of xy.
#
forward <- function(j, xy, r) {
  # Steps 1 and 2: compute an array of indexes of points within distance r of point j.
  i <- which(apply(xy, 1, function(x){sum((x-xy[j,])^2) <= r^2}))
  # Step 3: select only the indexes at or after j.
  i <- i[i >= j]
  # Steps 4 and 5: retain only the first consecutive group and count it.
  # The k-th retained index must equal j + k - 1; after the first break none can.
  sum(i == seq_along(i) + j - 1)
}

(See below for a more efficient version of this function.)

I have made this function flexible enough to accept various point lists (xy) and buffer distances (r) as parameters.

Normally, you would read a file of point locations (and, if necessary, sort them by time). Here, to show this in action, we will just generate some sample data randomly:

# Create sample data
n<-16                                     # Number of points
set.seed(17)                              # For reproducibility
xy <- matrix(rnorm(2*n) + 1:n, n, 2) * 300
#
# Display the track.
plot(xy, xlab="x", ylab="y")
lines(xy, col="Gray")

The systematic step between successive points is 300*sqrt(2), or about 424; the random noise makes the typical spacing somewhat larger. We do the calculation by applying this function to the points in the array xy (and then tacking its results back on to xy, because this would be a convenient format for export to a GIS):

radius <- 1500
z <- sapply(1:n, function(u){forward(u,xy,radius)})
result <- cbind(xy, z)                              # List of points, counts

You would then further analyze the result array, either in R or by writing it to a file and importing it into other software. Here is the result for the sample data:

                        z
  [1,]   -4.502615  551.5413 4
  [2,]  576.108979  647.8110 3
  [3,]  830.103893 1087.7863 4
  [4,]  954.819620 1390.0754 3
...
 [15,] 4977.361529 4146.7291 2
 [16,] 4783.446283 4511.9500 1

(Remember that the counts include the points at which they are based, so that each count must be 1 or greater.)

If you have many thousands of points, this method is too inefficient: it computes far too many point-to-point distances that are unnecessary. But because we have encapsulated the work within the forward function, the inefficiency is straightforward to fix. Here is a version that will work better when more than a few hundred points are involved:

forward <- function(j, xy, r) {
   n <- dim(xy)[1]     # Limit the search to the number of points in xy
   r2 <- r^2           # Pre-compute the squared distance threshold
   z <- xy[j,]         # Pre-fetch the base point coordinates
   i <- j+1            # Initialize an index into xy (just past point j)

   # Advance i while point i remains within distance r of point j.
   while(i <= n && sum((xy[i,]-z)^2) <= r2) i <- i+1

   # Return the count (including point j).
   i-j
}
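
For readers scripting in Python (the scripting language of ArcGIS 10) rather than R, the same loop might be ported as follows. This is a sketch, not part of the original answer: it assumes xy is a list of (x, y) tuples in projected coordinates, already sorted in track order, and it uses 0-based indexes where the R code uses 1-based ones. The sample track is made up for the example.

```python
def forward(j, xy, r):
    """Count consecutive points, starting at index j, within distance r of point j."""
    n = len(xy)
    r2 = r * r          # Pre-compute the squared distance threshold
    xj, yj = xy[j]      # Pre-fetch the base point coordinates
    i = j + 1           # Index of the next point to examine
    # Advance i while point i remains within distance r of point j.
    while i < n and (xy[i][0] - xj) ** 2 + (xy[i][1] - yj) ** 2 <= r2:
        i += 1
    return i - j        # The count includes point j itself

# A small made-up track along the x axis.
track = [(0, 0), (500, 0), (1000, 0), (1600, 0), (1200, 0)]
counts = [forward(j, track, 1500) for j in range(len(track))]
# counts == [3, 4, 3, 2, 1]
```

The first count is 3 rather than 4 because the fourth point lies beyond the radius; the fifth point is within range again, but it comes after the break and so is not counted.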

To test this, I created random points as previously but varied two parameters: n (the number of points) and their standard deviation (hard-coded as 300 above). The standard deviation determines the average number of points within each buffer ("average" in the table below): the more there are, the longer this algorithm takes to run. (With more sophisticated algorithms the run time won't depend as much on how many points are in each buffer.) Here are some timings:

Time (sec)     n      SD   Average   Distances checked per minute (millions)
   1.30      10^3      3   291       13.4
   1.72      10^4     30    35.7     12.5
   2.50      10^5    300     3.79     9.1
  16.4       10^6   3000     1.04     3.8
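
The last column appears to be n times the average count, divided by the run time and scaled to one minute. That reading is inferred from the numbers rather than stated in the answer; under that assumption, this short check reproduces the column from the other three (the function name is illustrative):

```python
# (seconds, n, average points per buffer), copied from the table above.
rows = [
    (1.30, 10**3, 291),
    (1.72, 10**4, 35.7),
    (2.50, 10**5, 3.79),
    (16.4, 10**6, 1.04),
]

def millions_per_minute(seconds, n, average):
    """Distance checks per minute, in millions, assuming n * average checks in total."""
    return n * average / seconds * 60 / 1e6

rates = [round(millions_per_minute(*row), 1) for row in rows]
# rates == [13.4, 12.5, 9.1, 3.8]
```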
