[GIS] Finding clusters of one set of points around another set of points from different layer

arcgis-desktop clustering geoprocessing

I want to see if there is clustering of a certain type of buildings (x) around another type of buildings (y).

The two point files are in different layers.

I cannot figure out which tool I would use to do this.

Best Answer

None of the out-of-the-box tools in ArcGIS (or any other GIS, AFAIK) will do the job correctly.

In a problem like this, you need to quantify what you mean by "clustering" and then posit a probability model to assess whether the measured degree of clustering could have arisen by chance.

As an example of how to proceed, you might choose to measure clustering in terms of the typical distance between each building of type x and the nearest building of type y. This is an easy calculation: simply represent the two sets of buildings by separate point layers and perform a spatial join of the y's to the x's. The attribute table of the result, which still has one record for each type x building, will now include the distance to the nearest y. You could use the average of these distances as your measure.
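If you prefer to compute the measure outside ArcGIS, a minimal Python sketch follows. It assumes both layers have been exported as lists of (x, y) coordinates in a projected coordinate system, so that straight-line distances make sense. The coordinates and the names `nearest_distances`, `x_buildings` and `y_buildings` are invented for illustration.

```python
import math

def nearest_distances(xs, ys):
    """For each point in xs, return the distance to the nearest point in ys."""
    return [
        min(math.hypot(px - qx, py - qy) for qx, qy in ys)
        for px, py in xs
    ]

# Hypothetical projected coordinates (e.g. metres); not real data.
x_buildings = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
y_buildings = [(0.0, 1.0), (10.0, 3.0)]

x_dists = nearest_distances(x_buildings, y_buildings)

# The clustering measure suggested above: average distance to the nearest y.
observed_mean = sum(x_dists) / len(x_dists)
print(observed_mean)
```

This brute-force search compares every x with every y, which is fine for a few thousand points. For much larger layers the spatial join, or a spatial index, would likely be the better choice.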

Testing whether this could be the result of chance is trickier. One plausible interpretation of this setting is that the earlier presence of y type buildings encouraged the development of x type buildings relatively close to the y's. If that did not happen, we might hypothesize that the x type buildings could have been built anywhere that other buildings also appeared. This leads to the following simple permutation test.

Create a point layer of all possible locations where x type buildings might have appeared. This layer could be the locations of all buildings in the area erected during the same period as the x buildings (including the x buildings themselves, of course). Spatially join the y layer to it to obtain the distance from each of these buildings to the nearest y type building. The rest of the calculation works off the attribute table: the geographic calculations are done.

Now repeatedly use a random number generator to take a simple random sample of all of these buildings, each sample having exactly as many elements as you have x type buildings. Compute the average distance for each sample. Repeat until you have many average-distance statistics. If almost all of these randomly obtained average distances are greater than the average distance you measured for the x type buildings, you can conclude that the closeness of the x's to the y's is unlikely to be due to chance: the effect is real.
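The resampling loop might look like the following Python sketch. It assumes the nearest-y distances for all candidate buildings have already been exported from the attribute table into a plain list. The data are invented, and the function name `permutation_p_value`, the default of 999 iterations, and the "+1" convention for the estimated p-value are choices made for this example, not part of the recipe above.

```python
import random

def permutation_p_value(candidate_dists, n_x, observed_mean, n_iter=999, seed=0):
    """Estimate how often a random sample of n_x candidate buildings has an
    average nearest-y distance at least as small as the observed one."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    hits = 0
    for _ in range(n_iter):
        # Simple random sample without replacement, same size as the x set.
        sample = rng.sample(candidate_dists, n_x)
        if sum(sample) / n_x <= observed_mean:
            hits += 1
    # Assumed convention: count the observed value as one of the draws,
    # so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)

# Hypothetical nearest-y distances for 100 candidate buildings (not real data).
candidate_dists = [float(d) for d in range(1, 101)]

# Pretend the five x buildings happen to be the five closest to a y.
x_dists = candidate_dists[:5]
observed_mean = sum(x_dists) / len(x_dists)

p_value = permutation_p_value(candidate_dists, len(x_dists), observed_mean)
print(p_value)
```

A small value means that few random selections of buildings sit as close to the y's as the x's do, which is the situation described above as evidence of a real effect.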

(Such calculations are best programmed on a platform suited to the purpose, such as R, but almost any computing software can be pressed into service, even Excel. The programming is very simple, requiring little more than knowing how to write loops and select elements from arrays at random.)

This permutation testing approach is superior to pre-programmed solutions because it explicitly accounts for the patterns of building development in this area. If you don't do this, you will often find "significant" evidence of clustering but be unable to conclude anything useful from it, because the clustering may have been caused by other factors such as the patterns of roads, the locations of sites suitable for development, and many other things.