[GIS] maximum circle inside an irregular polygon from a random point

circlepolygon

I am trying to work out an algorithm for creating the maximum radius circle within an irregular polygon (a census tract) based on a given center of the circle.

The motivation for this is to obscure the location of an individual who has answered a survey. Their actual location is known, however it needs to be obscured in the analysis, in order to release the data to the public, for further analysis.

We want to have a donut shaped polygon for each survey respondent that has some inside radius (easy), bounded by an outside radius that is constrained by the census tract that the individual is in. Their final location will be randomly placed within the donut polygon.

I have seen many answers to similar questions here, but not this specific one, which in this case starts with a SPECIFIC location.

Once we have the donut established, we can randomize the location of the individual response within the polygon. That's relatively easy…

Thanks for your ideas, mine so far have seemed pretty brute force, and computationally "expensive" or inefficient…

Best Answer

A simple method to move locations within such annuli exploits a gridded representation of the distance to the tract boundary. Beginning with a polygonal representation of the Census tracts (which is the usual),

  1. Convert that to the polygon boundaries (a polyline layer).

  2. Compute the Euclidean distance grid to the boundaries.

  3. Extract the Euclidean distances at the given locations.

  4. Move each location within the range given by the distance--which, by definition, is the maximum to the boundary.

Each typically requires just a single command with a GIS, making the entire sequence easily automated and easily carried out manually. These are efficient commands, because they do not require constructing a buffer for each point (which typically creates several dozen to almost a thousand points in order to describe a ring, or annulus). No searching or random trials are needed, either: the points are directly displaced by amounts guaranteed to leave them within their original Census tracts.


For example, I moved 172,902 locations within 47 tracts in random directions by displacements uniformly distributed between one-half the distance and the full distance to the boundary. Here is a part of one tract before the move:

Figure 1

(yellow squares mark the locations) and after the move:

Figure 2

(now gray squares mark the new locations). The total operation took just a minute or two (using an old outdated GIS :-).

By comparing these figures closely, you can see that

  • Points that are now close to the boundary (such as near the two lakes shown as white "holes" in these figures) necessarily stay close to the boundary.

  • Points far from the boundary tend to move far.

Consequently, a point close to the boundary likely (but not certainly) originated very close by, whereas any point far from the boundary likely originated from somewhere else far from the boundary. These two tendencies are far from completely random: they could (fairly easily) be exploited by someone who wishes to penetrate the privacy that these movements were intended to afford.

Better methods would make the connections between final and initial location more tenuous and more random. At a minimum, points should be moved within reasonably large neighborhoods rather than within neighborhoods of varying (and possibly arbitrarily small) size. Such movements are not readily carried out with grids, because typically they require some trial and error: you generate a bunch of random points within a neighborhood of each original point and select the first one that lies within the same Census tract. That's a loop involving (1) a random movement and (2) a point-in-polygon inquiry. Both operations are fast, but this requires a bit of programming to implement the loop.

(In a comment to the question, I provide links to some studies of methods used to disguise locational data for privacy purposes.)