Is this a geometric set cover problem

geometryoptimization

Not sure if this is actually a geometric set cover problem, as the definition I can find here is quite beyond my level of maths.
I wonder if anyone can help.

Suppose you are given a set of points on a 2D map, and you need to select some of them, where transmitters will be installed.
Each transmitter 'covers' a circular region of radius $r$ around itself.
Consider $r$ fixed.

If a transmitter were installed in each of the points, the whole desired area ($A_0$) would be covered, but there would be significant overlap between the transmitters, with consequent waste of resources.

So the question is: can one select only some of the points where to install the transmitters, so that the largest possible fraction of $A_0$ is covered, using the minimal number of transmitters?.

The objective function (to max) could be something like:

$Obj = \frac {A_t}{A_0} \cdot revenue_f – N_{t} \cdot cost_{t}$

where:

  • $N_t$ = number of transmitters installed (i.e. of selected points)
  • $cost_t$ = cost of one transmitter
  • $A_t$ = area covered by the installed transmitters
  • $A_0$ = area that would be covered if transmitters were installed in all points
  • $revenue_f$ = revenue that would be generated by covering the whole $A_0$ (so we are assuming that covering a given fraction of $A_0$ will generate the corresponding fraction of the revenue)

Here are some illustrative examples of what I described above.

20 points, installing a transmitter in each, shading their coverage areas in transparent red:

enter image description here

You can clearly see that there is a lot of overlap.

Selecting only the points in blue, shading the union of their coverage area in transparent blue, overlaid with the union of the full coverage area in red:

enter image description here

This does not do too bad a job, using only 5 out of 20 transmitters, but it leaves a substantial area on the top-right uncovered.
And it does not seem to be an 'optimal' solution: everything else being equal, 15 would have been preferable to 7, as the latter is already in a 'crowded' area.

Intuitively one might think: select one starting point, then select the point farthest from it (so their circles overlap as little as possible), and so on, at each step selecting the point that is farthest from all the currently selected ones, until adding the next point increases the cost without increasing the revenue.
In this sense it wouldn't sound very different from a p-dispersion (maxmin) type of problem.

So, I am trying to understand if this is a known problem, and what type of algorithm could be used to solve it.

Any ideas?

Thanks!


To find the area covered by a subset of the circles ($A_t$), one idea could be to sum the areas of the selected circles, and subtract the areas of all their pairwise intersections.
[NOTE: as later commented, this would not work. It looks like the correct formula for the area of the union of $n$ circles is:

$\sum_{k=1}^n {A(C_k)} + \sum_{i=2}^n {(-1)^{i-1} \cdot ( \sum_{s \in S_i} {I(s)} )}$

where:

  • $A(C_k)$ is the area of circle $C_k$
  • $I$ means 'area of the intersection of'
  • $S_i$ is the set of all combinations of $i$ circles one can form from the $n$ selected ones.

So for $n = 2$ the pairwise concept holds:

$\sum_{k=1}^2 {A(C_k)} + \sum_{i=2}^2 {(-1)^{i-1} \cdot ( \sum_{s \in S_i} {I(s)} )} = A(C_1) + A(C_2) – I(C_1,C_2)$

But already for $n = 3$ one would need to calculate and add back in the intersection of all 3 circles (which I don't even know how to find):

$\sum_{k=1}^3 {A(C_k)} + \sum_{i=2}^3 {(-1)^{i-1} \cdot ( \sum_{s \in S_i} {I(s)} )} = A(C_1) + A(C_2) + A(C_3) – I(C_1,C_2) – I(C_1,C_3) – I(C_2,C_3) + I(C_1,C_2,C_3)$

BTW, my notation must be all over the place, apologies for that.
]

From this post plus some additional calculations, I worked out that the area of the intersection of 2 circles of identical radius $r$ whose centres are separated by a distance $d$ is:

$2 \cdot r^2 \cdot \arccos ({\frac d {2 \cdot r}}) – \frac d 2 \cdot \sqrt {4 \cdot r^2 – d^2}$

So there might be a way to solve this as a binary linear programming problem, although one with a lot of variables and constraints (to code the pairwise intersections), so probably unfeasible when one has anything more than a few dozens points to start with.
[NOTE: and as mentioned above, actually much more complicated than I thought].


EDIT: Following up from RobPratt's suggestion

Here is my R code with the implementation of a set cover problem where the aim is only to cover the initial set of centres by the smallest possible number of circles.

# Goal: given a set of N points in 2D, select the minimal possible subset of points such that, by drawing a circle of radius r centred in each, all the N points are 'covered' by the circles

# Approach: find which points are within a radius r of each of the initial N points; this constitutes the set of points 'associated' to each point; then select the minimal number of points such that the *union* of their associated points comprises all the N points (set cover)

# 1. User parameters

# Set the total number of points to choose from
N = 20
# Set the radius of the circles to draw (should not be larger than 1)
r = 0.5

# 2. Preparation of the data

# simulate N 2D coordinates
set.seed(91238)
coords <- matrix(runif(2*N), ncol = 2, nrow = N)

# calculate the pairwise Euclidean distance matrix
dm <- as.matrix(dist(coords))
# turn the matrix into a binary contingency matrix by replacing any distances <= r by 1, the rest by 0
dm <- ifelse(dm <= r, 1, 0)

# make the angles for circle drawing
theta <- seq(0, 2*pi, length.out = 200)

# 3. Plot the initial situation
op = par()
par(mar=c(2,2,2,2))

plot(coords, asp = 1, xlim = c(min(coords[,1])-0.5, max(coords[,1])+0.5), ylim = c(min(coords[,2])-0.5, max(coords[,2])+0.5))

for (i in 1:N) {
  CX <- coords[i, 1]
  CY <- coords[i, 2]
  lines(CX + r*cos(theta), CY + r*sin(theta), col = 2, lty = 2)
  #shade("(x-CX)^2 + (y-CY)^2 < r0^2", "red", 0.9, n=200)
}

points(coords)
text(coords, labels = as.character(1:N), pos = 4, cex = 0.7)
title(main = paste0("All (",N,") points selected"))

# 4. Solve by binary linear programming

require(Rsymphony)

obj <- rep(1,N)
dir <- rep(">=",N)
rhs <- rep(1,N)
lp.out <- Rsymphony_solve_LP(obj, dm, dir, rhs, types = "B", max = FALSE)

sel <- as.numeric(dimnames(dm)[[2]])[as.logical(lp.out$solution)]

# 5. Plot the solution compared with the initial situation

plot(coords, asp = 1, xlim = c(min(coords[,1])-0.5, max(coords[,1])+0.5), ylim = c(min(coords[,2])-0.5, max(coords[,2])+0.5))

for (i in 1:N) {
  CX <- coords[i, 1]
  CY <- coords[i, 2]
  lines(CX + r*cos(theta), CY + r*sin(theta), col = 2, lty = 2)
}

for (i in sel) {
  CX <- coords[i, 1]
  CY <- coords[i, 2]
  lines(CX + r*cos(theta), CY + r*sin(theta), col = "blue", lty = 1)
  #shade("(x-CX)^2 + (y-CY)^2 < r0^2", "blue", 0.9, n=200)
}

text(coords, labels = as.character(1:N), pos = 4, cex = 0.7)
points(coords[sel,], pch = 16, col = "blue")
title(main = paste0("Points ",paste(sel,collapse=",")," selected"))

It works, as far as the goal of covering the initial points is concerned, and in this sense I find it a very interesting result, possibly of use in some contexts.
[BTW the code can surely be improved, too; I really went for the easiest path, no attempt to make any elegant or fast implementation].

However, clearly this does not cover anything close to the whole area that would be covered by all the circles, by far:

enter image description here

If I understood RobPratt's suggestion and content of the post he linked, I would then be supposed to add several more 'auxiliary' points, to 'force' the LP to include more centres, and by doing that, cover the whole area.

This is indeed possible, except that it would indeed be a 'cover the whole area' problem, not 'cover as much area as possible' as in my original formulation; as there would be no measurement of the proportion of points covered, but a set of constraint forcing the coverage of the full area.

It is still quite interesting to attempt that nevertheless.

I am just wondering if there is a way to select only a few 'crucial' additional points, e.g. those, in each circle, that are most distant from the centroid of the set of points, to avoid enormously increasing the number of constraints.
Also, if I got this right, since I cannot select any of the $M$ 'auxiliary' points, the constraint matrix would have to be an $(M+N) x N$ matrix(?).

If anyone is (still) following this post and has suggestions or thinks I am barking up the wrong tree, please let me know.


EDIT 2: implementation of the binary LP set cover with auxiliary points

It seems to work.
The auxiliary points are in red in the plot below. I found them by intersecting each circle with the line by its centre and the centroid, and keeping the intersection farther away from the centroid (in fact now I wonder if I should also keep the one closest to the centroid; or indeed as RobPratt commented, sampling the circumferences; but OK, for further development).
By including these points, the whole area is covered (at least in this example), and this highlights something I find very interesting: some very close centres are selected, because without them some areas would be left uncovered – but that also means that the p-dispersion approach would probably not be effective, as it would first and foremost try and keep the selected centres far apart(!).

enter image description here

Overall I like this result, so thanks again RobPratt for your input, you helped me think this through and reach what seems to me a rather satisfactory solution!

Best Answer

This is the cell tower placement problem, and has been pretty well studied by some folks with lots of money to spend on top algorithm designers. Your "intuitively one might think" paragraph is essentially a "greedy algorithm"; various kinds of greedy algorithms can often do OK, but they're seldom optimal.

Anyhow, I'll bet a search for "optimal cell tower placement" along with SODA or FOCS or STOC (the shorthand names for major algorithms/theory of computer science conferences) might yield you several interesting results.

Related Question