[Math] Proving optimality of simple greedy algorithm

algorithmscomputer science

Professor Xavier (yeah, the one from X-Men) wants to drive from Reno to Newark.

His gas tank, when full, holds enough gas to drive $n$ kilometers. The professor has a map showing the gas stations along his road and the distances between them.

The professor starts his journey with a full tank, and he wants to get to Newark with doing as little stops for gas refill as possible.

1) Present a greedy algorithm through which the professor can determine at which gas stations he should stop.

2) Prove that the algorithm you devised yields the optimal solution.

What I did:

The algorithm is pretty straightforward and it should be apparent immediately why it yields the best result.

The algorithm is:

Step 1: Drive to the furthest gas station you can without running out of gas.

Step 2: Refill. Repeat step 1.

Until he reaches Newark.

But how do I prove that this is optimal?

Best Answer

Assume the algorithm is not correct.

Let $g_1,...,g_n$ be the gas stations we get when using the algorithm. Since the algorithm is not correct we can find $g_1^*,..,g_k^*$ solving the problem with $k<n$, $k$ minimal. Now choose a solution $g_1^*,..,g_k^*$, such that $g_1=g_1^*,..., g_{l-1}=g_{l-1}^*$ and $g_l \neq g_l^* $ with l being maximal. I.e. there exists no optimal solution (using k gas stations) so that $g_1=g_1^*,..., g_{l}=g_{l}^*$ holds. Or in other words $g_l^*$ is the first gas station that is different from the gas station chosen by the algorithm and there exits no optimal solution such that the first $l$ gas stations are the same. Let $d(g)$ be the distance between the gas station $g$ and our start. Because of our algorithm $d(g_l^*)-d(g_{l-1}^*)$ must be smaller then $d(g_l)-d(g_{l-1})$ This is because our algorithm chooses the gas station with the highest distance between the new gas sation and the current gas station, that can be driven. (it cannot be equal since this would be a contradiction to the choice of l ).

But now $g_1^*,...g_{l-1}^*, \color{red}{g_{l}^* := g_l},g_{l+1}^*,...,g_k^*$ is an optimal solution such that $g_1=g_1^*,..., g_{l}=g_{l}^*$ holds. Which is a contradiction to the choice of l being maximal.

Going further it's now clear that there is an optimal solution such that $g_1=g_1^*,..., g_{k}=g_{k}^*$ holds for some optimal solution $(*)$. But this is a contradiction to the algorithm again because our optimal solutions $g_1^*,...,g_{k}^*$ tells us that we can reach the ending point without any extra gas station. If this is the case the algorithm would have chosen this end point instead of another gas station $g_{k+1}$

Thus the algorithm is correct.

(*) As suggested this result can also be achieved by induction over $l$. For $l=1$ its clear and for the step from $l=n$ to $n+1$ you can construct an optimal solution with the same arguements as above)

Related Question