Proof by contradiction for activity selection problem

computer scienceproof-writingsolution-verification

I have the following example of an activity selection problem:

Given a set of tasks, each with an associated start time and finish time, select the largest subset of the tasks that can be performed without any incompatibilities.
– Two tasks are incompatible if they overlap in time.
e.g., for $\{ (6,9), (1,10), (2,4), (1,7), (5,6), (8,11), (9,11) \}$, the following schedules are all valid: $\{ (1,10) \}, \{ (1,7), (8,11) \}, \{ (2,4), (5,6), (9,11) \}$

In this case, it turns out there is a provably-optimal greedy rule: always choose the task that finishes earliest.
e.g., given $\{ [6,9), [1,10), [2,4), [1,7), [5,6), [8,11), [9,11) \}$:
– Choose $[2,4)$, leaving $\{ [6,9),[5,6),[8,11),[9,11) \}$
– Choose $[5,6)$, leaving $\{ [6,9),[8,11),[9,11) \}$
– Choose $[6,9)$, leaving $\{ [9,11) \}$
– Choose $[9,11)$

The formal proof of optimality is by contradiction:
– Suppose there exists a solution $\{ t_1, t_2, \dots, t_k \}$ that does not include $[2,4)$.
– Assume that $t_1, \dots, t_k$ are ordered by finish time.
– Clearly, $t_1$ does not intersect with any of $t_2, \dots, t_k$.
– Clearly, $[2,4)$ finishes no later than $t_1$.
– Therefore, $[2,4)$ does not intersect with any of $t_2, \dots, t_k$.

I don't see how this is a logically coherent, valid proof by contradiction. It begins by saying to suppose there exists a solution $\{ t_1, t_2, \dots, t_k \}$ that does not include $[2,4)$. So far this is fine. But it then says that $[2,4)$ finishes no later than $t_1$, and so $[2,4)$ does not intersect with any of $t_2, \dots, t_k$. But there is nothing in how this proof was structured to imply that $[2,4)$ finishes no later than $t_1$ – it was just claimed, out of nowhere, in the middle of the proof. My understanding is that a proof by contradiction would require that the proof be logically set up so that it is implied, regardless of how unapparent it is, that $[2,4)$ finishes no later than $t_1$ – and then you set about going through logical steps to make it apparent. Am I misunderstanding this?

Best Answer

I think it was just poorly written. It should say that from the set of optimal solution(s) at least one contains the task with the earliest finish time. Then proceed to prove by contradiction:

Say that no optimal solution contain the task with the earliest finish time. Choose any optimal solution $\{t_{1},…,t_{n}\}$ ordered by their finish time. If you replace $t_{1}$ with the task with earliest finish time, you still end up with an optimal solution, contradiction!

So the text does not say that optimal solution must contain the task with earliest finish time. It only say that there must be an optimal solution containing the task with earliest finish time. If our only objective is to determine the maximum number of task we can finish, always choosing the task with earliest finish time is a smart move.

As an example, if an additional task $[3,5)$ is added to the example in the text, both of the followings are optimal solutions

$$ \{[2,4),[5,6),[6,9),[9,11)\}\\ \{[3,5),[5,6),[6,9),[9,11)\} $$

Related Question