[Math] Greedy Algorithm Proof

algorithmscomputer scienceoptimization

My problem seems similar to the Interval Scheduling problem (processing as many jobs as possible), which I understand but can't seem to apply properly in this case. I've tried to simplify the problem as much as I can:

There are n jobs, each of which must be processed first on computer A and then on computer B. Each job has two specific processing times, one for each computer. Computer A can process one job at a time while computer B can process an unlimited number of jobs at the same time. Using a greedy algorithm, how can the jobs be ordered such that the total processing time of all the jobs is minimized and how can that algorithm be proven to be the optimal solution?

For example, say that there are 3 jobs with the following process times for each computer:

i = 1, A(1) = 05, B(1) = 20

i = 2, A(2) = 10, B(2) = 10

i = 3, A(3) = 20, B(3) = 05

All the jobs will be out of computer A after 35 seconds regardless of the order in which they are started, but that order will affect the overall finish time. For example, running the jobs in the order given takes 40 seconds to finish, while running the jobs in the reverse order takes 55 seconds. So in this example, the optimal solution is to run the jobs in the given order.

I'm fairly certain that the "algorithm" is to simply sort the jobs by their B processing time from longest to shortest, but I'm having a hard time proving it. I'm getting caught up with my greedy algorithm taking the "longest" job first, which doesn't seem very greedy. And then to use an inductive proof I would need begin by showing that my algorithm works for just the first job, which I would think is not obvious until the end. Maybe I need to start the proof with the final job and work backwards? Any help would be greatly appreciated.

Best Answer

We have a set of jobs $J_i = (a_i, b_i)$ where all the $a_i,b_i \ge 0$. We want to find the permutation $\pi$ which minimises $$\max_j \left\{ b_{\pi(j)} + \sum_{k \le j}a_{\pi(k)} \right\}$$

Suppose we have proven that for all job sets $X$ of size no greater than $n$ the permutation which sorts the jobs by descending $b_i$ breaking ties by descending $a_i$ is an optimal solution (bearing in mind that there may be more than one optimal solution). (Call this ordering doubly-reverse-lexicographical because it sorts in descending order starting from the right).

Consider a job set $Y$ of size $n+1$ whose "worst" job is $J_{n+1}$, and suppose that there is no optimal solution in which this job comes first. Of the optimal solutions select one which puts $J_{n+1}$ as early as possible, in position $j^* = \pi^{-1}(n+1) > 0$. Now, which $j$ maximises the expression above (i.e. which job finishes last)?

Given that $b_{\pi(j^*)}$ is maximal by construction and the sum is non-decreasing the maximising $j$ must be $\ge j^*$. Also, if the maximising $j > j^*$ we can move $J_{n+1}$ to the start without affecting the optimality, so by contradiction that's impossible. We're forced to conclude that $J_{n+1}$ is the job which finishes last.

Now consider what happens if we swap it with the job before it. Call that job $J_p$ for previous. $J_p$'s finishing time increases from $b_p + \sum_{k < j^*}a_{\pi(k)}$ to $b_p + \sum_{k \le j^*}a_{\pi(k)} \le b_{n+1} + \sum_{k \le j^*}a_{\pi(k)}$ and $J_{n+1}$'s finishing time decreases from $b_{n+1} + \sum_{k \le j^*}a_{\pi(k)}$ to $b_{n+1} + \sum_{k \le j^* \wedge k \ne j^*-1}a_{\pi(k)}$ so we have a contradiction with this being an optimal solution which puts $J_{n+1}$ as early as possible.

We've now proven that there is an optimal solution for $Y$ in which $J_{n+1}$ comes first. Either some of these optimal solutions allow for the rest of the jobs to be ordered non-optimally because the worst job is so bad that it doesn't matter – in which case putting them in an optimal ordering will give an optimal ordering – or the rest of the jobs must be optimally ordered to get an optimal ordering. In either case we can apply our strong inductive presupposition to say that there is an optimal ordering which is doubly-reverse-lexicographical.

The base case of only one job is trivial.