[Math] Greedy choice property

algorithms · computer science · dynamic programming

There are two versions of the Knapsack problem: the integer one and the fractional one.

The difference between them is the following: in the integer version we either take an item in full or we don't take it at all; in the fractional version we may take just a part of an item.

The greedy choice property is the following:

At each step we choose the "best" item, namely the one with the greatest benefit and the smallest weight. We repeat this choice until the Knapsack is full or there is no object left that fits in the Knapsack.

Is this correct?

To prove that the fractional version has this property I have found the following:

We claim that for any instance of the fractional knapsack problem, the
optimal solution must include as much of the densest item as will fit.
We can prove this by contradiction. Suppose the optimal solution
included some amount s of a less dense item i and did not include all
that was available of the densest item j, of which there is some
amount t that is not included in the bag. Then we could switch min(s,
t) of the less dense item with the densest item and get an even more
optimal solution, which is a contradiction. By a similar argument, the
optimal solution also cannot have empty space in the bag when there
are items still available to put in the bag. (Just imagine instead of
a less dense item i, you have an imaginary item k that has no value
and is infinitely available.)

Could you explain this proof to me? I haven't understood it…

I found in my notes that the integer(0-1) version of the Knapsack problem can be solved with dynamic programming and the fractional version with a greedy algorithm.

Could you explain to me why this holds?

Best Answer

NB: in my understanding, a problem labelled "integer" means only that its solutions are integers; it says nothing about what limits, if any, are placed on those integers. The $0-1$ knapsack problem is an integer problem, but an integer knapsack problem is not necessarily the $0-1$ one. This is usually clarified by distinguishing between the $0-1$ problem and the unbounded problem.

Your understanding of the difference between an integer and a fractional problem (in the terminology I was educated with: "discrete" and "continuous", respectively) is correct. A discrete problem allows only the choices of taking an object or leaving it, while a continuous problem allows taking part of an object and leaving the remainder.

Your understanding of a greedy algorithm is also broadly accurate, but may need some clarification. The solution involves taking the best thing we can take at this point, until we reach one of the limits imposed by the problem (be it reaching maximum capacity, or running out of objects to take). It does not necessarily involve repetition of any objects, and interestingly, it can give the absolute worst solution when applied to problems with certain constraints.
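To see how badly greedy can go wrong on the discrete problem, here is a small made-up instance (the item values and the helper `greedy_01` are my own illustration, not from the question): taking whole items in order of decreasing density misses the optimum.

```python
# Illustration: greedy-by-density can be suboptimal for the 0-1 (discrete) knapsack.

def greedy_01(items, capacity):
    """Take whole items in order of decreasing density (value/weight) until none fit."""
    total = 0
    remaining = capacity
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= remaining:
            total += value
            remaining -= weight
    return total

# Capacity 10. The densest item (density 5) blocks the true optimum.
items = [(30, 6), (20, 5), (20, 5)]  # (value, weight); densities 5, 4, 4
print(greedy_01(items, 10))  # greedy grabs the (30, 6) item first: total value 30
# The optimal 0-1 solution instead takes both (20, 5) items, for a value of 40.
```

This gap is exactly why the $0-1$ version falls back on dynamic programming, while the continuous version, as proven below, does not.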

Before we begin on the proof, I'm also going to define density as the value of an object divided by the volume of that object, so that we're all on the same understanding.

The proof runs as follows: suppose you have a perfect solution that does not take as much of the highest-density object as possible. Then some amount of a lower-density object takes up space in the solution, while some amount of the highest-density object is left out. If you swap as much as you can of the lower-density object for the higher-density one (the amount being the smaller of: the volume of the lower-density object taken, and the volume of the higher-density object left out), you increase the overall value of that volume: since $\text{density} \times \text{volume} = \text{value}$, increasing the density while keeping the volume fixed increases the value. But then you've got a better-than-perfect solution, which is a contradiction. So the perfect solution must take as much as possible of the highest-density object.
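A made-up numerical instance may make the swap concrete. Suppose the solution contains $2$ units of volume of an item with density $3$, while $5$ units of volume of the densest item, of density $7$, are left out. Swapping $\min(2, 5) = 2$ units of volume raises the value of that space from $2 \times 3 = 6$ to $2 \times 7 = 14$, so the original solution could not have been optimal.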

The second part of the proof simply applies this concept to a particular case of the lower-density object: empty space, which has a value (and hence a density) of zero. If any object is left out while the bag still has empty space, you perform the same swap as above and thereby increase the overall value, which contradicts the solution being optimal. So the optimal solution cannot leave empty space while objects remain available.

Finally, the continuous knapsack problem is solved optimally by the greedy algorithm, by applying the concept proven above iteratively. In terms of real objects in a real knapsack, think of this as the "pockets answer".

Your initial solution must include taking as much of the highest-density object as you can. If that "fills the knapsack" then you're done; if you've taken all the highest-density object and have empty space left over, then you create a new knapsack problem: ignore the highest-density object and the volume taken up by it. That is, ignore the pockets you've already filled, and treat the remaining pockets as a separate knapsack to fill!

This means the new solution must include taking as much of the new highest-density object as possible. Now repeat the process until either you've got no empty space or you've got no objects left out. Each part has taken the largest value it possibly can, so altogether, the entire knapsack has the highest possible value as well. And then you're done.