Optimization – Proving and Generalizing the Camels and Bananas Problem Solution

optimizationpuzzle

I found the video "YAHOO Interview Puzzle || Camel and Bananas || Logic + Optimization" on the LOGICALLY YOURS YouTube channel.

Here's the essence of it:

  • you must transport bananas from an initial location to a market 1000 km away
  • you can only use a camel for transporting them
  • the camel requires (= consumes) 1 banana for each km it travels, and can carry at most 1000 bananas at any one time
  • if you have 3000 bananas to start with, at the initial location, what is the max number of bananas you can eventually get to the market?

In the video, they address it using a lot of heuristics / not much explanation, which is the part that I am trying to figure out / reduce to a more systematic or formal approach.
Here is their approach:

You need to transport the bananas using 2 intermediate stops between the initial location and the market (proof?). Starting a forward journey with fewer than 1000 bananas is wasteful (proof?). Therefore, 5 trips are necessary between the initial storage and the first stop (proof?), and 2000 bananas must be at the first stop when the initial storage is empty, which implies the first stop must be at 200 km (this they actually do prove). Therefore, 3 trips are necessary between the first stop and the second stop (proof?), and 1000 bananas must be at the second stop when the first stop is empty, which implies the second stop must be at 533 km, and the final answer is $1000 – (1000 – 533) = 533$ bananas at the market (this is actually wrong under the assumption that no 'fractional' bananas can be consumed, as it would leave 1 banana at the second stop; the correct answer is 534 km, leaving 532 bananas at the market).

I've been trying to frame this problem in several ways, e.g. as a linear program (see below), but I keep circling back to having to use the assumptions or heuristics they used in the video.
Is it possible that one cannot solve this without making all these 'guesses'? Is there a more 'formal' way of tackling this? And especially is this problem a 'known' one in mathematics, with an 'official' solution or approach that guarantees its optimality?


PS : describing here my (failed) attempts to solve this without guessing.

One interesting aspect of this problem, I find, is that without any intermediate stops one would have to travel 5000 km to empty the initial storage (and it couldn't even be done, as one would be stuck without fuel after the first forward trip of 1000 km with 1000 bananas).
The use of intermediate stops not only allows to carry at least some bananas to the destination, but also drastically reduces the total distance travelled.
This made me realize that in fact this is also a minimization of the total distance travelled, because the number of bananas left at the end is 3000 minus that.

So my first naive attempt (under the assumption that 5 trips were required to carry all the bananas to the first stop) was to look for a value of $x$ (distance of the first stop from the start) that maximized the product of the forward distance reached by the number of bananas left:

$x \cdot (3000 – 5 \cdot x)$

This turns out to be $x = 300$, but leaves only 1500 bananas at the first stop.
Repeating the same approach for $y$ (distance of the second stop from the first stop), i.e. maximizing $y \cdot (1500 – 3 \cdot y)$, results in $y = 250$ and 750 bananas left.
450 km still need to be covered in the last forward trip, leaving 300 bananas overall.
Much fewer than the 532 they got by their heuristic method.

enter image description here

enter image description here

My second idea was to find $x$ such that the number of bananas left at the first stop would be maximal, when all the initial stock had been moved.
Obviously this didn't work, as $x = 1$ would be the best solution, leaving 2995 bananas, but then regardless of the value of $y$, the final number of bananas would be much lower than 532.

This made me think that the values of $x$ and $y$ cannot be determined independently (i.e. one cannot solve this problem by finding a 'best' value for $x$, ignoring all the rest, and then doing the same for $y$).
But I am not sure of this conclusion, and I cannot find an expression to link $x$ and $y$.

Unless I am mistaken, the number of trips one needs to make to remove a total of $N_0$ bananas, by taking $n$ bananas at a time, is:

$2 \cdot \lceil \frac {N_0} n \rceil – 1$

As bananas are consumed at each trip, it would seem advantageous to keep $n$ as large as possible.
This seems to prove the video's assumption that the maximal possible number of bananas should be taken on any forward trip, i.e. $n = 1000$, and to imply that $5$ trips (3 forward and 2 backward) must be used to empty the initial storage.
At that point, however, $x$ is not known yet, and it is not clear to me what implies that one should choose $y$ so that only 3 trips are left to make.

If I assume that up to 3 trips must be made (2 forward, 1 backward), transporting $n_1, n_2$ bananas, respectively, in the forward trips from $x$ to $x+y$, and that a single final forward trip needs to be made from $y$ to the market, then for a given choice of $y$:

enter image description here

So this could potentially be seen as a linear program in $x, y, n_1, n_2$:

$argmax \ n1 + n2 – 3 \cdot y – (1000 – x – y)$

with several constraints (I am probably putting some redundant ones in) :

$x \ge 1$

$y \ge 1$

$n_1 \le 1000$

$n_2 \le 1000$

$3000 – 5 \cdot x – n_1 – n_2 = 0$

$n_1 – 2 \cdot y \ge 1$

$n_1 + n_2 – 3 \cdot y \ge 1$

$n_1 + n_2 – 3 \cdot y \le 1000$

This works (solved using R's lpSolve).
It yields $x = 200, y = 334, n_1 = n_2 = 1000$.

But is it the correct approach?
And how do I know that it is best to have only stop $x+y$ between $x$ and the market, rather than 2 or more stops?

If I had a different initial number of bananas, and/or a different maximal carrying capacity, and/or a different total distance, I would be left guessing, as I have developed no 'method' to tackle this.

Any ideas/suggestions?


EDIT based on Paul Sinclair's suggestions

Let:
$m \in \mathbb Z^+$ = maximal # bananas that can be carried at any one time by the camel
$x_0 = 0$ (position of the initial storage)
$x_i \in \mathbb Z^+, \forall i \ge 1$ = distance between stop $i-1$ and stop $i$
$b_0 \in \mathbb Z^+$ = # bananas at the initial storage at the start
$b_i, \forall i \ge 1$ = # bananas left at stop $i$ when stop $i-1$ is empty
$D\in \mathbb Z^+$ = distance of the market from the initial storage
$\sum_{i=1}^k = D$ (meaning that the market is exactly at stop $k$)

Assuming that the initial storage and all intermediate stops before the final destination (market) must be empty at the end, what are the required number of stops $k$, and the corresponding $k$ values of $x_i$, which result in the maximal $x_k$ ?

As per Paul's suggestion (except for the indices):

$b_i = b_{i-1} – (2 \cdot \lceil \frac {b_{i-1}} m \rceil – 1) \cdot x_i$

[editing halted – apparently the hypotheses I am using are incorrect / do not accurately represent the linked Youtube problem]

Best Answer

To show that the given solution is optimal, let me add a tracking mechanism: I will suppose that as the camel goes from kilometer marker $n$ to $n+1$, a banana peel is left behind at $n+0.2$; and for the reverse trip from $n+1$ to $n$, a banana peel is left behind at $n+0.8$. Now, suppose at the end of any process, there is exactly one banana peel at $0.2$. Then the camel can only have taken at most 1000 bananas from 0 to 1, so there are at least 2000 bananas (or their peels) left at kilometer marker 0 or less. Therefore, at most 999 bananas (or their peels) end up at kilometer marker 1 or greater. Similarly, if there are two banana peels at $0.2$, then there is at least one banana peel at $0.8$, and there are at least 1000 bananas left at kilometer marker 0 or less. Therefore, in this case, at most 1997 bananas end up at kilometer marker 1 or greater. Finally, if there are three or more banana peels at $0.2$, then there are at least two banana peels at $0.8$, so at most 2995 bananas end up at kilometer marker 1 or greater in this case. (And if there are no banana peels at $0.2$, then it is impossible for any bananas to end up at kilometer marker 1 or greater.) Combining these cases, we see that in any case, at most 2995 bananas end up at kilometer marker 1 or greater.

Now similarly, at most 2990 bananas will end up at kilometer marker 2 or greater, and so on until we can conclude at most 2000 bananas will end up at kilometer marker 200 or greater. From this point, you can make a similar argument to before, but with fewer cases, to show that at most 1997 bananas will end up at kilometer marker 201 or greater. Continuing, we get that at most 1001 bananas will end up at kilometer marker 533 or greater.

At this point, we have a special edge case to argue, and you will get that at most 999 bananas end up at kilometer marker 534 or greater. From here on, it's easy to argue that at most 998 bananas end up at kilometer marker 535 or greater, etc., so at most 533 bananas end up at kilometer marker 1000 or greater. As an easy corollary, at most 533 bananas end up at the market exactly at kilometer marker 1000. (And contrary to what you wrote, 533 is in fact achievable, if you use one staging point at 200 km, a second staging point at 533 km, and leave one banana behind at 533 km.)

Related Question