Bug in proof of Kelly Criterion and danger of expectation

Tags: expected value, probability, probability theory

I tried to derive the Kelly criterion and got stuck at the step described below.

Let's recall the setup first. Let the odds be $b$ (a winning bet pays a profit of $b$ dollars for every dollar staked) and let the probability of winning be $p$. Say I bet a fraction $f$ of my net worth each time.

Then, if I start with one dollar, with probability $p$ I have

$1+bf$

back. And with probability $1-p$, I have

$1-f$

left.


Now, I may define a sequence of independent random variables $X_i$, where $X_i = \begin{cases} 1, & \text{with probability } p\\ 0, & \text{with probability } 1-p \end{cases}$.

Then, I can combine the above and write my net-worth after the first bet as

$W_1 = (1+bf) X_1 + (1-X_1)(1-f).$

Similarly, my net-worth after two bets is

$W_2 = ((1+bf) X_1 + (1-X_1)(1-f))((1+bf) X_2 + (1-X_2)(1-f)).$

Consequently, after $N$ bets, my net-worth should be

$W_N = \prod_{i=1}^N [(1+bf) X_i + (1-X_i) (1-f)]$.

Note that since the bets are independent, $X_1,\cdots, X_N$ are all independent, and so are the factors $[(1+bf) X_i + (1-X_i)(1-f)]$, $\forall i$. Thus

$E[W_N]=E \left[ \prod_{i=1}^N [(1+bf) X_i + (1-X_i) (1-f)]\right]=\prod_{i=1}^N E[(1+bf) X_i + (1-X_i) (1-f)]=[(1+bf) p + (1-p) (1-f)]^N.$

To maximize $E[W_N]$, we should just maximize $(1+bf) p + (1-p) (1-f) = f(bp-(1-p))+1$. This suggests going all-in whenever $bp > 1-p$.


Of course, I realize the "correct" derivation should approximate $ W_N$ as

$\hat W_N= (1+bf)^{Np} (1-f)^{N(1-p)}$ instead.

And if we maximize $\hat W_N$ with respect to $f$, we get the correct Kelly criterion, $f = \frac{bp-(1-p)}{b}$. I just don't see any problem with the "incorrect" argument above.

Best Answer

The point of the Kelly criterion has never been to maximize expected wealth. In fact, the point is precisely not to do that, because maximizing expected wealth (somewhat paradoxically) leads to almost sure ruin.

Your calculation goes wrong because it maximizes the expected wealth. You get the only answer one can reasonably expect: if you have an edge, bet it all... after all, every little bit extra you bet leads to more expected value. Of course the problem is that if you continually bet it all, you are certain to eventually lose it all. Despite this obvious limiting case argument, it might still seem counterintuitive at first that maximizing expected value can lead to sure failure. What happens in general is that in the vanishingly rare universes where you don't lose nearly all of your wealth, you are exceedingly wealthy, so there is still a large expected value even if ruin is almost assured.
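To put numbers on the all-in case, here is a minimal sketch. The function name `all_in_summary` and the values $p = 0.6$, $b = 1$ are illustrative assumptions, not taken from the question. With $f = 1$, wealth after $N$ bets is $(1+b)^N$ if every bet wins and $0$ otherwise, so both quantities can be computed exactly.

```python
def all_in_summary(p, b, n):
    """Return (expected wealth, survival probability) after n all-in bets from $1."""
    # With f = 1 a single loss wipes out the bankroll, so wealth is
    # (1 + b)**n if all n bets win and 0 otherwise.
    survival = p ** n
    expected = survival * (1 + b) ** n  # same as (p * (1 + b))**n
    return expected, survival


# Illustrative (assumed) parameters: 60% win probability, even-money odds.
for n in (1, 10, 50):
    expected, survival = all_in_summary(p=0.6, b=1.0, n=n)
    print(f"N={n:3d}  E[W_N]={expected:12.4g}  P(not ruined)={survival:.3g}")
```

The expected wealth grows geometrically while the probability of still having any money shrinks geometrically, which is the tension described above.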

$(1+bf)^{Np}(1-f)^{N(1-p)}$ is not an approximation for the expected wealth. Rather, it is $e^{E(\log W_N)}.$ The Kelly criterion says you should choose your bet size to maximize expected log wealth, not expected wealth, so maximizing this expression is using the Kelly criterion. The reason for this is that it leads to a long-run growth rate that is optimal in a certain sense.
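To spell out that maximization, here is a short worked derivation. The symbol $g(f)$ is shorthand introduced here for the expected log growth per bet, that is, $\frac{1}{N}$ times the log of the expression above.

$g(f) = p\log(1+bf) + (1-p)\log(1-f)$

$g'(f) = \frac{pb}{1+bf} - \frac{1-p}{1-f} = 0 \iff pb(1-f) = (1-p)(1+bf) \iff f^* = \frac{bp-(1-p)}{b}$

Since $g''(f) = -\frac{pb^2}{(1+bf)^2} - \frac{1-p}{(1-f)^2} < 0$ for $0 \le f < 1$, this stationary point is the maximum, and it matches the Kelly fraction quoted in the question.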

The easiest way to see this is to observe that logs transform a product into a sum, so if $W_N = \prod_{i=1}^N S_i$, where the $S_i$ are the i.i.d. per-bet wealth multipliers, then $\log(W_N) = \sum_i \log(S_i)$ and, by the strong law of large numbers, $\frac{1}{N}\log(W_N)$ converges almost surely to $E(\log(S_i)).$ Thus $W_N^{1/N}$ converges almost surely to $e^{E(\log S_i)}.$ Notice that if you set $f$ too high, $E(\log(S_i))$ will turn negative even if $E(S_i)>1$, and so wealth converges to zero almost surely.
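The gap between $E(S_i)$ and $E(\log(S_i))$ can be checked directly. The sketch below uses assumed illustrative values $p = 0.6$, $b = 1$ (not from the question) and hypothetical helper names `mean_growth` and `log_growth`.

```python
import math


def mean_growth(f, p, b):
    """E[S]: arithmetic mean of the one-bet wealth multiplier."""
    return p * (1 + b * f) + (1 - p) * (1 - f)


def log_growth(f, p, b):
    """E[log S]: expected log of the one-bet multiplier; needs 0 <= f < 1."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)


# Illustrative (assumed) parameters: 60% win probability, even-money odds.
p, b = 0.6, 1.0
kelly = (b * p - (1 - p)) / b  # Kelly fraction for these parameters

for f in (0.1, kelly, 0.5, 0.9):
    print(f"f={f:.2f}  E[S]={mean_growth(f, p, b):.3f}  "
          f"E[log S]={log_growth(f, p, b):+.4f}")
```

For these parameters, $E(S_i)$ keeps rising with $f$, while $E(\log(S_i))$ peaks at the Kelly fraction and is already negative at $f = 0.5$.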

On the other hand, if we set $f$ in a sweet spot where $E(\log(S_i))>0$ (which is always possible provided we have an edge), then $W_N$ converges to infinity almost surely, and choosing $f$ to maximize $E(\log(S_i))$ gives the fastest such growth rate. However, there can still be really bad swings in finite samples, and of course we rarely have a situation where we have a precisely calculable edge for an indefinite amount of time. Thus it's hard to say it's really the best strategy.
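To get a feel for the finite-sample swings, here is a rough Monte Carlo sketch. Everything in it is an assumption for illustration: the function name `simulate_final_wealth`, the parameters $p = 0.6$, $b = 1$, $f = 0.2$, and the choice of 100 bets over 2000 simulated paths.

```python
import random


def simulate_final_wealth(f, p, b, n_bets, n_paths, seed=0):
    """Simulate n_paths bettors who each stake fraction f on n_bets bets.

    Returns the list of final wealths, each starting from $1.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    finals = []
    for _ in range(n_paths):
        wealth = 1.0
        for _ in range(n_bets):
            # Win with probability p: multiply by 1 + b*f; otherwise by 1 - f.
            wealth *= (1 + b * f) if rng.random() < p else (1 - f)
        finals.append(wealth)
    return finals


# Illustrative (assumed) parameters, betting the Kelly fraction f = 0.2.
finals = sorted(simulate_final_wealth(f=0.2, p=0.6, b=1.0,
                                      n_bets=100, n_paths=2000))
median = finals[len(finals) // 2]
below_start = sum(w < 1.0 for w in finals) / len(finals)
print(f"median final wealth: {median:.3g}")
print(f"share of paths below starting wealth: {below_start:.1%}")
```

Even at the Kelly fraction, a noticeable share of simulated paths can end below the starting bankroll after a finite number of bets, which is the caveat made above.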

Still, it's a heuristic that has some limited value... even if all it does is save us from replacing common sense (going all in on every hand where I have an edge means I walk away every night a loser) with plausible but incorrect mathematical arguments (that strategy has to work cause maximizing expected value means I win in the long-run cause law of large numbers), that's still a win.