The Kelly criterion: How to size bets

By Paul Butler – January 27, 2019

Let’s play a game. I have a coin that lands heads 60% of the time and tails 40% of the time. You have $25 and can bet on either side of the coin — every time you’re right you double your bet, and every time you are wrong you lose it.

What would your betting strategy be? Try your luck:

[Interactive game: start with $25 and place bets on heads or tails until you max out.]

If you reached the maximum reward, congrats! You did better than 79% of participants who played this game for real money in a 2016 experiment by Haghani and Dewey.

Expected Value

Let’s consider a strategy for playing the game.

The standard approach to betting games is to maximize the expected value of each bet, that is, the amount you can expect to win (or lose) on average. In this game the odds are in your favor: every dollar you bet on heads has an expected return of $1.20, so to maximize the expected return on each bet you should bet everything you have on heads on every round.

This strategy might make sense if you can only play for a few rounds, but here’s the problem: as you play many rounds with this strategy, it becomes nearly certain that you will lose everything on one bad bet. Like the dog in Aesop’s fable, greed will have left you with less than you started with.
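A quick Monte Carlo sketch makes the ruin concrete (the function name and parameters here are my own, not from the article): betting everything on heads every round almost guarantees going broke within a few dozen flips.

```python
import random

def bet_everything(p_heads=0.6, start=25.0, rounds=50, trials=10_000, seed=0):
    """Estimate the probability of going broke when betting the whole
    bankroll on heads every round."""
    rng = random.Random(seed)
    busted = 0
    for _ in range(trials):
        cash = start
        for _ in range(rounds):
            if rng.random() < p_heads:
                cash *= 2        # correct call: the bet doubles
            else:
                cash = 0.0       # one wrong call loses everything
                break
        busted += cash == 0.0
    return busted / trials

print(bet_everything())  # fraction of trials that end in ruin
```

The chance of surviving 50 all-in flips is $0.6^{50} \approx 8 \times 10^{-12}$, so the simulation reports a ruin probability of essentially 1.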

If maximizing the expected value is going to leave us almost certainly broke, is there a better strategy?

Comparing Strategies

Suppose we are interested in comparing two strategies, which we can call $a$ and $b$. Let's say that the random variables $V_{t,a}$ and $V_{t,b}$ represent the winnings of each strategy (respectively) after $t$ rounds.

How do we know which strategy is better? As we have seen, comparing them by expected value, i.e. comparing $\mathbb{E}[V_{t,a}]$ to $\mathbb{E}[V_{t,b}]$, won't work. Instead, let's ask a subtly different question: which strategy is more likely to do better than the other after some number of rounds? Symbolically, the value we are interested in is:

$$P(V_{t,a} > V_{t,b})$$

One way to think about this probability is: if we were to play strategy $a$ for $t$ flips, and then start over with another $25 and play strategy $b$ for $t$ flips, what is the probability that strategy $a$ made more? If the probability is over 50%, we'll say that $a$ is a better strategy than $b$.
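This head-to-head probability is easy to estimate by simulation. As a rough sketch (the two example strategies, which each bet a fixed fraction of wealth, and all names are mine for illustration):

```python
import random

def p_a_beats_b(ell_a, ell_b, p_heads=0.6, t=100, trials=5_000, seed=1):
    """Estimate P(V_{t,a} > V_{t,b}) for two strategies that each bet a
    fixed fraction of current wealth on heads, starting from $25."""
    rng = random.Random(seed)

    def play(ell):
        cash = 25.0
        for _ in range(t):
            bet = ell * cash
            cash += bet if rng.random() < p_heads else -bet
        return cash

    wins = sum(play(ell_a) > play(ell_b) for _ in range(trials))
    return wins / trials

print(p_a_beats_b(0.2, 0.8))  # betting 20% usually beats betting 80%
```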

Better to Best?

If we have a means of saying that one strategy is better than another, we can define a best strategy as one for which no other strategy is better.

Does such a strategy exist? It's less obvious than it might seem! For example, consider a (very generous) slot machine where you can pull one of three levers, labeled $A$, $B$, and $C$. The payout for $A$ is drawn uniformly from $\{4,5,6\}$; $B$ is drawn uniformly from $\{2,3,9\}$; and $C$ is drawn uniformly from $\{1,7,8\}$.

The tables below show the nine equally probable outcomes of a draw for each pair of these three random variables.

| B \ A | 4 | 5 | 6 |
|-------|-------|-------|-------|
| 2 | 4 > 2 | 5 > 2 | 6 > 2 |
| 3 | 4 > 3 | 5 > 3 | 6 > 3 |
| 9 | 4 < 9 | 5 < 9 | 6 < 9 |

$$P(A > B) = \frac{2}{3}$$

| C \ B | 2 | 3 | 9 |
|-------|-------|-------|-------|
| 1 | 2 > 1 | 3 > 1 | 9 > 1 |
| 7 | 2 < 7 | 3 < 7 | 9 > 7 |
| 8 | 2 < 8 | 3 < 8 | 9 > 8 |

$$P(B > C) = \frac{5}{9}$$

| A \ C | 1 | 7 | 8 |
|-------|-------|-------|-------|
| 4 | 1 < 4 | 7 > 4 | 8 > 4 |
| 5 | 1 < 5 | 7 > 5 | 8 > 5 |
| 6 | 1 < 6 | 7 > 6 | 8 > 6 |

$$P(C > A) = \frac{2}{3}$$

As you can see, there’s no strategy (lever) which beats every other strategy more often than not for a single draw.
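The cycle is small enough to verify by brute force. A minimal check (names are mine):

```python
from itertools import product

A, B, C = (4, 5, 6), (2, 3, 9), (1, 7, 8)

def p_beats(x, y):
    """P(draw from x > draw from y) over the nine equally likely pairs."""
    pairs = list(product(x, y))
    return sum(a > b for a, b in pairs) / len(pairs)

print(p_beats(A, B))  # 2/3: A is better than B
print(p_beats(B, C))  # 5/9: B is better than C
print(p_beats(C, A))  # 2/3: C is better than A, closing the cycle
```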

Back to Betting

Returning to our coin flip game, we now have two questions:

  1. Does a best strategy exist for coin-flip betting?
  2. If so, what is it?

Before we dive into the analysis, I’d like to make a few observations and assumptions to simplify the math.

First, we are interested in the long-run outcome over a large number of turns. To be mathematically precise, we are interested in the best strategy in the limit as $t$ approaches infinity.

Second, let’s assume that we’re dealing with an infinitely divisible currency. This way we don’t have to round bets to the nearest cent, which would complicate the math.

Third, let’s limit ourselves to strategies where we bet a fixed fraction $\ell$ of our wealth on every flip. Each possible value of $\ell$ between 0 and 1 therefore corresponds to a different strategy. Notationally, we will use $\ell_x$ to denote the fixed fraction used by the strategy $x$. It turns out that the optimal strategy under this restriction is also the best strategy overall, but the original Kelly paper leaves this as an open question; see Optimal Gambling Systems for Favorable Games for a proof of this.

Simplifying the Math

Note that because of the second and third assumptions above, for a given value of $\ell$, the amount won or lost depends only on the number of bets won and lost, not the order they come in. For example, if you play four rounds, alternating wins and losses, you end up with

$$25(1+\ell)(1-\ell)(1+\ell)(1-\ell) = 25(1+\ell)^2(1-\ell)^2$$

This is the same as if you first lose two and then win two:

$$25(1-\ell)(1-\ell)(1+\ell)(1+\ell) = 25(1+\ell)^2(1-\ell)^2$$

More generally, let’s say we run $t$ trials, winning $w$ of them and losing $t-w$. Then the amount we end up with is:

$$V_{t,x} = 25(1+\ell_x)^{w}(1-\ell_x)^{t-w}$$
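The order-independence is easy to confirm numerically. A small sketch (the function name is mine):

```python
def final_wealth(flips, ell, start=25.0):
    """Wealth after a sequence of flips (True = win), betting a fixed
    fraction ell of current wealth each time."""
    cash = start
    for win in flips:
        cash *= (1 + ell) if win else (1 - ell)
    return cash

# Two wins and two losses in any order give the same result:
alternating = final_wealth([True, False, True, False], ell=0.2)
grouped = final_wealth([False, False, True, True], ell=0.2)
print(alternating, grouped)  # both approximately 25 * 1.2**2 * 0.8**2 = 23.04
```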

Since the quantity we are actually interested in is $P(V_{t,a} > V_{t,b})$ for any strategies $a$ and $b$, consider another quantity:

$$W_{t,x} = \left(\frac{1}{25} V_{t,x}\right)^{\frac{1}{t}} = (1+\ell_x)^{w/t}(1-\ell_x)^{1-(w/t)}$$

$W_{t,x}$ is a useful variable because $P(V_{t,a} > V_{t,b}) = P(W_{t,a} > W_{t,b})$.

(This is true because neither dividing by 25 nor taking the $t$th root changes the order of those values. More precisely, $W_{t,x}$ is a monotonic function of $V_{t,x}$.)

Maximizing $W_{t,x}$

Notice that $W_{t,x}$ depends on just two values: $\ell_x$ and the fraction of rounds that we win, $w/t$. To develop an intuition for the relationship between these three values, let’s plot $W_{t,x}$ on the left axis against $w/t$ on the bottom axis, for two different values of $\ell$.

[Plot: $W_{t,x}$ as a function of $w/t$ for $\ell_a = 0.4$ and $\ell_b = 0.5$.]

Each point on the bottom axis represents a possible realization of the random variable $w/t$, ranging from the case where the coin comes up tails on every flip (the left edge of the plot) to the case where the coin comes up heads every time (the right edge of the plot). Not all of these values are equally likely. Since heads has 60% probability, the black vertical line at $w/t = 0.6$ represents the median value of $w/t$. In other words, if we play the game for $t$ rounds, the resulting value of $w/t$ is as likely to be to the left of that line as to the right of it. (This line is actually an approximation of the median, but it becomes an increasingly accurate one as $t$ becomes large.)

Using this fact, we can determine whether $P(W_{t,a} > W_{t,b}) > 0.5$ just by looking at the chart: if (and only if) the curve for $W_{t,a}$ intersects the median line above the curve for $W_{t,b}$, then $P(W_{t,a} > W_{t,b}) > 0.5$.

Consider why this is true. As long as $\ell_a \neq \ell_b$, the curves intersect at only one value of $w/t$. If that point is to the right of the median, then whichever curve is greater at the median is also greater everywhere to the left of it. Since, by definition, realizations of $w/t$ will be equal to or less than the median at least 50% of the time, we know that whichever curve is greater at the median is greater at least 50% of the time (represented by the green line). But this curve is also greater between the median and the intersection (represented by the yellow line), so we can actually say the curve that is greater at the median is greater more than 50% of the time.

A symmetric argument applies if the intersection is to the left of the median. If the curves intersect at the median, neither strategy is considered better than the other.

Piecing it Together

By putting together all the pieces we have seen so far, we know that $P(W_{t,a} > W_{t,b}) > 0.5$ if and only if $\mathrm{median}(W_{t,a}) > \mathrm{median}(W_{t,b})$.

Since $P(W_{t,a} > W_{t,b}) = P(V_{t,a} > V_{t,b})$, we now have a way to say whether or not strategy $a$ is better than $b$ just by comparing their medians!

This is handy for two reasons.

  1. It proves that a best strategy exists. We can’t have a situation like we did with the slot machine, where $a$ is better than $b$ and $b$ is better than $c$ but $c$ is better than $a$, because that would imply $\mathrm{median}(W_{t,a}) > \mathrm{median}(W_{t,b}) > \mathrm{median}(W_{t,c}) > \mathrm{median}(W_{t,a})$, which is impossible.
  2. It gives us an easy way to find the best strategy: find the strategy that has the maximum median return.

Maximizing the Median

The plot below shows the median return for different values of $\ell$. The optimal strategy is to bet the value of $\ell$ that maximizes the median return, shown as a dot.

I’ve also included the curve for a strategy that bets tails. See what happens to each curve as you change the probability of heads, first to 50%, and then to below 50%.

When the probability of heads is 60%, the highest median return after ten bets is achieved by betting heads with $\ell = 0.2$, yielding a median return of $30.58.
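We can reproduce that number with a direct scan over $\ell$ (a sketch; the names are mine). For ten flips with heads probability 0.6, the median number of wins is exactly 6, so the median return is $25(1+\ell)^6(1-\ell)^4$:

```python
def median_return(ell, t=10, p=0.6, start=25.0):
    """Median wealth after t flips betting fraction ell on heads.

    For Binomial(t, p) with t*p an integer, the median number of wins
    is exactly t*p, so we plug w = t*p into the wealth formula."""
    w = round(t * p)
    return start * (1 + ell) ** w * (1 - ell) ** (t - w)

# Scan a grid of fractions and keep the one with the highest median:
best_ell = max((i / 1000 for i in range(1000)), key=median_return)
print(best_ell, round(median_return(best_ell), 2))  # 0.2 and 30.58
```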

[Slider: adjust the probability of heads.]

As you can see, when the probability of heads is 60%, the ideal strategy is to bet 20% of your wealth. The game below is just like the one you played earlier, with the addition of a slider that lets you set the $\ell$ parameter and automatically calculates the bet based on it. See what happens when you use the optimal strategy.

[Interactive game: the same betting game, with an autobet slider for $\ell$. Reach $250.00 to max out.]

Notice that there is a direct relationship between the probability of heads and the fraction you should bet. In general, the optimal strategy in games like this that pay out 1:1 is: subtract 0.5 from your probability of winning, double the result, and bet that fraction of your current wealth. (See the appendix for an analytical derivation of this.)
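That rule fits in a couple of lines. A sketch (function names are mine; clamping the fraction at zero for unfavorable games is my addition, since the article's formula covers games in your favor):

```python
def kelly_fraction(p):
    """Fraction of wealth to bet in an even-money game won with
    probability p: 2 * (p - 0.5), clamped at zero when p <= 0.5."""
    return max(0.0, 2 * (p - 0.5))

def kelly_bet(p, wealth):
    """Dollar amount to bet given win probability p and current wealth."""
    return kelly_fraction(p) * wealth

print(kelly_bet(0.6, 25.0))   # about 5.0: bet $5 of a $25 bankroll
print(kelly_fraction(0.5))    # 0.0: a fair coin is not worth betting on
```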

Fin

To recap, we’ve seen that:

  • Maximizing expected value can lead to a bad long-run strategy.
  • Rather than maximizing expected value, we may be able to find a strategy which returns more than any other strategy at least half the time.
  • But not always! For some sets of random variables, whichever one you pick there is always one "better," meaning that there is no "best."
  • In the case of our betting game, a strategy is better than another if it has a higher median.
  • This means that we can, for the betting game, find a best strategy for which there is no better.
  • We do this by maximizing the median outcome with respect to the size of our bet, $\ell$.
  • The resulting optimal bet for this type of game, as a fraction of wealth, is $\ell = 2(p - 0.5)$.

And with that result, we’ve arrived at our destination. This formula (actually, a slightly more general version of it) is commonly known as the Kelly Criterion after J. L. Kelly, Jr. In his original analysis, Kelly took a more direct route than we did to the solution, but I like to think of our path as the scenic route: this approach to the problem helped me develop an intuition for some of the more paradoxical aspects of working with probability and infinite series, and I hope it did for you as well. Happy betting!

Appendix: An Analytical Solution

To find the optimal strategy, we want to find the value of $\ell$ that maximizes the median value of $(1 + \ell)^{w/t} (1 - \ell)^{1-(w/t)}$.

Since $\log$ is an increasing function, this is the same as maximizing $\frac{w}{t}\log(1 + \ell) + (1 - \frac{w}{t})\log(1 - \ell)$.

We know that in the limit, $\mathrm{median}(w/t)$ is just the probability of heads. Let’s call this value $p$, so that the value we are interested in maximizing is $p\log(1 + \ell) + (1 - p)\log(1 - \ell)$.

To maximize this, we take the derivative with respect to $\ell$ and set it to zero. Since the denominator below is positive for $0 < \ell < 1$, we need only set the numerator to zero:

$$\begin{aligned} 0 &= \frac{p}{1 + \ell} - \frac{1 - p}{1 - \ell} \\ &= \frac{p(1 - \ell)-(1 - p)(1 + \ell)}{(1 + \ell)(1 - \ell)} \\ &= p - p\ell - 1 - \ell + p + p\ell \\ &= 2p - 1 - \ell \end{aligned}$$

This gives us the value of ell which maximizes the median return:

$$\ell = 2(p - 0.5)$$
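As a numerical sanity check on this derivation (a sketch; names are mine), the closed-form $\ell = 2(p - 0.5)$ should maximize $p\log(1+\ell) + (1-p)\log(1-\ell)$ over a fine grid of fractions:

```python
import math

def log_growth(ell, p=0.6):
    """The objective from the appendix: p*log(1+ell) + (1-p)*log(1-ell)."""
    return p * math.log(1 + ell) + (1 - p) * math.log(1 - ell)

p = 0.6
ell_star = 2 * (p - 0.5)  # closed-form optimum: 0.2

grid_best = max((i / 1000 for i in range(1000)), key=log_growth)
print(ell_star, grid_best)  # both approximately 0.2
```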