Flipping your way to victory

This week’s Riddler Classic concerns a paradoxical coin-flipping game:

You have two fair coins, labeled A and B. When you flip coin A, you get 1 point if it comes up heads, but you lose 1 point if it comes up tails. Coin B is worth twice as much — when you flip coin B, you get 2 points if it comes up heads, but you lose 2 points if it comes up tails.

To play the game, you make a total of 100 flips. For each flip, you can choose either coin, and you know the outcomes of all the previous flips. In order to win, you must finish with a positive total score. In your eyes, finishing with 2 points is just as good as finishing with 200 points — any positive score is a win. (By the same token, finishing with 0 or −2 points is just as bad as finishing with −200 points.)

If you optimize your strategy, what percentage of games will you win? (Remember, one game consists of 100 coin flips.)

Extra credit: What if coin A isn’t fair (but coin B is still fair)? That is, if coin A comes up heads with probability p and you optimize your strategy, what percentage of games will you win?

Here is my solution:
[Show Solution]

15 thoughts on “Flipping your way to victory”

  1. I have some intuition for the 2/3 asymptote. It seems to me you should place the small bet if and only if you have positive points. I don’t have a rigorous proof of this, but I was able to show it programmatically: https://github.com/joshuameisel/joshuameisel.github.io/blob/master/code/flipping_game.ipynb

    The hand-wavy reasoning is that if you’re ahead, you win as long as you can stay ahead for the rest of the game, and betting small maximizes your chance of staying ahead. If you do end up back at 0, you’re essentially resetting. When you’re losing, you need to eventually recoup your losses and “reset,” and the best way to guarantee this happens is by betting big.

    With this strategy you start off betting big for the amount of time it takes a random walk to reach 1. At this point you will have 2 points and start betting small, for the amount of time it takes a random walk to reach -2, at which point you start over betting big with 0 points again. The expected amount of time betting small is sort of twice the expected amount of time betting big, as you need to reach -1 in your random walk and then -1 again. This again is hand-wavy since both expectations are actually infinite, but maybe hints at why you’re twice as likely to be betting small (i.e. winning) at the end.
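
    Here is a minimal sketch of that numerical check (Python with exact rational arithmetic; the function names are just illustrative). It computes the best achievable win probability by backward induction over (flips left, current score), and separately the win probability of the “bet small exactly when you’re ahead” strategy, so the two can be compared at N = 100:

    ```python
    from fractions import Fraction
    from functools import lru_cache

    N = 100  # flips per game

    @lru_cache(maxsize=None)
    def opt(flips_left, score):
        """Best achievable probability of finishing with a positive score."""
        if flips_left == 0:
            return Fraction(1) if score > 0 else Fraction(0)
        coin_a = Fraction(1, 2) * (opt(flips_left - 1, score + 1) + opt(flips_left - 1, score - 1))
        coin_b = Fraction(1, 2) * (opt(flips_left - 1, score + 2) + opt(flips_left - 1, score - 2))
        return max(coin_a, coin_b)

    @lru_cache(maxsize=None)
    def small_when_ahead(flips_left, score):
        """Win probability when flipping coin A iff the current score is positive."""
        if flips_left == 0:
            return Fraction(1) if score > 0 else Fraction(0)
        step = 1 if score > 0 else 2
        return Fraction(1, 2) * (small_when_ahead(flips_left - 1, score + step)
                                 + small_when_ahead(flips_left - 1, score - step))

    print("optimal play    :", float(opt(N, 0)))
    print("small when ahead:", float(small_when_ahead(N, 0)))
    ```

    The claim above is that these two numbers coincide.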

    1. Here’s my write-up. I give a proof that you should bet small only when you’re ahead (curious if there’s a simpler proof!) and show that the solution is P(M_100 != 0 mod 3) where M_n is the running maximum of a random walk.

      I also give an alternative proof of the 1/3 limit, which I think is a bit more rigorous (the object is technically not a Markov chain. For instance, you have a 1/2 chance of initially returning to 1 but this decreases over time).

      https://joshuameisel.github.io/Flip-To-Victory/

      1. Nice! I actually edited my solution a couple days ago to include a proof of the 2/3 limit by treating it as a Markov chain. Though your proof may be more rigorous than mine…

  2. I did this with pen and paper for up to 9 flips, and some interesting patterns emerged. As for strategy, in the vast majority of cases it does not matter if you choose coin A or coin B. The only times it matters:

    If points>1 and flips+points = 1 (mod 2) you need to flip coin A
    If points=0 point with possibility converging towards 2/3 AND ending at =0 points with possibility =N/2). The first is probably a classic proof about random walks.

    Both of these paths hinge on proving W_N(1) -> 2/3. Joshua’s observation about random walks seems to be useful for this purpose.

  3. Hi Laurent,
    So instead of looking for the chance of WINNING, look instead for the chance of NOT LOSING.
    This is the same as assigning a value of 1, rather than 0, to ending with a score of 0.
    Then for a game of length L, the outcome, when both coins are fair, is:
    L=0: 1
    L=1: 1 – 1/2
    L=2: 1 – 1/2 + 1/4
    L=3: 1 – 1/2 + 1/4 – 1/8
    L=4: 1 – 1/2 + 1/4 – 1/8 + 1/16
    L=5: 1 – 1/2 + 1/4 – 1/8 + 1/16 – 1/32 = sum(i=0..L) (-1)^i / 2^i
    and the series exponentially converges to 2/3.
    The probability to draw goes to zero as the length of the game gets longer
    (though not exponentially, only as a half-power), so the chance to win converges
    towards the chance to not lose. Which converges rapidly towards 2/3.
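
    Written out, the limiting value is just a geometric series with ratio $-\tfrac{1}{2}$:

    $$\sum_{i=0}^{\infty}\left(-\frac{1}{2}\right)^{i} \;=\; \frac{1}{1-\left(-\frac{1}{2}\right)} \;=\; \frac{2}{3},$$

    and the values listed above are exactly its partial sums, alternating above and below the limit.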

    Now we just need a proof, rather than a mere observation, of the above behavior.

  4. I posted an update to include an explanation for why the limit as $N\to\infty$ is $\tfrac{2}{3}$. I also included a formula that was discovered and posted on Twitter. We still don’t have a proof for it, though it does match the analytical expression at $N=100$ exactly so it’s probably correct.

    1. An intuitive and almost rigorous argument for the 2/3 limit is this: imagine that you have infinite tosses. On average, you spend twice as much time with a positive score as you do with a negative one (since by tossing A you’re effectively moving half as fast as you do when tossing B). Now think of the end time as randomly distributed. Then it should have twice as much probability of catching you when you’re positive as when you’re negative, so your probability of winning is 2/3 (there’s still a bit of work to do to convert it to a statement about a fixed time but it’s close).

      1. @Bora, I don’t think this argument works. Just because you’re moving “twice as fast” when you’re negative doesn’t necessarily mean you spend half as much time there.

        Consider for example the probability of returning to your starting point for the first time after $2m$ flips. This is known as a “first return” in random walks, and it turns out to be $\tfrac{1}{2^{2m}(2m-1)}\binom{2m}{m}$. This probability does not depend on whether you’re flipping coin A (moving $\pm 1$) or flipping coin B (moving $\pm 2$); it’s the same in both cases!

        The reason for the discrepancy in time comes from what happens near zero. When we transition from a losing score to a winning score, we jump to $+2$ (so we have to lose the next two flips in a row before we have a losing score again). But when we transition from a winning score to a losing score, we jump to $0$ (so we only have to win the next flip before we have a winning score again). This is the reason why we end up spending more time on the winning side. It has nothing to do with the fact that we take bigger jumps on one side.
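
        For what it’s worth, that first-return formula is easy to check by brute force. A small Python sketch (it enumerates every sign sequence, so only small m are feasible):

        ```python
        from fractions import Fraction
        from itertools import product
        from math import comb

        def first_return_prob(m):
            """P(a +/-1 random walk returns to 0 for the first time at step 2m),
            found by enumerating all 2^(2m) equally likely sign sequences."""
            n = 2 * m
            count = 0
            for signs in product((+1, -1), repeat=n):
                pos, ok = 0, True
                for i, s in enumerate(signs, start=1):
                    pos += s
                    if pos == 0 and i < n:  # returned to 0 too early
                        ok = False
                        break
                if ok and pos == 0:  # first return happens exactly at step 2m
                    count += 1
            return Fraction(count, 2 ** n)

        for m in range(1, 8):
            formula = Fraction(comb(2 * m, m), 2 ** (2 * m) * (2 * m - 1))
            print(m, first_return_prob(m), formula, first_return_prob(m) == formula)
        ```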

  5. Hi Laurent,
    I know your work through the Riddler puzzles, and find it very impressive.

    However, I am wondering if I am missing something, or you are making one incorrect claim above.
    If Coin A has 0 probability of being heads, then you only flip coin B.
    Coin B will give exactly 50 heads (a score of 0, which loses) with a probability of about 0.08. (You can compute it exactly, or get a good estimate of sqrt(2/(pi*N)) with N = 100 using Stirling’s approximation of the factorial.)

    This means you only win about 46% of the time when flipping a single fair coin.

    1. It seems that after 100 flips of a single fair coin, a score of +N is just as likely as a score of -N. However, since you only win if the score is positive, the score of 0 is like the green space on a roulette wheel. It ensures that, in the long run, the house always comes out ahead.

    2. Hi Aaron,
      The problem with your argument is that, if you are winning by enough, it’s best to flip the A coin EVEN IF it always loses. If you have M flips left to make, and your score is at least M+1, then you can flip only coin A for the rest of the game and win with 100% certainty, rather than risk that a series of bad flips will push you negative.

      The simplest example is if the game is only 2 flips long.
      If you always flip the 2-point coin, you have a 1/4 chance to end with +4, a 1/2 chance to end with 0, and a 1/4 chance to end with -4, for a total 1/4 chance to win.
      If you flip each coin once, you have a 1/2 chance to end with +1 and a 1/2 chance to end with -3. So you win 1/2 of the time, which is clearly better, even though the expected score is worse.
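
      The same backward-induction idea makes this concrete for the full game. A Python sketch (here p is coin A’s heads probability from the extra credit, set to 0 for the case discussed above), comparing optimal play against always flipping coin B:

      ```python
      from fractions import Fraction
      from functools import lru_cache

      N = 100          # flips per game
      p = Fraction(0)  # coin A never comes up heads in this scenario

      @lru_cache(maxsize=None)
      def opt(flips_left, score):
          """Best achievable probability of finishing with a positive score."""
          if flips_left == 0:
              return Fraction(1) if score > 0 else Fraction(0)
          coin_a = p * opt(flips_left - 1, score + 1) + (1 - p) * opt(flips_left - 1, score - 1)
          coin_b = Fraction(1, 2) * (opt(flips_left - 1, score + 2) + opt(flips_left - 1, score - 2))
          return max(coin_a, coin_b)

      @lru_cache(maxsize=None)
      def always_b(flips_left, score):
          """Win probability if we ignore coin A entirely."""
          if flips_left == 0:
              return Fraction(1) if score > 0 else Fraction(0)
          return Fraction(1, 2) * (always_b(flips_left - 1, score + 2) + always_b(flips_left - 1, score - 2))

      print("always coin B:", float(always_b(N, 0)))  # about 0.46, as noted above
      print("optimal play :", float(opt(N, 0)))       # larger, even though coin A can never add a point
      ```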

  6. There is a very nice closed-form expression for the probability of winning if you start with a score of 1 rather than a score of 0.

    p = 1 – 1/2 + 1/4 – 1/8 + 1/16 – 1/32 … = 2/3

    The Nth partial sum is the probability of winning with N turns.

    Not completely sure why yet. It’s a much nicer expression than the one starting at 0, but they converge to the same thing. Trying to figure out why, and how to map one onto the other.

  7. Just realized my solution is pretty much the same as Guy Moore above.

    The solution for starting at 0 ends up looking like

    p = 1/2 + 0/4 – 0/8 + c1/16 – c1/32 + (c1+c2)/64 – (c1+c2)/128 + (c1+c2+c3)/256 – ….

    where c_i are the Catalan numbers. This would make sense as being related to the number of ways of getting back to zero.
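
    If I’m reading the pattern right (c_i being the i-th Catalan number 1, 2, 5, 14, …, with the sign alternating on every flip), it can be checked against the exact win probability of the bet-small-when-ahead strategy for small N. A Python sketch:

    ```python
    from fractions import Fraction
    from functools import lru_cache
    from math import comb

    def catalan(k):
        """c_k = 1, 2, 5, 14, ... for k = 1, 2, 3, 4."""
        return comb(2 * k, k) // (k + 1)

    def series(n):
        """Partial sum of 1/2 + 0/4 - 0/8 + c1/16 - c1/32 + (c1+c2)/64 - ... through the 1/2^n term."""
        total = Fraction(1, 2)
        for f in range(2, n + 1):
            s = sum(catalan(k) for k in range(1, f // 2))  # c_1 + ... + c_{f//2 - 1}; empty sum is 0
            total += Fraction((-1) ** f * s, 2 ** f)
        return total

    @lru_cache(maxsize=None)
    def win(flips_left, score):
        """Exact win probability when flipping coin A iff the score is positive."""
        if flips_left == 0:
            return Fraction(1) if score > 0 else Fraction(0)
        step = 1 if score > 0 else 2
        return Fraction(1, 2) * (win(flips_left - 1, score + step) + win(flips_left - 1, score - step))

    for n in range(1, 21):
        print(n, win(n, 0), series(n), win(n, 0) == series(n))
    ```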

  8. So here is a different way of approaching the problem (where each coin is fair), and a proof that the chance to win or tie, after f flips, is
    (2/3) + (-1/2)^f / 3
    We adopt the optimal strategy of flipping for 2 points if our current score is 0 or less, and flipping for 1 point if our score is positive. Rather than work from the end backwards, I will work from the first flip forward, and ask:
    After f flips, how many ways can I end up with a score of s? I call the answer W(f,s)
    The probability to have a score of s after f flips is then W(f,s) / 2^f.
    The initial conditions are W(0,0)=1, W(0,s)=0 if s is not 0.
    Our strategy tells us that
    A) W(f,-2m+1) = 0 for all positive integers m (you never get an odd negative score)
    B) W(f+1,-2m) = W(f,-2m+2) + W(f,-2m-2) for all positive integers m
    C) W(f+1,0) = W(f,1) + W(f,-2) (you get to 0 either from +1 or from -2)
    D) W(f+1,1) = W(f,2) (the ONLY way to get to score=1 is to come down from score=2)
    E) W(f+1,2) = W(f,3) + W(f,1) + W(f,0) (you get to score=2 from 3, from 1, OR from 0)
    F) W(f+1,m+2) = W(f,m+1) + W(f,m+3) for positive m.

    We now prove the following inductively:
    1) W(f+1,1) = W(f,2)
    2) W(f+1,-2m) = W(f,m) for m >= 2
    3) W(f,0) = W(f,-2) + (-1)^f
    Property 1 is the same as D)
    Property 2 follows inductively very easily for m >= 3. For m=2, we note that:
    W(f+1,-4) = W(f,-6) + W(f,-2) by B)
    = W(f-1,3) + W(f-1,-4) + W(f-1,0) by 2) and B)
    = W(f-1,3) + W(f-2,2) + W(f-1,0) by 2)
    = W(f-1,3) + W(f-1,1) + W(f-1,0) by 1)
    = W(f,2) by E)
    which completes the inductive proof of 2).
    To prove 3), note that
    W(f+1,0) = W(f,1) + W(f,-2)
    while
    W(f+1,-2) = W(f,-4) + W(f,0) = W(f-1,2) + W(f,0) = W(f,1) + W(f,0)
    The two counts W(f,0) and W(f,-2) switch places and each increases by W(f,1), so the difference W(f+1,0) – W(f+1,-2) = –(W(f,0) – W(f,-2)) flips sign with every flip.
    Since W(0,0) starts out 1 larger than W(0,-2), the two will alternate being 1 larger,
    completing the proof of 3).

    Now consider the total number of ways to get a nonnegative score, MINUS TWICE the number of ways to get a negative score:
    A(f) = sum(m=2,..) W(f,m) + W(f,1) + W(f,0) – 2 W(f,-2) – 2 sum(m=2,..) W(f,-2m)
    = ( Probability to win or tie – 2*probability to lose ) * 2^f
    For shorthand, write Probability to win or tie as Pwt and probability to lose as Pl = 1 – Pwt.

    Now let’s massage the form of A(f):
    Replace W(f,2) with W(f-1,3) + W(f-1,1) + W(f-1,0) (using E) and W(f,m) for m>2 with W(f-1,m-1) + W(f-1,m+1) (using F):
    sum(m=2,..) W(f,m) = 2 sum(m=2,..) W(f-1,m) – W(f-1,2) + W(f-1,1) + W(f-1,0)

    Because of 2), 2 sum(m=2,..) W(f-1,m) = 2 sum(m=2,..) W(f,-2m), so these cancel, leaving
    A(f) = – W(f-1,2) + W(f-1,1) + W(f-1,0) + W(f,1) + W(f,0) – 2 W(f,-2)
    But W(f-1,2) = W(f,1) by 1), so these terms cancel:
    A(f) = W(f-1,1) + W(f-1,0) + W(f,0) – 2 W(f,-2)
    Use W(f-1,1) = W(f-2,2) = W(f-1,-4) from 1) and 2):
    A(f) = W(f-1,-4) + W(f-1,0) + W(f,0) – 2 W(f,-2)
    But W(f-1,-4) + W(f-1,0) = W(f,-2) by B), so I get
    A(f) = W(f,0) – W(f,-2) = (-1)^f by 3)
    2^f (Pwt – 2*Pl) = (-1)^f
    But Pl = 1 – Pwt, so
    3 Pwt – 2 = (-1)^f/2^f
    Pwt = (2/3) + (-1/2)^f / 3
    The probability to win or tie converges to (2/3) exponentially, with alternating corrections.
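
    The closed form is also easy to confirm numerically. A short Python sketch (it counts a final score of 0 as a success, to match “win or tie,” and uses the strategy stated above):

    ```python
    from fractions import Fraction
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def win_or_tie(flips_left, score):
        """P(final score >= 0) when flipping for 2 points at a score of 0 or less,
        and for 1 point at a positive score."""
        if flips_left == 0:
            return Fraction(1) if score >= 0 else Fraction(0)
        step = 1 if score > 0 else 2
        return Fraction(1, 2) * (win_or_tie(flips_left - 1, score + step)
                                 + win_or_tie(flips_left - 1, score - step))

    for f in range(13):
        closed_form = Fraction(2, 3) + Fraction((-1) ** f, 3 * 2 ** f)
        print(f, win_or_tie(f, 0), closed_form, win_or_tie(f, 0) == closed_form)
    ```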
