# The luckiest coin This week’s Riddler Classic is about finding the “luckiest” coin!

I have in my possession 1 million fair coins. I first flip all 1 million coins simultaneously, discarding any coins that come up tails. I flip all the coins that come up heads a second time, and I again discard any of these coins that come up tails. I repeat this process, over and over again. If at any point I am left with one coin, I declare that to be the “luckiest” coin.

But getting to one coin is no sure thing. For example, I might find myself with two coins, flip both of them and have both come up tails. Then I would have zero coins, never having had exactly one coin.

What is the probability that I will at some point have exactly one “luckiest” coin?

Here is my solution:
[Show Solution]

## 9 thoughts on “The luckiest coin”

1. Guy D. Moore says:

Hi Laurent,
I solved it essentially the same way, but to get out to n=1,000,000 I evaluate the expression
which you write for p_n using the following two tricks:
1) past n=48 I don’t bother subtracting 1 in the denominator for p_n, which is beyond double-precision anyway
2) I calculate the sum term by term. For the n-choose-k coefficient, I calculate it by modifying
the previous coefficient. Every time this coefficient becomes large, I divide the sum-so-far,
the coefficient, and the 2^n factor in front by a common rescaling, to keep them within double-precision range.
Doing that I was able to get to n=1,000,000 where I found p = 0.721353963324

Regarding the oscillations — as you point out, we have that, approximately
p(2n) = p(n)
which ensures periodicity in log_2(n) (log base 2 of n).
Therefore one expects that, for large(ish) n,
p(n) = p_0 + p_1 cos(2 pi log_2(n) – phi_1) + p_2 cos(4 pi log_2(n) – phi_2) + …
let’s try to understand the corrections to this formula — how much does p(2n) differ from p(n)?
At the next-closest level of approximation, p(2n) depends not only on p(n) but on
a narrow range around p(n), with a Gaussian envelope according to the central limit theorem.
The variance (squared width) of the envelope is the variance of 2n coin flips — each
coin flip gives a variance of 1/4, so the total variance is sigma^2 = 2n/4 = n/2.
Therefore p(n’) is sampled in a neighborhood about n’=n with width sqrt(n/2).
Approximating p(n’) with a Taylor series about n’=n,
p(n’) = p(n) + (n’-n) p'(n) + (n’-n)^2/2 p”(n)
we find = 0 but <(n'-n)^2 = (n/2) the variance of the sampling distribution, so
p(2n) = p(n) + (n/4) p''(n)
where p'' means the second derivative, as usual.
The second derivative of the first cosine term gives
2 pi/(n^2 ln(2)) sin(…) – 4 pi^2/(n^2 ln^2(2)) cos(…)
The sin(…) term slightly shifts the phase phi_1, while the -4pi^2/(n^2 ln^2(2)) cos(…)
acts to damp the cosine. Multiplying by the n/4 coefficient in front, the damping is:
new coefficient = old coefficient ( 1 – pi^2 / n ln^2(2) )
Since the damping has a 1/n factor and n doubles with each step, the damping rather quickly
becomes irrelevant — it's order-1 for n = pi^2/ln^2(2) = 20
but it's small for, say, n=100.
On the other hand, the next term in the Fourier series has a damping coefficient
with 4 times the size, so it gets damped strongly for two more oscillations.
This is why I expect that the true form is nearly but not perfectly sinusoidal.

My figure, going out to n=1,000,000, is at:

2. MarkS says:

This paper was mentioned in the Riddler’s Lonesome King solution:
https://arxiv.org/pdf/1507.03805.pdf
Quoting from the first paragraph on p.2: “Suppose we have n coins, each of which lands heads up with probability p. Flip all the coins independently and throw out the coins that show heads. Repeat the procedure with the remaining coins until 0 or 1 coins are left. The probability of ending with 0 coins does not converge as n → ∞ and becomes asymptotically periodic and continuous on the log n scale [6, 11].”

1. Jenny Mitchell says:

Furthermore, cite 11 gives the limiting mean (assuming I’m reading it correctly) as -p / ln(1-p). In our case, that’s 1/ln(4) = 0.72134752044….

https://www.jstor.org/stable/2325134

1. Laurent says:

Thanks! — I didn’t quite understand the approach of the paper you linked, but I played around with the equations a bit more and I found another way to arrive at the result. I added a new section to the blog post.

3. Alex Zorn says:

For a quick “derivation” of the limiting mean:

We can equivalently compute the probability that, if we flip the coins until they are all tails, there is a single coin that survives the longest. For a specific coin out of k total coins, the probability that it survived for n flips (i.e. got n heads followed by a tails) and was the longest surviving coin is:

(1/2)^(n+1) * (1 – (1/2)^(n))^(k – 1)

And so we have a closed form as an infinite sum:

p(k) = k * sum_(n = 1)^(infty) (1/2)^(n+1)* (1 – (1/2)^(n))^(k-1)

If we replace the sum with an integral, taking the limit as k -> infty gives the 1/ln(4) number referenced elsewhere.

Of course, “derivation” is in quotes because the “replacing the sum with an integral” part isn’t really justified.

1. Laurent says:

Right — I saw a similar argument was posted by Allen Gu. I don’t think one can simply replace the sum by an integral.

4. Laurent says:

I added a new section to my write-up, where I prove that the limiting periodic function is not a true sinusoid (but it’s very close), and its asymptotic average value is exactly $\frac{1}{2\log(2)}$.

5. Anton says:

I got Allen Gu’s formula for p_n with a generating functions approach: using the recurrence, we can prove that P(x) = sum ((1 – p_(n+2)) x^n / (n+2)!) satisfies 4P(2x) = (e^x – x – 1)/x^2 + P(x)e^x, and we let P(x) = Q(x)e^x/x^2 to get Q(2x) – Q(x) = e^(-2x) (e^x – x – 1). Now it is easy to find a power series solution and equate coefficients.

Consider S(N) = sum (2/n)*p_n from n=1 to N. Assuming p = lim(p_n) exists, by the harmonic series we should have S(N)/log(N) -> 2p. On the other hand, with the formula we have a telescoping series giving S(N) = sum (1 – (1 – 2^(-k))^N) from k = 0 to infinity. But for large N, the summand is essentially 1 for k < log_2(N) and essentially 0 for k > log_2(N). [Around k = log_2(N), the sequence will be approximately: …, 1 – 1/e^4, 1 – 1/e^2, 1 – 1/e, 1 – 1/e^(1/2), 1 – 1/e^(1/4), …] So the sum is close to log_2(N) and certainly we will have S(N)/log(N) -> 1/log(2).

Hence 2p = 1/log(2) so in some sense the limit of the average value of p_n is indeed 1/(2log(2)).