Let’s define $p_n$ to be the probability that we will end up with exactly one “luckiest” coin, given that we start with $n$ coins. Our terminal conditions are that $p_0=0$ and $p_1=1$, since we lose if we end up with zero and we win if we end up with one. We can define the rest of the $p_n$ recursively as follows. If we flip the $n$ coins and obtain $k$ heads, then the probability we will win is $p_k$. The probability of obtaining $k$ heads is $\binom{n}{k}\frac{1}{2^n}$. Therefore,
\[
p_n = \sum_{k=0}^n \frac{1}{2^n} \binom{n}{k} p_k\qquad\text{for }n=2,3,\dots
\]Note that $p_n$ appears on both sides of the equation, since it’s possible that we flip all heads and remain with $n$ coins at the next turn. We can solve for $p_n$ by moving the $k=n$ term to the left side: subtracting $\frac{1}{2^n}p_n$ from both sides and multiplying through by $\frac{2^n}{2^n-1}$ yields the complete recursion:
$\displaystyle
\begin{aligned}
p_0&=0,\quad p_1=1,\\
p_n &= \frac{1}{2^n-1} \sum_{k=0}^{n-1}\binom{n}{k} p_k\qquad\text{for }n=2,3,\dots
\end{aligned}
$
Let’s start with a simple simulation to see what we’re dealing with. Here is what we get for $n$ up to 50:

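If you’d like to reproduce this, here is a minimal Python sketch (the helper names are mine, not anything canonical) that evaluates the recursion exactly using rational arithmetic:

```python
from fractions import Fraction
from math import comb

def win_probs(n_max):
    """Evaluate p_n = (1/(2^n - 1)) * sum_{k=0}^{n-1} C(n,k) p_k exactly."""
    p = [Fraction(0), Fraction(1)]  # terminal conditions: p_0 = 0, p_1 = 1
    for n in range(2, n_max + 1):
        p.append(sum(comb(n, k) * p[k] for k in range(n)) / (2**n - 1))
    return p

probs = win_probs(50)
print(float(probs[2]))   # 2/3, easy to check by hand from the recursion
print(float(probs[50]))  # ~0.7213
```

Exact fractions keep the arithmetic transparent, though the integers involved grow quickly; this is the cost issue that comes back to bite us below.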
Seems like we’re done! The sequence appears to settle quickly. A numerical evaluation shows the limit to be approximately 0.72135, or 72.1%.
Down the rabbit hole
Appearances can be deceiving! First, the number of coins should drop by roughly a factor of 2 at each turn, so we expect that $p_n \approx p_{2n}$ for large $n$. This suggests plotting $n$ on a log scale. Let’s make the same plot as before, except this time going up to $2^{12} = 4{,}096$, and ignoring the initial large swings in values to focus on the tail end.

Would you look at that! The plot tends to a sinusoid. And although the amplitude decays rapidly at first, it does not decay to zero! If you have been following The Riddler for some time, this may seem familiar to you. In fact, a very similar pattern of sinusoids on a log scale whose amplitude decays but never quite vanishes was observed in the puzzle of The Lonesome King, published about five and a half years ago!
It might be possible to compute all the way up to $10^6$, but it would likely take quite a while (on my laptop, anyway). Computing up to $n=5000$ already took over one minute. And the scaling is not linear! Computing each $p_n$ involves all previous $p_k$, so the computation scales like $n^2$. Moreover, the binomial coefficients become so large that computing them exactly would take a lot of time and memory, and approximating them would further complicate the solution.
To approximate the probability when $n=10^6$, the easiest way is to leverage the periodicity of the function and reduce the quantity of interest to one that is within the range of values we have already computed. We have that:
\[
p_{10^6} \approx p_{10^6/2^8} \approx p_{3906} \approx 0.721354.
\]Granted, the limiting amplitude of the sinusoid is so small that even if we use our original guess of 0.72135, we will be correct to 5 decimal places for any $n$ sufficiently large. So at the end of the day, we don’t gain much by doing all this extra work.
As with the Lonesome King puzzle, we are once again faced with this puzzling phenomenon of marginally stable oscillations. There are several questions that seem natural to ask:
- Is there a closed-form expression for the limiting amplitude?
- Is there a closed-form expression for the limiting mean?
- Is the limiting waveform actually a true sinusoid?
To answer these questions, we have to go deeper…
Deeper down the rabbit hole
After more thought (and many great comments here and on Twitter), I decided to take another crack at the problem. I started with the approach used by Allen Gu, who derived a non-recursive formula for $p_n$:
\[
p_n = \frac{n}{2}\sum_{t=0}^\infty 2^{-t} \left( 1-2^{-t} \right)^{n-1}
\]Define the summand as
\[
g_n(t) = \frac{n}{2}2^{-t} \left( 1-2^{-t} \right)^{n-1}
\]Here is what $g_n(t)$ looks like for different values of $n$:

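For anyone following along at home, here is a quick sketch of how one might evaluate $g_n(t)$ stably (the log1p trick is just one way to compute $(1-2^{-t})^{n-1}$ without underflow; none of this is from the original post):

```python
import math

def g(n, t):
    """Summand g_n(t) = (n/2) * 2^-t * (1 - 2^-t)^(n-1)."""
    q = 2.0 ** -t
    # compute (1 - q)^(n - 1) as exp((n - 1) * log(1 - q)) for numerical stability
    return n / 2 * q * math.exp((n - 1) * math.log1p(-q))

# the discrete peak tracks log2(n), as derived next
for n in (10, 100, 1000, 10000):
    t_peak = max(range(1, 60), key=lambda t: g(n, t))
    print(n, t_peak, round(math.log2(n), 2))
```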
It appears that $g_n(t)$ gets shifted to the right as we increase $n$, while the height and width of the peak remain constant. To confirm this mathematically, we can solve $\frac{\mathrm{d}}{\mathrm{d}t}g_n(t) = 0$, and we find that the peak occurs at $t = \log_2(n)$. Since our plan is to sum $g_n(t)$ over all $t$ anyway, we might as well shift it so that it is centered at its peak. If the offset from the peak is $\tau$, then the shifted function is:
\begin{align}
h_n(\tau) &= g_n(\log_2(n) + \tau) \\
&= 2^{-\tau-1} \left(1-\frac{2^{-\tau}}{n}\right)^{n-1}
\end{align}We can evaluate the limit as $n\to\infty$, and we see that:
\[
h(\tau) = \lim_{n\to\infty}h_n(\tau) = 2^{-\tau-1} e^{-2^{-\tau}}
\]It follows that for large $n$, writing $x$ for the fractional part of $\log_2(n)$, the probability $p_n$ approaches
\[
p(x) = \sum_{\tau=-\infty}^\infty 2^{-\tau-1+x} e^{-2^{-\tau+x}}\quad\text{for }x\in[0,1]
\]Of course, due to the translation invariance of the sum, this must be a periodic function. Here is what it looks like:

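The bi-infinite sum is easy to evaluate numerically because the terms die off in both directions; here is a sketch using mpmath, with a generous (heuristic, not rigorously bounded) truncation:

```python
from mpmath import mp, mpf, exp

mp.dps = 25  # work with 25 significant digits

def p_limit(x, width=100):
    """p(x) = sum over integer tau of 2^(x - tau - 1) * exp(-2^(x - tau))."""
    total = mpf(0)
    for tau in range(-width, width + 1):
        u = mpf(2) ** (mpf(x) - tau)  # u = 2^(x - tau)
        total += u / 2 * exp(-u)
    return total

print(p_limit(0.25))  # hovers near 0.7213475... for any x
```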
It looks very close to a sinusoid. However, it is not. If it were a true sinusoid, then $p(x)+p(x+\tfrac{1}{2})$ would be constant for all $x$. We can verify that this is not the case by doing an accurate numerical evaluation of the sums for two different values of $x$. We find:
\begin{align}
\tfrac{1}{2}(p(0.3)+p(0.8)) &\approx 0.72134752045108504298 \\
\tfrac{1}{2}(p(0.5)+p(1.0)) &\approx 0.72134752043912493759
\end{align}So they differ in the $11^\text{th}$ decimal digit.
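This check is easy to reproduce with the p_limit helper sketched above:

```python
from mpmath import mp, mpf  # p_limit as defined earlier

mp.dps = 25
print((p_limit(mpf("0.3")) + p_limit(mpf("0.8"))) / 2)  # 0.72134752045108...
print((p_limit(mpf("0.5")) + p_limit(mpf("1.0"))) / 2)  # 0.72134752043912...
```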
Given that the function $p(x)$ is not a true sinusoid, we have to re-examine what we mean by “limiting value”. One way to define this would be the average value over one period. Thankfully, this integral can be evaluated exactly, and we obtain:
\begin{align}
\int_0^1 p(x)\,\mathrm{d}x &= \int_0^1 \sum_{\tau=-\infty}^\infty 2^{-\tau+x-1} e^{-2^{-\tau+x}}\,\mathrm{d}x \\
&= \sum_{\tau=-\infty}^\infty \int_0^1 2^{-\tau+x-1} e^{-2^{-\tau+x}}\,\mathrm{d}x \\
&= \sum_{\tau=-\infty}^\infty \frac{e^{-2^{-\tau}}-e^{-2^{-(\tau-1)}}}{2\log (2)} \\
&= \lim_{M,N\to\infty} \sum_{\tau=-N}^M \frac{e^{-2^{-\tau}}-e^{-2^{-(\tau-1)}}}{2\log (2)} \\
&= \frac{1}{2\log(2)} \left[ \lim_{M\to\infty} e^{-2^{-M}}-\lim_{N\to\infty}e^{-2^{N+1}} \right]\\
&= \frac{1}{2\log(2)}
\end{align}The bi-infinite sum is actually a telescoping series, which is pretty nice!
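As a numerical sanity check (again leaning on the p_limit helper sketched earlier):

```python
from mpmath import mp, quad, log  # p_limit as defined earlier

mp.dps = 25
print(quad(p_limit, [0, 1]))  # numerical average over one period
print(1 / (2 * log(2)))       # 0.72134752044448... -- they agree
```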
In conclusion, we showed that the limiting function is periodic (peak-to-peak is about $1.4\times 10^{-5}$), it is not a true sinusoid (but very close), and its mean value is exactly $\frac{1}{2\log(2)}$.
Rabbits all the way down
Our previous expression for the limiting function was:
\[
p(x) = \sum_{\tau=-\infty}^\infty 2^{-\tau+x-1} e^{-2^{-\tau+x}}\quad\text{for }x\in[0,1]
\]This is a periodic function, so why not find its Fourier series? To refresh your memory, since $p(x)$ is periodic with period $1$, we can write
\[
p(x) = \sum_{n=-\infty}^\infty c_n e^{2\pi i n x},\quad\text{where:}\quad
c_n = \int_0^1 p(x) e^{-2\pi i n x}\,\mathrm{d}x
\]Amazingly, we can evaluate $c_n$ in (somewhat) closed form:
\begin{align}
c_n &= \int_0^1 p(x) e^{-2\pi i n x}\,\mathrm{d}x \\
&= \int_0^1 \sum_{\tau=-\infty}^\infty 2^{-\tau+x-1} e^{-2^{-\tau+x}} e^{-2\pi i n x}\,\mathrm{d}x \\
&= \sum_{\tau=-\infty}^\infty \int_0^1 2^{-\tau+x-1} e^{-2^{-\tau+x}} e^{-2\pi i n x}\,\mathrm{d}x
\end{align}Substitute $u = 2^{-\tau+x}$, so $\mathrm{d}u = u\log(2)\,\mathrm{d}x$ and $x=\tau+\log_2(u)$; since $n$ and $\tau$ are integers, $e^{-2\pi i n(\tau+\log_2(u))} = e^{-2\pi i n\log_2(u)} = u^{-2\pi i n/\log(2)}$:
\begin{align}
c_n &= \sum_{\tau=-\infty}^\infty \frac{1}{2\log(2)}\int_{2^{-\tau}}^{2^{-\tau+1}} e^{-u}e^{-2\pi in(\tau+\log_2(u))}\,\mathrm{d}u \\
&= \frac{1}{2\log(2)} \sum_{\tau=-\infty}^\infty \int_{2^{-\tau}}^{2^{-\tau+1}} e^{-u}u^{-2\pi in/\log(2)}\,\mathrm{d}u \\
&= \frac{1}{2\log(2)} \int_{0}^{\infty} e^{-u}u^{-2\pi i n/\log(2)}\,\mathrm{d}u \\
&= \frac{\Gamma\bigl(1-\tfrac{2\pi i n}{\log(2)}\bigr)}{2\log(2)},
\end{align}where $\Gamma(z)$ is the gamma function. Therefore, the Fourier series of $p(x)$ is
\[
p(x) = \sum_{n=-\infty}^\infty \frac{\Gamma\bigl(1-\tfrac{2\pi i n}{\log(2)}\bigr)}{2\log(2)} e^{2\pi i n x}
\]Define $\alpha_n := \Gamma\bigl(1-\tfrac{2\pi i n}{\log(2)}\bigr) = |\alpha_n|e^{i \arg(\alpha_n)}$. Since $p(x)$ is real, we have $c_{-n} = \overline{c_n}$, so we can pair the $\pm n$ terms and write the series in a more familiar way using standard trig functions:
\[
p(x) = \frac{1}{2\log(2)} + \frac{1}{\log(2)}\sum_{n=1}^\infty |\alpha_n| \cos\bigl( 2\pi n x+\arg(\alpha_n) \bigr)
There is actually a closed-form expression for the magnitude of the gamma function along the line $\operatorname{Re}(z)=1$. Specifically, we have $|\Gamma(1+ib)|^2 = \frac{\pi b}{\sinh(\pi b)}$ for all $b\in \mathbb{R}$. Therefore, we can write the Fourier series for $p(x)$ as:
\[
p(x) = \frac{1}{2\log(2)} + \sum_{n=1}^\infty \sqrt{ \frac{2 \pi^2 n \,\text{csch}\bigl(\tfrac{2 \pi ^2 n}{\log (2)}\bigr)}{\log^{3}(2)}
} \cos\bigl( 2\pi n x + \arg(\alpha_n) \bigr)
\]where $\text{csch}(z) := \frac{2}{e^z-e^{-z}}$ is the hyperbolic cosecant. The harmonic amplitudes $a_n := \frac{|\alpha_n|}{\log(2)}$ decrease very rapidly with increasing $n$. In fact, we have:
\begin{align}
a_n &= \sqrt{ \frac{2 \pi^2 n \,\text{csch}\bigl(\tfrac{2 \pi ^2 n}{\log (2)}\bigr)}{\log^{3}(2)}} \\
a_1 &\approx 7.1301\times 10^{-6} \\
a_2 &\approx 6.60339\times 10^{-12} \\
a_3 &\approx 5.29624\times 10^{-18}
\end{align}As we increase $n$, we have $a_n \sim \exp\bigl(-\tfrac{n \pi^2}{\log(2)}\bigr) \approx 10^{-6.184 n}$. This is why the amplitude of the oscillation in $p(x)$ is so small, and why it looks so close to being a sinusoid when we plot it.
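The amplitudes and phases are easy to compute directly from mpmath’s complex gamma function; a short sketch:

```python
from mpmath import mp, gamma, pi, log, fabs, arg

mp.dps = 25
for n in (1, 2, 3):
    alpha = gamma(1 - 2j * pi * n / log(2))   # alpha_n
    print(n, fabs(alpha) / log(2), arg(alpha))  # amplitude a_n, phase arg(alpha_n)
```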
Using this information, we can return to the original problem and write out the limiting behavior in terms of the original $n$ (the number of coins):
\[
p_n \to \frac{1}{2\log(2)} + \sum_{k=1}^\infty \sqrt{ \frac{2 \pi^2 k \,\text{csch}\bigl(\tfrac{2 \pi ^2 k}{\log (2)}\bigr)}{\log^{3}(2)}
} \cos\Bigl( 2\pi k \log_2(n) + \arg \Gamma\bigl(1-\tfrac{2\pi i k}{\log(2)}\bigr) \Bigr)
\]
The first few terms of the series are:
\begin{multline}
p_n \approx 0.72134752044448170368 \\
+ \left(7.130117\times 10^{-6}\right) \cos\bigl( 2\pi \log_2(n) +0.872711\bigr) \\
+ \left(6.603386\times 10^{-12}\right) \cos\bigl( 4\pi \log_2(n) +2.51702\bigr) + \dots
\end{multline}
Finally, here is the numerical data overlaid with the first harmonic approximation (only keeping the first term of the sum). Looks pretty good!

Equipped with this series, we can now evaluate the probability of finding a “luckiest coin” when there are $n=10^6$ coins. Here is a very accurate approximation (all digits shown are significant):
\[
p_{1,000,000} \approx 0.7213539630726652117823954768
\]