This week’s Riddler classic is a tough one! Here is a paraphrased version of the problem.
Imagine $n^2$ bowling pins arranged in a rhombus. The image to the right illustrates the case $n=3$. We knock down the topmost pin. When any pin gets knocked down, each of the (up to two) pins directly behind it has a probability $p$ of being knocked over (independently of each other). We are interested in the probability that the bottommost pin gets knocked down in the limit of large $n$.
The original problem specifically asked about $p=0.5$ and $p=0.7$.
My solution:
[Show Solution]
Let $F(n,p)$ be the probability that the bottommost pin gets knocked over. The problem asks about $\lim_{n\to\infty} F(n,0.5)$ and $\lim_{n\to\infty} F(n,0.7)$.
Percolation theory
This problem is closely connected to a topic in statistical physics called percolation theory. Statistical physics is concerned with how microscopic properties (e.g., how atoms interact with their neighbors) can lead to macroscopic properties, such as temperature, phase (liquid, solid, gas), and more. We can reframe the problem in statistical physics terms by thinking of the bowling pins as atoms in a porous material. At the microscopic scale, water can squeeze through the space between molecules with probability $p$. Beneath each space, there are two spaces in the next layer. Then, we can ask whether water can percolate through the material or not. In other words, we are asking the macroscopic question: Is the material waterproof?
To get a feel for this, let’s run some simulations. We’ll start with $p=0.5$ and $n=20$. Here, the topmost pin is in the top left, and the bottommost pin is in the bottom right. We can see that the pins never really get rolling because the probability $p$ is too small. In this case, there is no percolation.
Now let’s increase the probability to $p=0.7$. For these plots, I zoomed out and used $n=100$. We can see that in most of the cases, the floodgates open and a large number of pins get knocked down (percolation!). But it doesn’t happen every time. Some cases still run into bad luck and get stopped before the chain reaction can develop.
In percolation theory, the scenario we’re studying is called “directed bond percolation on a 2D square lattice”. It is directed because the pins can only knock other pins over in one direction (downwards), and it is bond percolation because each link (bond) in the lattice is active with probability $p$, as opposed to site percolation where all links are active, but you make a particular pin active with probability $p$.
A fundamental result in percolation theory is that there is a critical probability $p_c$ at which percolation begins to happen. In other words, the probability of percolation $\theta(p)$ satisfies:
\[
\theta(p) \begin{cases}
=0 & \text{for }0 \leq p \lt p_c \\
\gt 0 & \text{for }p_c \lt p \leq 1
\end{cases}
\]The exact value of $p_c$ is only known for certain special cases. For example, it was proved in the following seminal paper that if you have an infinite square lattice and allow pins to knock each other down in all 4 directions, the critical probability is exactly $1/2$:
on the Square Lattice Equals 1/2”. Commun. Math. Phys. 74, 41-59 (1980). A pdf of this manuscript is freely available here.
The case of directed bond percolation on a square lattice (which is our case of interest) is still an open problem. We know some bounds, however. For example, it was derived in various papers that
\[
0.5176 \leq p_c \leq 0.6298
\]Why is this relevant? Well the problem asks about the cases $p=0.5$ and $p=0.7$, which lie conveniently outside these bounds! Therefore,
- When $p=0.5$, the bound above tells us that $p < p_c$. Therefore, $\theta(0.5)=0$ and we have no percolation. So, the probability of knocking over the bottommost pin when $n$ is large is zero.
- When $p=0.7$, the bound above tells us that $p > p_c$. Therefore, $\theta(0.7) > 0$ and we have percolation. So, there is a nonzero probability that we will knock over the bottommost pin when $n$ is large.
An analytic lower bound
Given that finding the exact value of $p_c$ is still an open problem, it seems hopeless to look for an analytic solution to the original question involving the bottommost pin. However, we can use an analytic approach to get a bound that allows us to confirm the result for $p=0.5$.
Let’s count the total number of paths that connect the topmost pin to the bottommost pin. Each path consists of $2(n-1)$ steps, and exactly half of the steps are “left” and the other half are “right”. They can be in any order, so the total number of distinct paths is $\binom{2n-2}{n-1}$. Each path also has a probability $p^{2n-2}$ of occurring, since each link has a probability $p$ of being active. Therefore,
\begin{align}
&\mathrm{Prob}(\text{at least one active path})\\
&= 1-\mathrm{Prob}(\text{no active path}) \\
&= 1-\mathrm{Prob}\left(\text{path }k\text{ is inactive for }k=1,\dots,\textstyle\binom{2n-2}{n-1}\right) \\
&\leq 1-\prod_{k=1}^{\binom{2n-2}{n-1}} \mathrm{Prob}(\text{path }k\text{ is inactive}) \\
&=1-\left(1-p^{2n-2}\right)^{\binom{2n-2}{n-1}}
\end{align}Here, the inequality follows from the fact that the paths are not mutually independent (they may have edges in common), and the lower bound comes from assuming independent paths. As an example, suppose there are two paths, and $A$ and $B$ are the events that the paths are inactive, respectively. Then, the probability that $A$ is inactive increases if we know that $B$ is inactive (because the paths may have edges in common). This means that:
\[
P(A\cap B) = P(A\mid B)P(B) \geq P(A)P(B)
\]Note that we have equality if $A$ and $B$ are independent (the paths share no edge in common). Consequently, we now have an analytic upper bound on the probability we care about:
$\displaystyle
F(n,p) \leq 1-\left(1-p^{2n-2}\right)^{\binom{2n-2}{n-1}}
$
Let $L$ be the limit as $n\to\infty$. We can evaluate it as follows:
\begin{align}
L &= 1-\lim_{n\to\infty} \left(1-p^{2n-2}\right)^{\binom{2n-2}{n-1}} \\
&= 1-\lim_{m\to\infty} \left(1-p^{2m}\right)^{\binom{2m}{m}} \\
&= 1-\exp\,\lim_{m\to\infty} \binom{2m}{m} \log \left(1-p^{2m}\right) \\
&= 1-\exp\,\lim_{m\to\infty} \frac{2^{2m}}{\sqrt{m\pi}} \log \left(1-p^{2m}\right) \\
&= 1-\exp\,\lim_{m\to\infty} \frac{4^{m+1} \sqrt{m}\, p^{2 m} \log (p)}{\sqrt{\pi } \left(1-p^{2 m}\right) (m \log (16)-1)} \\
&= 1-\exp\,\lim_{m\to\infty} \frac{4 \cdot (2p)^{2m} \log (p)}{ \log (16)\sqrt{m\pi} } \\
&= 1-\exp\begin{cases}
0 & \text{for }0 \leq p \leq \frac{1}{2}\\
-\infty & \text{for }\frac{1}{2} < p \leq 1
\end{cases}\\
&= \begin{cases}
0 & \text{for }0 \leq p \leq \frac{1}{2}\\
1 & \text{for }\frac{1}{2} < p \leq 1
\end{cases}
\end{align}
In the fourth step, we used an asymptotic approximation for the binomial coefficient. In the fifth step, we used L’Hôpital’s rule. Putting everything together, the upper bound on our probability of interest in the limit of large $n$ allows us to conclude that
$\displaystyle
\lim_{n\to\infty} F(n,p) = 0
\quad\text{for } 0 \leq p \leq \frac{1}{2}.
$
So we can answer part of the problem analytically: when $p\leq 0.5$, we can never knock over the bottommost pin in the limit of very large $n$.
Simulation results
To precisely pinpoint the probability of interest in the case $p>0.5$ requires more work, and here we turn to simulation. The first simulation I produced is intended to illustrate the behavior of $F(n,p)$ as $n$ grows.
Here, we can see that for $p \leq 0.5$, the probability of knocking over the bottommost pin clearly tends to zero, as we predicted. Moreover, for $p \geq 0.7$, the probability appears to tend to a constant value, again as predicted. The intermediate cases ($p=0.55, 0.60, 0.65$) are ambiguous, and it isn’t clear whether the limit is zero or if it tends to a positive constant. Note: The slight wiggles in the plot lines are due to the inherent randomness in the simulations. Each point is an average of $50{,}000$ simulations.
The second simulation I ran looked at the limiting probability only, by computing $F(n,p)$ for a large value of $n$. Here, I used $n=300$ and $10{,}000$ simulations per point. Here, we can clearly see the phase transition, which occurs near $p=0.6$. It is difficult to estimate exactly where the transition occurs because this is similar to estimating $p_c$. The closer we get to the critical probability, the longer our typical paths become. So it is not clear which paths are percolating because we’re not using a sufficiently large $n$ to see the true behavior.
Based on the plot, we can see that $\lim_{n\to\infty} F(n,0.7) \approx 0.58$. In order to get more accuracy, I computed how many simulations I would need to run in order to get more accuracy. Using the confidence interval for a Bernoulli process, we can show that an error of $\pm 0.001$ with $95\%$ confidence requires about $1{,}000{,}000$ simulations. This led me to the following slightly more accurate result.
$\displaystyle
\lim_{n\to\infty} F(n,0.7) \approx F(300,0.7) \approx 0.5782 \pm 0.001.
$
This is very nice.
I also did some Monte-Carlo study of the p=0.7 case for N=2,3,…,99.
Strangely, I observe that, as N increases, the probability to succeed falls, until about N=24, and then slightly increases. The minimum is about 0.5724 and then it rises back up to about 0.5786. This surprised me and I can’t explain it.
Yes, I noticed the same thing; couldn’t explain it either. Emily Boyajian also pointed this out in her solution. She also investigated some scenarios I didn’t consider, such as plotting the distribution of the number of pins knocked down. This might provide a clue as to what’s going on.
This reminds me of a university project where I was simulating the percolation threshold for site percolation on a square lattice (forest fire simulation). I remember that it was possible to estimate the threshold from a power law relationship as a function of n. Do you know if it’s possible to do the same for this type of lattice?