This week’s Riddler Classic is about Lotería, also known as Mexican bingo!

A thousand people are playing Lotería, also known as Mexican bingo. The game consists of a deck of 54 cards, each with a unique picture. Each player has a board with 16 of the 54 pictures, arranged in a 4-by-4 grid. The boards are randomly generated, such that each board has 16 distinct pictures that are equally likely to be any of the 54.

During the game, one card from the deck is drawn at a time, and anyone whose board includes that card’s picture marks it on their board. A player wins by marking four pictures that form one of four patterns, as exemplified below: any entire row, any entire column, the four corners of the grid and any 2-by-2 square.

After the fourth card has been drawn, there are no winners. What is the probability that there will be exactly one winner when the fifth card is drawn?

My solution:

[Show Solution]

Suppose $m$ players are playing the game ($m=1000$) and there are $n$ distinct cards $(n=54)$. Let’s consider the game from the point of view of one player. Define the following events:

\begin{align}

A &: \left\{\text{No winning pattern in the first four cards}\right\} \\

B &: \left\{\text{No winning pattern in the first five cards}\right\}

\end{align}If we count all the possible winning patterns, we see that there are 18. Meanwhile, there are $\binom{n}{4}$ ways of picking four cards. The probability of no winning pattern is 1 minus the probability of a winning pattern:

\[

\mathbf{P}(A) = 1-\frac{18}{\binom{n}{4}}

= \frac{35137}{35139} \approx 0.999943

\]The approach is similar when there are five cards. Each of the 18 winning patterns uses 4 cards. The fifth card can be any of the remaining $n-4$ cards, so we have:

\[

\mathbf{P}(B) = 1-\frac{18(n-4)}{\binom{n}{5}}

= \frac{35129}{35139} \approx 0.999715

\]

Now let’s ask the question: what is the probability that there is a winning pattern using all five cards *given that there is no pattern in the first four cards*. We can calculate this using Bayes’ rule and the law of total probability:

\begin{align}

\mathbf{P}(\bar B \mid A)

&= 1-\mathbf{P}(B\mid A) \\

&= 1-\frac{\mathbf{P}(A\cap B)}{\mathbf{P}(A)}\\

&= 1-\frac{\mathbf{P}(B)}{\mathbf{P}(A)}\\

&= \frac{8}{35137} \approx 0.00022768

\end{align}where in the third step, we used the fact that $B \subseteq A$ and therefore $A\cap B = B$ (if there is no winning pattern in the five cards, then there cannot be a winning pattern in the first four).

The problem asks us to find the probability that if $m$ players play the game and nobody wins on the fourth card, exactly one person wins on the fifth card. Let’s call this probability $q$. Since each player is playing independently, there are $m$ different people that can win, with probability $\mathbf{P}(\bar B\mid A)$, and the other $m-1$ players must lose, each with probability $\mathbf{P}(B \mid A)$. So the probability that exactly one person wins on the fifth card is:

\[

q\,=\, m \cdot \mathbf{P}(\bar B\mid A) \cdot \mathbf{P}(B\mid A)^{m-1}

= m \cdot \left( 1-\frac{\mathbf{P}(B)}{\mathbf{P}(A)} \right)\cdot \left( \frac{\mathbf{P}(B)}{\mathbf{P}(A)} \right)^{m-1}

\]Substituting the numerical values found earlier for $\mathbf{P}(A)$ and $\mathbf{P}(B)$ together with $m=1000$, we obtain:

\begin{align}

q &= 1000\left(\frac{8}{35137}\right) \left( \frac{35129}{35137} \right)^{999} \\

&= \frac{8000\cdot 35129^{999}}{35137^{1000}} \approx 0.181356

\end{align}So the probability that exactly one person out of 1000 wins on the fifth card given that nobody won on the first four cards is about 18.136%.

### The general case

If we leave the expression for $q$ in terms of $n$ (number of distinct cards) and $m$ (number of players), we obtain:

\[

q = 1728m\cdot \frac{\left(n^4-6 n^3+11 n^2-6 n-2160\right)^{m-1}}{\left(n^4-6 n^3+11 n^2-6 n-432\right)^{m}}

\]If we fix $m=1000$ and vary $n$, we obtain the following graph:

When $n$ is small, it is very likely that many people will win. So the probability that *only one person wins* on the 5th card is low. But when $n$ is very large, it is very unlikely that anybody will ever win, so the probability that one person wins on the 5th card is low again. The peak occurs at $n=38$ (with probability 36.8%).

If instead we fix $n=54$ and vary $m$, we obtain the following graph:

The shape of the curve is similar. When $m$ is small, there are not many people, so it is not likely that anybody will win on the fifth card. But when $m$ is large, it is likely that many people will win, but unlikely that *only one person* will win. The peak occurs at $m=4392$ (with probability 36.8% again).

If we vary $m$ and $n$ together, we see that for each $n$, there is an ideal number of players $m$ that would maximize the probability (red dashed line in the plot below).

The red dashed curve has the equation $m = \frac{n^4-6 n^3+11 n^2-6 n-432}{1728}$.

### Numerical validation

To test whether the above formulas are correct, I ran a Monte Carlo simulation to estimate $\mathbf{P}(\bar B\mid A)$, which is the probability that a single player wins on the fifth card given that they did not win on the fourth card. I simulated a large number of games and only kept those that didn’t win after four cards. Then I computed the empirical fraction of the remaining cases that won on the fifth card. Here is my Python code.

# winning sets of indices win_idx = [ [0,1,2,3],[4,5,6,7],[8,9,10,11],[12,13,14,15], # four in a row [0,4,8,12],[1,5,9,13],[2,6,10,14],[3,7,11,15], # four in a column [0,1,4,5],[1,2,5,6],[2,3,6,7], # 2x2 in first row [4,5,8,9],[5,6,9,10],[6,7,10,11], # 2x2 in second row [8,9,12,13],[9,10,13,14],[10,11,14,15], # 2x2 in third row [0,3,12,15] # four corners ] # returns True if a bingo card is a winner def iswinning(matches): for idx in win_idx: if all([matches[i] for i in idx]): return True return False # compute the probability that 5th card wins # given that the 4th card did not win N = 10**7 count,total = 0,0 n = 54 # number of cards S = list(range(n)) # full set of numbers C = random.sample(S,16) # 16 chosen numbers for iter in range(N): F = random.sample(S,5) # five drawn numbers # bingo score card after 4 and 5 turns matches4 = [c in F[:-1] for c in C] matches5 = [c in F for c in C] if not iswinning(matches4): total += 1 if iswinning(matches5): count += 1 # empirical probability and 95% confidence interval phat = count/total conf = 1.96*np.sqrt(phat*(1-phat)/total) print(f'Analytical: {1728 / (-432 - 6*n + 11*n**2 - 6*n**3 + n**4):.5g}') print(f'Monte Carlo: {count}/{total} = {phat:.4g}, or [{phat-conf:.4g},{phat+conf:.4g}] (95% CI)')

The result of running this code is:

Analytical: 0.00022768 Monte Carlo: 2292/9999478 = 0.0002292, or [0.0002198,0.0002386] (95% CI)

Unfortunately, since we are trying to estimate such a small probability, a large number of samples are required to reach a narrow confidence interval. I used 10 million samples in the simulation above, which took about 4 minutes on my laptop.