Elf music

This holiday-themed Riddler problem is about probability:

In Santa’s workshop, elves make toys during a shift each day. On the overhead radio, Christmas music plays, with a program randomly selecting songs from a large playlist.

During any given shift, the elves hear 100 songs. A cranky elf named Cranky has taken to throwing snowballs at everyone if he hears the same song twice. This has happened during about half of the shifts. One day, a mathematically inclined elf named Mathy tires of Cranky’s sodden outbursts. So Mathy decides to use what he knows to figure out how large Santa’s playlist actually is.

Help Mathy out: How large is Santa’s playlist?

Here is my solution:
[Show Solution]

Hand sort

A card-rearranging problem on the Riddler blog. Here it goes:

You play so many card games that you’ve developed a very specific organizational obsession. When you’re dealt your hand, you want to organize it such that the cards of a given suit are grouped together and, if possible, such that no suited groups of the same color are adjacent. (Numbers don’t matter to you.) Moreover, when you receive your randomly ordered hand, you want to achieve this organization with a single motion, moving only one adjacent block of cards to some other position in your hand, maintaining the original order of that block and other cards, except for that one move.

Suppose you’re playing pitch, in which a hand has six cards. What are the odds that you can accomplish your obsessive goal? What about for another game, where a hand has N cards, somewhere between 1 and 13?

Here is my solution:
[Show Solution]

Sniff out the spies

This interesting problem appeared on the Riddler blog. Here it goes:

There are N agents and K of them are spies. Your job is to identify all the spies. You can send a given number of agents to a “retreat” on a remote island. If all K spies are present at the retreat, they will meet to strategize. If even one spy is missing, this spy meeting will not take place. The only information you get from a retreat is whether or not the spy meeting happened. You can send as many agents as you like to the retreat, and the retreat can happen as many times as needed. You know the values of N and K.

What’s the minimum number of retreats needed to guarantee you can identify all K spies? If each retreat costs \$1,000 per person, what is the total cost to identify all K spies?

To begin with, let’s assume you know that N = 1,024 and K = 17.

Here is my solution for $K=1$:
[Show Solution]

And here is a partial solution for $K \gt 1$:
[Show Solution]

Nightmare Solitaire

This Riddler problem is a probability question about an old favorite: Solitaire!

While killing some time at your desk one afternoon, you fire up a new game of Solitaire (also called Klondike, and specifically the version where you deal out three cards from the deck at a time). But your boredom quickly turns to rage because your game is unplayable — you can flip through your deck, but you never have any legal moves! What are the odds?

Here is my solution:
[Show Solution]

Where will the seven dwarfs sleep tonight?

The following problem appeared in The Riddler. It’s an interesting recursive problem.

Each of the seven dwarfs sleeps in his own bed in a shared dormitory. Every night, they retire to bed one at a time, always in the same sequential order, with the youngest dwarf retiring first and the oldest retiring last. On a particular evening, the youngest dwarf is in a jolly mood. He decides not to go to his own bed but rather to choose one at random from among the other six beds. As each of the other dwarfs retires, he chooses his own bed if it is not occupied, and otherwise chooses another unoccupied bed at random.

  1. What is the probability that the oldest dwarf sleeps in his own bed?
  2. What is the expected number of dwarfs who do not sleep in their own beds?

Here is my solution.
[Show Solution]

Cracking the safe

The following problem appeared in The Riddler and it’s about finding the right code sequence to crack open a safe.

A safe has three locks, each of which is unlocked by a card, like a hotel room door. Each lock (call them 1, 2 and 3) and can be opened using one of three key cards (A, B or C). To open the safe, each of the cards must be inserted into a lock slot and then someone must press a button labeled “Attempt To Open.”

The locks function independently. If the correct key card is inserted into a lock when the button is pressed, that lock will change state — going from locked to unlocked or unlocked to locked. If an incorrect key card is inserted in a lock when the attempt button is pressed, nothing happens — that lock will either remain locked or remain unlocked. The safe will open when all three locks are unlocked. Other than the safe opening, there is no way to know whether one, two or all three of the locks are locked.

Your job as master safecracker is to open the locked safe as efficiently as possible. What is the minimum number of button-press attempts that will guarantee that the safe opens, and what sequence of attempts should you use?

Here is my solution.
[Show Solution]

The lucky derby

In the spirit of the Kentucky Derby, this Riddler puzzle is about a peculiar type of horse race.

The bugle sounds, and 20 horses make their way to the starting gate for the first annual Lucky Derby. These horses, all trained at the mysterious Riddler Stables, are special. Each second, every Riddler-trained horse takes one step. Each step is exactly one meter long. But what these horses exhibit in precision, they lack in sense of direction. Most of the time, their steps are forward (toward the finish line) but the rest of the time they are backward (away from the finish line). As an avid fan of the Lucky Derby, you’ve done exhaustive research on these 20 competitors. You know that Horse One goes forward 52 percent of the time, Horse Two 54 percent of the time, Horse Three 56 percent, and so on, up to the favorite filly, Horse Twenty, who steps forward 90 percent of the time. The horses’ steps are taken independently of one another, and the finish line is 200 meters from the starting gate.

Handicap this race and place your bets! In other words, what are the odds (a percentage is fine) that each horse wins?

Here is my full derivation (long!):
[Show Solution]

For the tl;dr, the answer is:
[Show Solution]

Colorful balls puzzle

This Riddler puzzle about an interesting game involving picking colored balls out of a box. How long will the game last?

You play a game with four balls: One ball is red, one is blue, one is green and one is yellow. They are placed in a box. You draw a ball out of the box at random and note its color. Without replacing the first ball, you draw a second ball and then paint it to match the color of the first. Replace both balls, and repeat the process. The game ends when all four balls have become the same color. What is the expected number of turns to finish the game?

Extra credit: What if there are more balls and more colors?

Here is my solution to the first part (four balls):
[Show Solution]

Here is my solution to the general case with $N$ balls:
[Show Solution]

Pick a card!

This Riddler puzzle is about a card game where the goal is to find the largest card.

From a shuffled deck of 100 cards that are numbered 1 to 100, you are dealt 10 cards face down. You turn the cards over one by one. After each card, you must decide whether to end the game. If you end the game on the highest card in the hand you were dealt, you win; otherwise, you lose.

What is the strategy that optimizes your chances of winning? How does the strategy change as the sizes of the deck and the hand are changed?

Here is my solution:
[Show Solution]

Lucky spins in roulette

Roulette is a game where the dealer spins a numbered wheel containing a marble. Once the wheel settles, the marble lands on exactly one (winning) number. A standard American roulette wheel has 38 numbers on it. Players may place bets on a particular number, a color (red or black), or a particular quadrant of the wheel.

I was recently in Las Vegas for a conference, and I noticed that the roulette tables in the conference hotel were equipped with screens that display real-time statistics! A computer keeps track of the winning numbers for the past 300 spins and shows how frequently each number was a winner (see photo on the right). The two “elite numbers” were 12 and 27 (each occurred 13 times in the past 300 games) and the “poor performer” was 0, which only occurred 2 times.

Even if the roulette wheel is equally likely to land on each number, there will be natural fluctuations. Some numbers will obviously win more frequently than others in any finite number of spins. But what sorts of deviations can we expect? In the picture above, the spread between most frequent winner and least frequent winner was 13 to 2 over the course of 300 games. Is this normal? In other words, how big would the spread have to be before you suspected that the roulette wheel was rigged and was favoring some numbers over others? Let’s find out!

Results via simulation

Let’s suppose the wheel has $n$ different numbers on it, which we label $\{1,2,\dots,n\}$, and we spin it $m$ times when we aggregate our statistics. Note that in the actual roulette game, $n=38$ and $m=300$. If the game is fair, then each number is equally likely to occur and the spins are mutually independent. In other words, rolling a 4 on your first spin doesn’t make you more or less likely to roll a 4 or any other number on subsequent spins.

This sort of game is straightforward to simulate in Matlab. I wrote code that plays 300 games, counts how many times the most and least frequent numbers occur, and then repeats the entire process a large number of times. Then, I plotted a histogram showing what proportion of the time the most frequent number occurred 10 times, 11 times, etc. This is known as the empirical distribution. Here are the results of the simulation (aggregated over 100 million trials).

As we can see, a spread from 2 to 13 is right around what we expect! Of course, outliers are still possible. For example, it’s possible for the luckiest number to win 30 times out of 300 spins, but this only happened twice out of 100,000,000 simulated runs.

Analytical results

The roulette wheel lands on one of the $n$ numbers with equal probability. We then perform $m$ spins and count how many times each number occurs. If we place the counts in the vector $(x_1,\dots,x_n)$, where $x_1+\dots+x_n=m$, then this vector follows a multinomial distribution. The probability mass function is therefore:
\[
p(x_1,\dots,x_n) = {m \choose {x_1,\dots,x_m}} \frac{1}{n^m}
\]The bracketed term is called a multinomial coefficient and it counts the number of ways we can spin a “$1$” $x_1$ times, a “$2$” $x_2$ times, and so on. This also corresponds to the number of ways of depositing $m$ distinct objects into $n$ distinct bins, with $x_1$ objects in the first bin, $x_2$ objects in the second bin, and so on. It’s also equal to $\frac{m!}{x_1! \cdot x_2!\cdots x_n!}$. Meanwhile, $n^m$ is the number of different ways we can spin the roulette wheel $m$ times.

The probability that the most frequent spin occurs at most $k$ times is:
\[
F_\text{max}(k) = \sum_{\substack{ x_1+\dots+x_n=m\\x_i\,\le\,k}} p(x_1,\dots,x_n)
\]In other words, we must sum over all tuples $(x_1,\dots,x_n)$ of nonnegative integers that sum to $m$, that are each no greater than $k$. This is the cumulative distribution function, and in order to obtain the probability that the most frequent spin occurs exactly $k$ times (the probability mass function), we compute the difference:
\[
q_\text{max}(k) = F_\text{max}(k)-F_\text{max}(k-1)
\]A similar expression can be obtained for the probability that the least frequent spin occurs exactly $k$ times. These sums are impractical to compute because they have an astronomical number of terms. We must resort to approximation!

In our multinomial distribution, each $x_i$ has mean $\frac{m}{n}$ and variance $\frac{m}{n}\left(1-\frac{1}{n}\right)$. We will make the approximation that the $x_i$ are independent binomial random variables. Of course the $x_i$ are not truly independent since they must sum to $m$, but $n$ is large enough that this is a good approximation. We can therefore compute the distribution of the maximum by using the cumulative distribution:
\begin{align}
F_\text{max}(k) &= \mathbb{P}(x_1\le k,\dots,x_n\le k) \\
&\approx \prod_{i=1}^n \mathbb{P}(x_i \le k) \\
&= \left( \sum_{j=0}^k { m \choose j } \left( \frac{1}{n} \right)^j \left( 1-\frac{1}{n}\right)^{m-j} \right)^n
\end{align}This sum is far more manageable and we can compute it exactly without any difficulty. Once we have this cumulative distribution, we compute our approximation to $q_\text{max}(k)$ as before, by taking the difference of successive cumulative sums. The result is shown in the plot below.

As we can see, this plot is very close to the one we obtained via simulation. So the binomial approximation is quite good! It’s tempting to approximate the binomial distribution even further as a normal distribution, but it turns out this approximation isn’t very good. A rule of thumb to see whether we should use a normal distribution is to check whether 3 standard deviations of the mean keeps us within the range of admissible values, i.e. $\{0,1,\dots,m\}$. This amounts to checking, in our case, that $m > 9(n-1)$. Since $n=38$ and $m=300$, the inequality doesn’t hold because the right-hand side equals $333$.

Conclusion

Seeing one number occur 13 times while another only occurs 2 times in the past 300 spins isn’t abnormal at all. In fact, it’s very much what you should expect! So don’t think of the real-time statistics as a means to help you predict “lucky” or “unlucky” numbers… think of it as a way to double-check that the roulette wheel is unbiased!