This week’s Riddler Classic is a paradoxical question about cutting a ruler into smaller pieces.
Recently, there was an issue with the production of foot-long rulers. It seems that each ruler was accidentally sliced at three random points along the ruler, resulting in four pieces. Looking on the bright side, that means there are now four times as many rulers — they just happen to have different lengths. On average, how long are the pieces that contain the 6-inch mark?
With four cuts, each piece will be on average 3 inches long, but that can’t be the answer, can it?
Here is my solution:
[Show Solution]
We will consider the following more general version of the problem.
Suppose a ruler of length $L$ is marked at a fraction “$a$” from one end of the ruler. Now suppose $N-1$ cuts are chosen uniformly at random along the length of the ruler, which splits the ruler into $N$ smaller pieces. What is the expected length of the piece that contains the mark?
In the original problem, $L=12\text{ inches}$, $a=\tfrac{1}{2}$, and $N=4$.
We will start by considering a simpler problem. Suppose we have $k$ numbers chosen uniformly at random and independently on the interval $[0,b]$. What is the expected value of $\min_{1\leq i \leq k} x_i$? We can compute this using the fact that:
\begin{align}
\mathbb{E}\left( \min_{1\leq i \leq k} x_i \right) &= \int_0^b \mathbf{Prob}( \min(x_i) \geq t )\,\mathrm{d}t \\
&= \int_0^b \mathbf{Prob}( x_1 \geq t, \dots, x_k \geq t )\,\mathrm{d}t \\
&= \int_0^b \mathbf{Prob}( x_1 \geq t) \cdots \mathbf{Prob}( x_k \geq t )\,\mathrm{d}t \\
&= \int_0^b \mathbf{Prob}( x_1 \geq t)^k\,\mathrm{d}t \\
&= \int_0^b \left( \frac{b-t}{b}\right)^k \mathrm{d}t \\
&= \frac{b}{k+1}
\end{align}The first equality follows from the fact that expectation can be written as the integral of the complementary cumulative distribution function (wiki link). Then, we use the fact that the $x_i$ are mutually independent and identically distributed.
Back to our original problem, we can split it into cases depending on how many cuts end up to the left vs how many end up to the right of our mark. Suppose we have $k$ cuts to the left of $a$, and the remaining $N-k-1$ cuts to the right. Then the expected length of the piece containing $a$ is the sum of the expected lengths of the “left part” and “right part”. Each of these pieces can be computed based on the preliminary result above, and we obtain:
\[
L\left(\frac{a}{k+1}+\frac{1-a}{N-k}\right)
\]Note that this formula gives the correct answer when $k=0$; it returns a length of $a$ for the left half (the whole interval). Next, we must compute the probability of this actually occurring; that exactly $k$ cuts are on the left and $N-k-1$ cuts are on the right. Since the probability of these events happening are $a$ and $1-a$, respectively, and they are mutually independent, we have a binomial distribution. The probability that $k$ cuts are to the left of $a$ is therefore:
\[
\binom{N-1}{k}a^k (1-a)^{N-k}
\]Combining these two facts, the expectation we seek to evaluate can be written as the sum:
\[
\sum_{k=0}^{N-1} L\left(\frac{a}{k+1}+\frac{1-a}{N-k}\right) \binom{N-1}{k}a^k (1-a)^{N-k-1}
\]We can split this into two separate sums. For the first one,
\begin{align}
&\sum_{k=0}^{N-1} \frac{L a}{k+1}\binom{N-1}{k}a^k (1-a)^{N-k-1}\\
&= \sum_{k=0}^{N-1} \frac{L}{N}\binom{N}{k+1}a^{k+1} (1-a)^{N-k-1} \\
&= \sum_{k=1}^{N} \frac{L}{N}\binom{N}{k}a^{k} (1-a)^{N-k} \\
&= \frac{L}{N}\left( 1-(1-a)^N \right)
\end{align}where we used the binomial theorem in the final step to evaluate the sum. Using a similar argument for the other half of the sum, we obtain $\frac{L}{N}\left( 1-a^N\right)$. Combining both parts, we find our final formula for the expected length of the piece containing the mark:
$\displaystyle
\frac{L}{N}\Bigl( 2-a^N-(1-a)^N \Bigr)
$
In particular, for the original problem statement, $L=12\text{ inches}$, $a=\tfrac{1}{2}$, and $N=4$. So the final answer is $\frac{45}{8}= 5.625\text{ inches}$. This is substantially longer than the average length of each piece, which is 3 inches.
Several interpretations of this fact were brought to my attention, so I will relay a couple here.
Commenter Guy D. Moore noted that this is an example of selection bias. My colleague Kangwook Lee also pointed out that this phenomenon is precisely the inspection paradox from probability theory.
Here is an intuitive explanation for why the true answer is so much larger than 3. The longer pieces of ruler are more likely to contain the 6-inch mark. In fact, any piece longer than 6 inches is guaranteed to contain the 6-inch mark! Although such pieces are rare, we are conditioning on the fact that our piece contains the 6-inch mark, so we are likelier to pick out these longer pieces and so the average piece length will be longer.
Limiting cases
As a sanity check, let’s see what happens when we consider the limiting behavior of our formula.
- If $N=1$ (just one piece), the formula simplifies to $L$, which makes sense. In this case, no cuts are made, so all pieces contain the mark, and all pieces are of length $L$.
- If $a=0$ or $a=1$ (the mark is at either end of the ruler), the formula simplifies to $\frac{L}{N}$, which is the average piece length. This makes sense because the first piece always contains the mark, and there is no reason the first piece should be any longer or shorter than the average piece.
- If $N$ gets very large and $0\lt a\lt 1$, the formula tends to $\frac{2L}{N}$. So pieces containing the mark are on average twice as long as the average piece.
Here is a figure showing how much longer the marked piece of ruler is compared to the average length $\frac{L}{N}$. We can visually confirm the three limiting cases: when $N=1$, we just get the average length, if $a=0$ or $a=1$ we also get the average length, and as $N\to\infty$, we do indeed tend to $2$. The solution to the original problem ($a=\tfrac{1}{2}$ and $N=4$) produces a ratio of $1.875$, which when multiplied by the average length $\frac{L}{N} = \frac{12}{4} = 3$ produces the answer we reported above, $5.625$.
You are puzzled that for large N, the average length of the segment containing the point a fraction x along the ruler is 2L/N, even though the average segment must be length L/N.
This is selection bias. The average segment has length L/N. But the average point is on a segment of length 2L/N, because the longer segments have more points, and the shorter segments have fewer points. When you ask about the length of a segment containing a particular point, that weights the average over segments according to the length of the segment, so the longer segments contribute more to the average and the shorter segments contribute less. (If every family either has 0 or 4 children, then the average family has 2 children, but the average child is one of 4 siblings, not 2.)
In the limit of large N, the length distribution of segments is exponential; the probability density that a given segment has length x is (N/L)exp(-Nx/L). The average length is the integral of this times x. The average length, weighted according not to the number of segments but according to the length of the segment, is the integral of this times x^2. And the integral of x^2 exp(-x) is 2.
Laurent,
Yours is the simplest solution! So easy to follow!
In the equation following the paragraph “Back to our original problem…”, shouldn’t the second denominator be “N-k”? If “N-k-1” cuts are to the right, and the preliminary result suggests that “cuts +1” is the denominator for our expected value, then “N-k-1” plus 1 should give “N-k”. You arrived at the correct results in the end, so I feel like I am missing something and the equation is correct as is. If that is the case, then kindly disregard.
Thanks for the great write up!
Fixed the typo. Thanks!
You can transform this problem by, before you start, cutting the ruler at the 6 inch mark and gluing the two ends together. This splits up “the piece containing the 6 inch mark” into the first piece and the last piece. The question is therefore equivalent to: “What is the expected length of the first piece + the last piece, given that there is a cut at 6 inches”. Or, more generally, “What is the expected length of the first piece + the last piece given that there is a cut at the ‘a’ inch mark”. With this framing, it’s intuitive that as the number of cuts gets large, this expected length approaches 2L/N.
Hi Laurent, I think the solution presented above is not right
I believe the solution is L*Sum i=0,1..N{p_i * l_i}
p_i – probability of i cuts being left of a = Combin(N,i)*a^i*(1-a)^(N-i)
l_i – average length of piece containing a given i cuts left of a = (a/(i+1)! + (1-a)/(N+1-i)!)
P.S how do you write mathematical formulas on this site?
Actually you’re solution appears to be correct (I’m not sure where I’ve gone wrong).