This Riddler puzzle is about random chords of a circle and the regions they describe.
At RoboPizza™, pies are cut by robots. When making each cut, a robot will randomly (and independently) pick two points on a pizza’s circumference, and then cut along the chord connecting them. If you order a pizza and specify that you want the robot to make exactly three cuts, what is the expected number of pieces your pie will have?
Here is a simple solution, which was pointed out to me in a comment to my original post.
[Show Solution]
Let’s say the robot cuts $n$ random chords. Due to the way in which the chords are randomly chosen, two different chords will never begin or end at the exact same point along the circumference. Moreover, any time chords intersect inside the circle, that intersection will involve exactly two chords. We don’t need to worry about cases where three cuts perfectly intersect in a single point (i.e. the way you’d normally cut a pizza!), since that will never happen.
Instead of counting the number of pieces $S$, we can count the number of intersections instead! By Euler’s Formula, we have that
\[
(\text{# vertices } V)\, -\, (\text{# edges }E) + (\text{# faces }F) = 2
\]
We have to be a bit careful to apply this formula correctly. If we have $n$ chords and $p$ intersections, there will be $V = 2n+p$ vertices (count the ones on the circumference). Each internal intersection adds an edge, and we must count the arcs between cuts along the circumference as edges, so we have a total of $E = 3n+2p$. Finally, there will be $F = S+1$ faces, since the entire circle counts as a face in Euler’s formula. Substituting these into the formula $V-E+F=2$, we obtain: $S = n+p+1$. So instead of counting the number of pizza pieces, we’ll count the number of intersections and then add $n+1$.
To find the expected number of intersections, suppose we have $n$ chords and let $X_{ij} = 1$ if chords $i$ and $j$ intersect, and $0$ otherwise. We’ll let $C$ be the set of all pairs of chords $(i,j)$. Then the expected number of intersections is:
\begin{align}
\mathbb{E} \left[ \sum_{(i,j) \in C} X_{ij} \right]
= \sum_{(i,j) \in C} \mathbb{E}\bigl[X_{ij}\bigr]
= {n \choose 2} \mathbb{E}\bigl[X_{12}\bigr]
\end{align}
Although the $X_{ij}$ are not mutually independent, linearity of expectation holds regardless. By symmetry each $\mathbb{E}\bigl[X_{ij}\bigr]$ is the same, and is equal to the probability that two random chords intersect. We show in the next solution that this probability is $\tfrac{1}{3}$, so the expected number of pieces is $\tfrac{1}{3}{n \choose 2} + n + 1$, which is equal to:
$\displaystyle
\mathbb{E}\bigl[\text{number of pieces}\bigr] = \frac{(n+2)(n+3)}{6}
$
In the case where we have $n=3$ cuts, the expected number of pieces is $5$.
The following solution is a bit more complicated, and computes the entire distribution rather than just its expected value.
[Show Solution]
We can think of the cutting process as happening in two stages. First, we choose $2n$ points randomly along the circumference. Then, we assign labels to the points: $A_1$, $A_2$, $B_1$, $B_2$, and so on. Then, we draw chords from $A_1$ to $A_2$, $B_1$ to $B_2$, and so on. The key observation is that the placement of the points (the first step) is irrelevant. Only the label assignment is relevant to determining the number of intersections points.
For example, consider the case $n=2$. If we assign labels to points in a clockwise fashion as: $A_1, B_2, A_2, B_1$ to the four points, then there will always be one intersection point. Similarly, if we assign the labels in the order: $A_2,A_1,B_1,B_2$, there will always be zero intersections. It doesn’t matter where the four points are, the only thing that matters is the order of the labels!
Due to symmetry, the number of intersections doesn’t change if we swap $A_1$ and $A_2$, or even if we swap $A$ and $B$. So ultimately, we only need to count the number of different pairs of points that can be chosen!
Here is a diagram showing the breakdown for $n=2$. Keep in mind that the location of each point doesn’t change. We’re merely choosing how we pair the points up.
So there are $3$ possible ways to choose pairs, two of which have $3$ pieces and one of which has $1$ piece. So the expected number of pieces is
\[
S_2 = \left(\tfrac{2}{3}\times 3\right) + \left(\tfrac{1}{3} \times 4\right) = \tfrac{10}{3}
\]
The probability of the two chords intersecting is $\tfrac{1}{3}$ (we make use of this in the first solution to the problem). The $n=3$ case is a bit more complicated because there are more ways to choose pairs, but the idea is the same. Here is a diagram showing the breakdown.
The expected number of pieces is therefore:
\[
S_3 = \left(\tfrac{5}{15}\times 4\right) + \left(\tfrac{6}{15} \times 5\right) + \left(\tfrac{3}{15} \times 6\right) + \left(\tfrac{1}{15} \times 7\right) = 5
\]
The expected number of pieces is 5.
One final point of interest: the way in which the chords are randomly chosen is important! If we had used a different method, e.g. by picking a random radius and a random point on that radius, and then constructing a chord perpendicular to the radius through this chosen point, then we would obtain a different answer! The approach above works because the way the chords are chosen allows us to decouple how endpoints are chosen from how labels are assigned. To learn more about why the randomization method is important, take a look at Bertrand’s Paradox.
If you’ve already read the solution above and you’re interested in the distribution of pieces for the general case, read on!
[Show Solution]
This genre of problem involving chords of circles, intersections, regions, and so on, has quite a history. The specific problem of characterizing intersections was first posed and solved in the 1930’s, and since then, some elegant solutions have been found for the general case where there are $n$ chords. The solution I’ll summarize below is drawn mostly from the following 1975 paper:
The approach described in Riordan’s paper uses generating functions. Define $T_{nk}$ to be the number of ways of choosing $n$ pairs of points among $2n$ general points such that there are $k$ intersections, and let $T_n(x)$ be the polynomial with $T_{nk}$ as coefficients:
\[
T_n(x) = \sum_{k=0}^{K} T_{nk} x^k,\qquad\text{where }K = {n \choose 2}
\]
The reason the sum goes up to $K$ is that since each pair of chords can intersect at most once, $n$ chords will produce at most $n \choose 2$ intersections. The key result from the paper above is that
\[
T_n(x) = \frac{1}{(1-x)^n} \sum_{j=0}^n (-1)^j a_{n+j,n-j}\, x^{j(j+1)/2}
\]
Here, $a_{nm}$ is a ballot number, which counts the following thing: suppose there is an election where candidate $A$ has $n$ votes and candidate $B$ has $m$ votes, with $n \ge m$. Then $a_{nm}$ is the number of ways the election results can come in (one vote at a time) such that $A$ always has at least as many votes as $B$ in the cumulative count. Another interpretation is that $a_{mn}$ is the number of “North-East” lattice paths from $(0,0)$ to $(n,m)$ that do not cross the diagonal points $(i,i)$. A closed-form expression is:
\[
a_{n+j,n-j} = \frac{2j+1}{n+j+1}{2n \choose n+j} %=\frac{2j+1}{2n+1}{2n+1 \choose n-j}
\]
In the case where $j=0$, we recover the so-called Catalan numbers, which show up in a variety of combinatorial problems.
Using the above formula for $T_n(x)$, we can expand the right-hand side as a series and compare coefficients in order to obtain $T_{nk}$. Here are the result obtained for $T_n(x)$ for $n=1,\dots,4$:
\begin{align}
T_1(x) &= 1 \\
T_2(x) &= 2 + x \\
T_3(x) &= 5 + 6 x + 3 x^2 + x^3 \\
T_4(x) &= 14 + 28 x + 28 x^2 + 20 x^3 + 10 x^4 + 4 x^5 + x^6
\end{align}
Examining the coefficients, we see that the cases $n=2$ and $n=3$ match up with what we found in the first solution. To turn these counts into probabilities, simply divide each coefficient by $T_n(1)$, which is the total number of ways of choosing pairs of points. This sum turns out to have the closed-form expression $T_n(1) = \tfrac{n!}{2^n}{2n \choose n}$. The Catalan numbers show up as the constant coefficients $C_n = T_n(0)$. In other words, $C_n$ is the number of ways of assigning chords such that there are no intersections.
The expected number of pizza pieces also has a closed-form expression, which is $S_n = \frac{1}{6}(n+2)(n+3)$. Here are some plots of the distribution of pieces for different values of $n$.
In the limit as $n\to\infty$, the distribution approaches a Gaussian with mean $S_n$ and variance $\tfrac{1}{45}n(n-1)(n+3)$. For more information on these limits and how to derive them, there is a more recent paper here:
I like your solution, though (as always) since Catalan numbers showed up I’ll have to re-read the whole thing.
I am a bit surprised that I haven’t seen anyone take the approach of just solving for the probability of any two chords intersecting — using the Beta Formula (times 2 choose 1)– and then having linearity of expectations take you the rest of the way home for any n number of chords. It’s worth mentioning that the Beta Formula, while being an integral, has a very nice discrete probability derivation.
A link to my solution is below:
https://www.dropbox.com/s/2whl4k8f6eel96r/solution_robo_pizza2_.jpg?dl=0
Cheers
That’s a great question; I was thinking about making a separate post about this. It turns out that while using linearity of expectation as you describe gives the correct answer, the logic is flawed. Although the chords are independent, pairs of chords are not independent.
If the approach were valid, you could use the same argument to derive the entire probability distribution, which would be a binomial distribution. For example, the probability of having three intersections should be $(\tfrac{1}{3})^3 = \tfrac{1}{27}$. But if you use the approach I describe in my solution, you get a probability of $\tfrac{1}{15}$ instead!
Edit: I’m wrong about this! see comment below.
On the contrary, it is a primary feature of the Linearity of Expectations that independence is not required.
For review, you may consider reading the first 3 pages of
http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2005/readings/ln14.pdf
Especially notice the quote on page 2 “The great thing about linearity of expectation is that no independence is required. This is really useful, because dealing with independence is a pain, and we often need to work with random variables that are not independent.”
And then see how it is applied in The Hat-Check Problem on page 3, using indicator random variables. The Hat Problem involves sampling without replacement (i.e not binomial).
I believe Bertsekas and Tsitsiklis cover this in some detail in their book on probability.
You’re absolutely right. You can indeed use linearity of expectation to find the expected number of intersections. No independence assumption is required there. I think I might edit my post to include this new answer — thanks!
Finding the general distribution is more difficult however because there independence matters.
Dear Professor Lessard,
I have recently gotten into trying to work on the Riddler problems posted weekly. I could not solve this past week’s problem and am now trying to work on understanding your solution that you have posted. I have several questions.
Firstly, why do you state that we can ignore the scenario of three intersecting chords? That’s a theoretically possible option I believe?
Secondly, how do you determine that V=2n+ p vertices?
Thank you very much in advance for your response,
Michael
We can ignore the case where three chords meet in a single point because the probability of that happening is vanishingly small. If you were to fix five of the six endpoints, there is only one point on the circumference where you can put the sixth endpoint so that all three chords intersect at the same point. Since there are infinitely many points on the circumference, the probability of that point being chosen is zero.
We have $n$ chords and $p$ intersections. Each intersection is a vertex ($p$ in total), and both endpoints of each chord is a vertex ($2n$ in total).
Thank you for the beautiful problem and solution!
I think there is a typo in the first approach (based on Euler’s formula), but the final answer (S = n + p + 1) is correct.
“V=2n+pV=2n+p vertices (count the ones on the circumference). Each internal intersection adds an edge, and we must count the arcs between cuts along the circumference as edges, so we have a total of E=3n+p.”
Correction: each internal intersection adds 2 edges, so E = 3n + 2p.
Fixed it. Thanks!
Hello Professor Lessard,
First, thanks for this great post!
I had a question about this paragraph:
“One final point of interest: the way in which the chords are randomly chosen is important! If we had used a different method, e.g. by picking a random radius and a random point on that radius, and then constructing a chord perpendicular to the radius through this chosen point, then we would obtain a different answer! The approach above works because the way the chords are chosen allows us to decouple how endpoints are chosen from how labels are assigned”
I’ve been reading into Bertrand’s Paradox, and literature on ‘solutions’, including Edwin Jaynes’ solution focusing on a solution that is both scale and translation invariant. Given alternative methods of empirically deriving the likelihood that two chords intersect, from what you wrote:
“The approach above works because the way the chords are chosen allows us to decouple how endpoints are chosen from how labels are assigned”
it reads like you chose this method because of the way it would “specify” the randomness. Can you speak more to how you chose this?
Given the result from the n=2 case and the results from Riordan’s generating function, it seems that there is a “correct” specification of randomness when empirically simulating the likelihood that two chords intersect, is there a way to assert that one methodology is more correct than another, as I understood you did when you implied that “picking a random radius and a random point on that radius, and then constructing a chord perpendicular to the radius through this chosen point” would result in an incorrect answer?
Thanks again!
Hi Miguel, thanks for the comment.
The method for randomly selecting the chords was specified in the problem statement, so I did what the problem asked. My point was merely that the way in which we choose chords is very important, and had they specified a different way of selecting the chords, the problem would have a different solution.
I did not mean to imply that there is one way of randomly selecting chords that is “more correct” than another. What I meant was that if you select the chords differently from how the problem says you should, you’ll get a different answer, and it would be “incorrect” because it’s not what the question asked for.
I hope this clears things up!