The following problems appeared in The Riddler. They involve randomly picking points on a circle or sphere and seeing if the resulting shape contains the center or not.
Problem 1: Choose three points on a circle at random and connect them to form a triangle. What is the probability that the center of the circle is contained in that triangle?
Problem 2: Choose four points at random (independently and uniformly distributed) on the surface of a sphere. What is the probability that the tetrahedron defined by those four points contains the center of the sphere?
Here is my solution to both problems:
[Show Solution]
This problem also appeared as Problem A-6 in the 1992 William Lowell Putnam math competition. After writing this blog post, I also discovered an amazing video about this problem by @3Blue1Brown. The video is exceptionally clear and well-produced; I highly recommend it! The solution I present below is similar to the one in the video, except that I also included a section at the end that provides more mathematical details.
Both the 2D and 3D versions of the problem can be solved in a similar fashion, using the same key observation. I will point out this observation and then prove it mathematically. We’ll start with the first problem because it’s a bit simpler, but the logic is the same for both.
Problem 1: triangle and circle
Instead of picking three points on the circle at random, the symmetry in the problem allows us (without any loss of generality) to fix the first point and then pick the remaining two uniformly at random. Let’s suppose that the first point is fixed to be at the top of the circle. For the two remaining points, each one has a “mirror point” that lies directly opposite it on the other side of the circle. In mathspeak, a point and its mirror point are said to be diametrically opposed.
Since the points on the circle are chosen uniformly at random, each point is as likely to be chosen as its diametrically opposed counterpart. Therefore, instead of picking the remaining two points uniformly at random, we’ll do the following equivalent thing:
- Pick two lines through the center of the circle uniformly at random.
- For each line, pick one of its two endpoints at random.
For each choice of two diameters ($BB’$ and $CC’$), there are four possible triangles we can make depending on which endpoints of the diameters are chosen: $ABC$, $AB’C$, $ABC’$, and $AB’C’$. Here is a diagram illustrating the choices of triangles.
One of the four triangles contain the center point $O$. This is no accident! No matter which two diameters $BB’$ and $CC’$ we select, exactly one of the four resulting triangles will contain the center point $O$ almost surely. Moreover, each of these four triangles is equally likely because each choice of endpoint ($B$ or $B’$ and $C$ or $C’$) is equally likely. It follows that the probability that the triangle will contain the center point $O$ is $\frac{1}{4}$.
Problem 2: tetrahedron and sphere
For this problem, we can apply the exact same logic as in the 2D case: fix one point to be at the top of the sphere and select the remaining three points by first picking three diameters at random, and then for each diameter choosing one of its endpoints at random. Once again, it turns out that exactly one of the 8 possible tetrahedra will contain the center of the sphere. Each triangle is equally likely, so the probability that the random tetrahedron contains the center of the sphere is $\frac{1}{8}$.
Mathematical proof
The results above rely on the observation that exactly one of the 4 triangles (in the 2D case) or one of the 8 tetrahedra (in the 3D case) contains the center of the circle or sphere. We will now prove this fact more rigorously by using the notion of barycentric coordinates. Suppose the vertices of the tetrahedron are located at $\vec{p}$, $\vec{q}$, $\vec{r}$, $\vec{s}$. Given a point $\vec{x}$ (not necessarily on the surface of the sphere), we can represent it as a linear combination:
\[
\vec{x} = \lambda_1 \vec{p} + \lambda_2 \vec{q} + \lambda_3 \vec{r} + \lambda_4 \vec{s}
\]The quadruple $(\lambda_1,\lambda_2,\lambda_3,\lambda_4)$ are the barycentric coordinates of $\vec{x}$ with respect to the tetrahedron. This representation is not unique, but it becomes unique if we add a normalizing constraint such as $\lambda_1+\lambda_2+\lambda_3+\lambda_4=1$. A key property of barycentric coordinates is that $\vec{x}$ lies inside the tetrahedron if and only if all coordinates have the same sign. The reason is that the interior of the tetrahedron is also the convex hull of its vertices.
We can use the property above to solve our problem. Let $\vec{o}$ be the center of the sphere and suppose it’s located at the origin. This means that the point diametrically opposed to any point $\vec{y}$ is simply $-\vec{y}$. As before, we’ll fix $\vec{p}$. The 8 possible (equally likely) tetrahedra are formed by using the vertices:
\[
\vec{p}
\quad\text{and}\quad
(\vec{q}\text{ or }-\vec{q})
\quad\text{and}\quad
(\vec{r}\text{ or }-\vec{r})
\quad\text{and}\quad
(\vec{s}\text{ or }-\vec{s})
\]We want to test whether $\vec{x}=\vec{o}$ is inside the tetrahedron or not. Therefore, the 8 equations we should solve for the 8 possible tetrahedra are:
\begin{align}
\vec{o} &= \lambda_1 \vec{p}+\lambda_2 \vec{q}+\lambda_3 \vec{r}+\lambda_4 \vec{s}, &
\vec{o} &= \lambda_1 \vec{p}+\lambda_2 \vec{q}+\lambda_3 \vec{r}-\lambda_4 \vec{s}, \\
\vec{o} &= \lambda_1 \vec{p}+\lambda_2 \vec{q}-\lambda_3 \vec{r}+\lambda_4 \vec{s}, &
\vec{o} &= \lambda_1 \vec{p}+\lambda_2 \vec{q}-\lambda_3 \vec{r}-\lambda_4 \vec{s}, \\
\vec{o} &= \lambda_1 \vec{p}-\lambda_2 \vec{q}+\lambda_3 \vec{r}+\lambda_4 \vec{s}, &
\vec{o} &= \lambda_1 \vec{p}-\lambda_2 \vec{q}+\lambda_3 \vec{r}-\lambda_4 \vec{s}, \\
\vec{o} &= \lambda_1 \vec{p}-\lambda_2 \vec{q}-\lambda_3 \vec{r}+\lambda_4 \vec{s}, &
\vec{o} &= \lambda_1 \vec{p}-\lambda_2 \vec{q}-\lambda_3 \vec{r}-\lambda_4 \vec{s}.
\end{align}But these equations have very similar solutions! Indeed, if $(\lambda_1,\lambda_2,\lambda_3,\lambda_4)$ solves the first equation, then the solutions to the equations above are:
\begin{align}
&(\lambda_1,\lambda_2,\lambda_3,\lambda_4), &
&(\lambda_1,\lambda_2,\lambda_3,-\lambda_4), \\
&(\lambda_1,\lambda_2,-\lambda_3,\lambda_4), &
&(\lambda_1,\lambda_2,-\lambda_3,-\lambda_4), \\
&(\lambda_1,-\lambda_2,\lambda_3,\lambda_4), &
&(\lambda_1,-\lambda_2,\lambda_3,-\lambda_4), \\
&(\lambda_1,-\lambda_2,-\lambda_3,\lambda_4), &
&(\lambda_1,-\lambda_2,-\lambda_3,-\lambda_4),
\end{align}respectively. It’s clear that exactly one of these quadruples has all entries with the same sign. In other words, exactly one of the 8 tetrahedra contains the center of the sphere. The argument above also works in the 2D case using a similar notion of 2D barycentric coordinates.
Hi Laurent, long time reader here. Just curious, what program do you use to create your geometric figures? I’m looking to start documenting my solutions as well…
For geometric figures in 2D and 3D, I use https://www.geogebra.org/
That’s what I used to draw the circles for this post.
If I’m displaying data that comes from code, I’ll use Matlab or Julia (usually with PyPlot).
If I’m drawing something simple and cartoonish, I’ll use powerpoint.
Good luck!
Hi Laurent – I was working on points of concurrency and ‘centers’ of triangles with my geometry class when I came across the Riddler and I want to know what you think about this line of reasoning. If the triangle formed is an acute triangle, the center of the circle (circumcenter) will fall inside the triangle. If the triangle is obtuse, the center will fall outside. (A right triangle is a special case and will not impact the probability) The size of the largest angle dictates acute/obtuse and the range of values for the largest angle of a triangle is[60,180). Out of this range of 120 degrees, 30 will yield acute triangles; 30/120=1/4.
Hi Peter,
Thanks for writing! I’ll break down your argument and go through it step by step:
1) “if the triangle formed is acute, the center of the circle (circumcenter) falls inside the triangle. If the triangle is obtuse, the center falls outside.” This is correct. Also relevant is the fact that any three points determine exactly one circle. So if you pick three points on a circle and make a triangle out of them, the circle you started with will indeed be the circumcircle of that triangle.
2) “the size of the largest angle dictates whether the triangle is acute or obtuse and the range of values for the largest angle of a triangle is [60,180).” This is also true!
3) Out of this range of 120 degrees, 30 will yield acute triangles; 30/120 = 1/4. There is an error here. What you’re implicitly assuming by computing 30/120 = 1/4 is that all angles are equally likely. It’s tempting to assume that all angles are equally likely because, after all, you picked the points at random! But this is not the case. Take a different scenario; you’re rolling two dice and write down the sum. The sum can be 2,3,…,12. What is the probability that the sum is 7? Using your logic, there are 11 possible sums, so the probability should be 1/11. But the probability is actually 1/6 because there are more ways of rolling a 7 than there are of rolling any of the other numbers. When rolling dice, yes both are rolled at random, but the random variable of interest isn’t the value on each die; it’s the value of a function of both of them. You can get a 7 in six different ways: (1+6), (2+5), (3+4), (4+3), (5+2), (6+1). Meanwhile there are 36 different ways to roll two dice (6 values per die). This leads to 6/36 = 1/6 for the probability of rolling a 7. The same thing is happening here with the triangle — the largest angle is actually a function of the two interior angles. The interior angles are chosen uniformly at random, much like the dice, but that doesn’t mean that the largest angle of the triangle is also uniformly distributed. To show you what I mean, I simulated 100,000 triangles by picking random points on the circle, computed the largest angle for each one, and then plotted a histogram of the largest angles. Here is the distribution of points that I found:
As you can see, it’s not uniform! getting a largest angle of around 175 deg is far less likely to occur than getting an angle of around 90 deg. It so happens that your calculation produced the correct answer, but this is a coincidence. The distribution of angles is such that it’s made up of two linear pieces and things average out, but in general, in order to use your approach, you should compute this distribution of angles explicitly and then calculate the probability in the same way as I did with the dice.
The formal tool to use here is calculus (the solution can be found by computing an integral), but this might not be what you had in mind if you were looking for a quick and elegant solution to the problem. A better way to do it is to prove that the distribution I plotted above actually consists of two linear segments that peak at 90 degrees. The probability you seek is actually the area of all the bars between 30 and 90 degrees divided by the total area of all the bars (integration is, after all, just finding areas underneath curves!). A bit of geometry and area calculations will reveal that this ratio of areas is indeed 1/4 !
Thanks Laurent – I didn’t think the distribution was uniform but I didn’t have any intuition as to its shape, your diagram is terrific and the area argument is clear. I’ll next think about a justification for the 2 linear segments…their equations are y=x/1800 on [0,90] and y=1/60-1/5400(x-90) on [90,180].
I am fairly confident that problem one is the complement of:
– – –
what is the probability that 3 uniform random points on (the perimeter of) a circle are inside the same half of the circle? I.e. 1 minus said probability gives the probability that the circle’s center is in the convex hull of these 3 points, which is the problem at hand.
– – –
This complementary problem can be nicely generalized to the probability that $t$ points are in the same half of the circle.
It feels a bit fuzzy as I solved the problem over a year ago but I still have misc. pieces of it in a jupyter notebook sitting around. Similar to your approach, I fixed the first point for free. However, I chose to discretize things so that the circle’s boundary is merely $n$ (where $n$ is some even number — ideally a power of 2) equal sized ‘chunks’ that we sample at random (and with replacement.) This can create a tiny bit of awkwardness but it goes away in the end after passing a limit. After breaking symmetry, you get an absorbing markov chain with a nice triangular structure. E.g. for n = 16, you have
$\mathbf A^{t-1} = \left[\begin{matrix}\frac{1}{16} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\frac{1}{8} & \frac{1}{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\frac{1}{8} & \frac{1}{8} & \frac{3}{16} & 0 & 0 & 0 & 0 & 0 & 0\\\frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{4} & 0 & 0 & 0 & 0 & 0\\\frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{5}{16} & 0 & 0 & 0 & 0\\\frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{3}{8} & 0 & 0 & 0\\\frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{7}{16} & 0 & 0\\\frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{8} & \frac{1}{2} & 0\\\frac{1}{16} & \frac{1}{8} & \frac{3}{16} & \frac{1}{4} & \frac{5}{16} & \frac{3}{8} & \frac{7}{16} & \frac{1}{2} & 1\end{matrix}\right]^{t-1}$
(You start in the top left corner, i.e. multiply $\mathbf A^{t-1} \mathbf e_1$ i.e. 1st standard basis vector and are interested in probability of being absorbed into bottom right cell.)
There is an awful lot of nice structure in this problem, so I managed to extract the scalar equation:
$\big(\frac{1}{2}\big)^{t-1} \frac{n}{2} – \left(\frac{1}{2} – \frac{1}{n}\right)^{t – 1} \left(\frac{n}{2} – 1\right)$
after passing a limit as $n \to \infty$ you get
$\frac{t}{2^{t-1}}$ which in case of this problem is $\frac{3}{2^{3-1}}= \frac{3}{4}$
which of course is the complement of $\frac{1}{4}$
Very cool! I think you’re right about the half-circle interpretation, and it’s obvious generalization to higher dimensions (probability that 4 points are in the same hemisphere, in 3D).
What bugged me is I remember thinking how clever I was to use linear algebra methods to solve this. However, I couldn’t easily generalize this to higher dimensions.
Then I found this page:
http://www.mathpages.com/home/kmath327/kmath327.htm
It seamlessly generalizes to higher dimensions, using simple solutions of linear equations (using Vandermonde matrices no less). What could be simpler?
It does some fancier stuff later on, but really a very nice result.
It looks like this (complementary) problem has, in effect, come up again with this week’s Riddler Classic
https://fivethirtyeight.com/features/what-comes-after-840-the-answer-may-surprise-you/
Another year later, by far the best solution I know of to this problem is to carefully construct an equivalent problem.
The idea is to recognize that every satisfying configuration has, with probability 1, exactly 2 points acting as ‘outer points’ (i.e. if we deleted these 2 ‘outer points’ points all the other $N-2$ points would be on the same remaining ‘half perimeters’).
So we may (arbitrarily) label each one of these N points as $1, 2, …, N$ and set up a random variable, $M$, to count the number of valid end points (where ‘valid’ means all points are located in one half of the circle or equivalenlty that we can pick a diameter such that all points are only on one side of the newly halved circle). Somewhat obviously $M$ takes on the value 0 with probability $1-p$ — this occurs when the experiment results in a configuration where there are no valid diameter ‘cuts’– and the the value $2$ with probability $p$ — which is what we are after.
If we look at $\frac{1}{2}M$, we see this is a Bernouli random variable, so
$p = E\big[\frac{1}{2}M\big] = \frac{1}{2} \cdot E\big[M\big]$
The next step is to write $M$ as a sum of (highly dependent) indicator random variables.
$M = \mathbb 1_{A_1} + \mathbb 1_{A_2} + … + \mathbb 1_{A_N} = \sum_{j=1}^N \mathbb 1_{A_j}$
where $A_j$ is the event that the $j$th point is an ‘outer point’ of a valid configuration. Despite the dependencies, we may apply linearity of expectations to the above to see
$E\big[M\big] = E\big[\sum_{j=1}^N \mathbb 1_{A_j}\big] = \sum_{j=1}^N E\big[\mathbb 1_{A_j}\big] = N \cdot E\big[\mathbb 1_{A_j}\big] $
to solve for $ E\big[\mathbb 1_{A_j}\big]$ (which by symmetry is the same for each $j$), consider that for any given location of point $j$ we can make a cut along the diameter going through the center and $j$ to get a line segment with point $j$ in the middle. (Formally this is a conditional expectations argument.)
The probability then of a satisfying configuration is the probability of the union (i) that all remaining $n-1$ points land to the left of this mid point and (ii) that all remaining $n-1$ points land to the right of this mid point. These are mutually exclusive events so their probabilities add, and they are symmetric — i.e. the probability of each event is
$\frac{1}{2}^{N-1}$
so
$ E\big[\mathbb 1_{A_j}\big] = (\frac{1}{2})^{N-1} + (\frac{1}{2})^{N-1} = 2 \cdot 2^{-N+1} = (\frac{1}{2})^{N-2}$
Putting this all together gives
$p = \frac{1}{2} \cdot \Big(E\big[M\big]\Big) = \frac{1}{2} \cdot \Big(N \cdot E\big[\mathbb 1_{A_j}\big]\Big) = \frac{1}{2} \cdot \Big(N \cdot (\frac{1}{2})^{N-2}\Big) = \frac{N}{2^{N-1}}$
It looks like this (complementary) problem has, in effect, come up again with this week’s Riddler Classic
https://fivethirtyeight.com/features/what-comes-after-840-the-answer-may-surprise-you/
Another year later, by far the best solution I know of to this problem is to carefully construct an equivalent problem.
The idea is to recognize that every satisfying configuration has, with probability 1, exactly 2 points acting as ‘outer points’ (i.e. if we deleted these 2 ‘outer points’ points all the other $N-2$ points would be on the same remaining ‘half perimeter’).
So we may (arbitrarily) label each one of these N points as $1, 2, …, N$ and set up a random variable, $M$, to count the number of valid end points (where ‘valid’ means all points are located in one half of the circle or equivalenlty that we can pick a diameter such that all points are only on one side of the newly halved circle). Somewhat obviously $M$ takes on the value 0 with probability $1-p$ — this occurs when the experiment results in a configuration where there are no valid diameter ‘cuts’– and the the value $2$ with probability $p$ — which is what we are after.
If we look at $\frac{1}{2}M$ we see this is a Bernouli random variable,
so
$p = E\big[\frac{1}{2}M\big] = \frac{1}{2} \cdot E\big[M\big]$
The next step is to write $M$ as a sum of (highly dependent) indicator random variables.
$M = \mathbb 1_{A_1} + \mathbb 1_{A_2} + … + \mathbb 1_{A_N} = \sum_{j=1}^N \mathbb 1_{A_j}$
where $A_j$ is the event that the $j$th point is an ‘outer point’ of a valid configuration. Despite the dependencies, we may apply linearity of expectations to the above to see
$E\big[M\big] = E\big[\sum_{j=1}^N \mathbb 1_{A_j}\big] = \sum_{j=1}^N E\big[\mathbb 1_{A_j}\big] = N \cdot E\big[\mathbb 1_{A_j}\big] $
to solve for $ E\big[\mathbb 1_{A_j}\big]$ (which by symmetry is the same for each $j$), consider that for any given location of point $j$ we can make a cut along the diameter going through the center and $j$ to get a line segment with point $j$ in the middle. (Formally this is a conditional expectations argument.)
The probability then of a satisfying configuration is the probability of the union (i) that all remaining $n-1$ points land to the left of this mid point and (ii) that all remaining $n-1$ points land to the right of this mid point. These are mutually exclusive events so their probabilities add, and they are symmetric — i.e. the probability of each event is
$\frac{1}{2}^{N-1}$
so
$ E\big[\mathbb 1_{A_j}\big] = (\frac{1}{2})^{N-1} + (\frac{1}{2})^{N-1} = 2 \cdot 2^{-N+1} = (\frac{1}{2})^{N-2}$
Putting this all together gives
$p = \frac{1}{2} \cdot \Big(E\big[M\big]\Big) = \frac{1}{2} \cdot \Big(N \cdot E\big[\mathbb 1_{A_j}\big]\Big) = \frac{1}{2} \cdot \Big(N \cdot (\frac{1}{2})^{N-2}\Big) = \frac{N}{2^{N-1}}$