# Rug quality control

This Riddler puzzle is about rug manufacturing. How likely are we to avoid defects if we manufacture the rugs randomly?

A manufacturer, Riddler Rugs™, produces a random-pattern rug by sewing 1-inch-square pieces of fabric together. The final rugs are 100 inches by 100 inches, and the 1-inch pieces come in three colors: midnight green, silver, and white. The machine randomly picks a 1-inch fabric color for each piece of a rug. Because the manufacturer wants the rugs to look random, it rejects any rug that has a 4-by-4 block of squares that are all the same color. (Its customers don’t have a great sense of the law of large numbers, or of large rugs, for that matter.)

What percentage of rugs would we expect Riddler Rugs™ to reject? How many colors should it use in the rug if it wants to manufacture a million rugs without rejecting any of them?

Here is my solution:
[Show Solution]

## 7 thoughts on “Rug quality control”

1. John Snyder says:

Nice solution Laurent. I got essentially the same results by making the assumption that the 4×4 square colors were independent (which of course they’re not). There might be some small justification for this independence assumption along the lines of the 10% sampling rule typically seen in statistics. Since the number of possible rugs is astronomically large, a few million is an immaterial fraction of the total universe. I also ran a simulation of 2.5 million trials and got a probability of a 3 color 100×100 rug being rejected of 0.000634, which is generally consistent with the analytic solutions we both obtained.
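A simulation along John's lines can be sketched as follows. This is a minimal Monte Carlo check, not his code: the function name, seed, and trial count are my own choices, and 5,000 trials is far too few to match his precision (he used 2.5 million).

```python
import numpy as np

def has_mono_4x4(grid):
    """True if any 4x4 block of `grid` is a single color."""
    h, w = grid.shape
    top_left = grid[:h - 3, :w - 3]
    same = np.ones_like(top_left, dtype=bool)
    # A 4x4 block is monochromatic iff every offset within the
    # window matches the block's top-left entry.
    for di in range(4):
        for dj in range(4):
            same &= grid[di:di + h - 3, dj:dj + w - 3] == top_left
    return bool(same.any())

rng = np.random.default_rng(0)
trials = 5_000  # a sketch only; far fewer than the 2.5 million above
rejects = sum(has_mono_4x4(rng.integers(0, 3, size=(100, 100)))
              for _ in range(trials))
print(f"estimated rejection rate ~ {rejects / trials:.6f}")
```

With so few trials the estimate is very noisy; the point is only to show the shape of the check.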

1. Laurent says:

Nice. Though keep in mind that by simulating this way (a binomial trial with 2.5 million samples), your 95% confidence half-width is on the order of $1.96\sqrt{\frac{\hat p(1-\hat p)}{2.5\times 10^6}} \approx 0.000031$. Therefore your 95% confidence interval for $p$ is:

$0.000603 \le p \le 0.000665$

If you wanted to figure out the digit that comes after that first 6, you would have to simulate on the order of a hundred million samples.
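Spelled out numerically (a quick check; $\hat p$ and the sample size are taken from the comment above):

```python
from math import sqrt

p_hat, n = 0.000634, 2_500_000
# Normal-approximation 95% confidence interval for a binomial proportion.
half_width = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - half_width, p_hat + half_width
print(f"half-width ~ {half_width:.6f}; 95% CI: [{lo:.6f}, {hi:.6f}]")
```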

2. D says:

Have you considered estimating (co)variance? It allows some other approaches here — Chebyshev is immediate, but there are some other goodies.

I think it also gives some additional insights into what’s going on under the hood… unfortunately it requires pairwise comparisons which may overload your notation. Basically the idea is to set up indicator variables as you have (though I changed the notation), where

$Y = \mathbb I_1 + \mathbb I_2 + … + \mathbb I_{n}$

In the original problem $n = 96^2$, but it later becomes $n = 96^2 \cdot 1,000,000$.

$E\big[Y\big] = E\big[\mathbb I_1\big] + E\big[\mathbb I_2 \big] + … + E\big[\mathbb I_{n}\big] = \sum_{k=1}^n E\big[\mathbb I_{k}\big]$

now if we take our $Y$ and square it, we have:

$Y^2 = \big(\sum_{k=1}^n \mathbb I_{k} \big) + \big(\sum_{k=1}^n \sum_{j\neq k}\mathbb I_{k} \mathbb I_{j} \big)$

note that for indicators $\mathbb I_{k}^2 = \mathbb I_{k}$ (i.e. idempotence).

so

$E\big[Y^2\big] = \big(\sum_{k=1}^n E\big[\mathbb I_{k}\big] \big) + \big(\sum_{k=1}^n \sum_{j\neq k}E\big[\mathbb I_{k} \mathbb I_{j}\big] \big)$

The key idea is that there are $n(n-1)$ pairwise comparisons in total in that second summation. However, each $k$ has at most an $8 \times 8$ neighborhood of dependencies. So for each $k$ there are at least $(n - 1 - 64)$ independent experiments (which for our interests means zero covariance), leaving a conservative $65 \cdot n$ pairs in total that are dependent. Using the above allows us to upper bound the variance. I took a further lazy approach to do so, but you may have a nicer idea in mind.

The key insight is that the dependency neighborhood only grows linearly, while the variance of course has a quadratic number of terms. And of course we are dealing with large $n$ in various forms here.

Short of a major arithmetic error on my end, I think you can use Chebyshev alone to verify that $r = 6$ is the solution.

As icing on the cake: with the variance upper bound in hand, plus the mean, and a bit of modeling we can then apply Stein-Chen and show a very small total variation distance between the Poisson distribution and ours for the 1MM rugs case with $\lambda := E\big[Y\big]$, for any $r \geq 5$. Stein-Chen is probably way outside the scope but it’s quite satisfying to show that the resulting distribution is Poisson, with a de minimis error term.

3. Semyon says:

If you assume independence, the problem is easy. You have 97*97 squares and each has (1/3)^15 chance of leading to rejection. So the chance of rejection is just 1-exp(-L), where L is the Poisson parameter = 97*97*(1/3)^15.
To generate enough events for the simulation to be reasonably fast, I went to 2 colors. There, the simulated answer was lower than under the independence assumption, sort of as if the number of independent squares were actually 86*86. I applied this penalty factor to the answer assuming independence. Likely over-corrected a bit.
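The independence estimate above can be checked directly; a quick sketch of that arithmetic (variable names are my own):

```python
from math import exp

# Poisson parameter under the independence assumption:
# 97*97 block positions, each monochromatic with probability (1/3)^15.
L = 97 * 97 * (1 / 3) ** 15
p_reject = 1 - exp(-L)
print(f"L ~ {L:.6f}, P(reject) ~ {p_reject:.6f}")
```

Since $L$ is tiny, $1 - e^{-L}$ is only very slightly smaller than $L$ itself.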

4. D says:

On estimating / bounding variance, I'd refine what I said somewhat, though I've still gone a bit too quickly through here…

I'd say for any given qualifying 4×4 (16-tile) square, you have a neighborhood of dependencies involved in forming 4×4 squares: that neighborhood involves 3 rows above the square, 3 below, and 3 columns to the left and 3 to the right. So this is 3 + 4 + 3 = 10 tiles in each dimension; that is, we have embedded it in at most a 10 × 10 = 100-tile square.

That leaves $100 - 16 = 84$ potential other 4×4 blocks to worry about in terms of covariance calculations. However, the 'outer rim' has $8 + 8 + 8 + 8 + 4 = 36$ tiles whose blocks have at most 4 tiles of overlap,
so we have $E\big[\mathbb I_k \mathbb I_j\big] = \big(\frac{1}{r}\big)^{16}\big(\frac{1}{r}\big)^{16-4} =\big( \frac{1}{r}\big)^{16}\big(\frac{1}{r}\big)^{12}$ for each of these.

This leaves $84-36 = 48$ more troublesome cases… we could successively re-apply and refine the above logic, though I think we can skate by and say for each of these: $E\big[\mathbb I_k \mathbb I_j\big] \leq \big(\frac{1}{r}\big)^{16}\big(\frac{1}{r}\big)^{4}$.
That is, there must be at least 4 stochastically independent trials introduced, though most of these cases have many more. So we are overestimating variance here, but it works as an upper bound on variance (which is what we want) and it should be sufficiently tight as well.

5. Laurent says:

Hi everyone — thank you all for your comments! I rewrote my solution, and I now compute both upper and lower bounds for the probability of rejection. I decided to use Bonferroni inequalities rather than the Markov/Chebyshev inequality approach because it seems to make more sense in this case.

6. D says:

I liked the Bonferroni approach. The issue is that the bounds are still too weak to decide between $r \in \{5, 6, 7\}$ colors. I cannot think of an elementary approach to determine the right result. (I found some bugs in my earlier math / code and have determined that Chebyshev is too weak, unfortunately.) I am rather curious how the official solution will be explained.

The only two tools I could really think of here to deal with the dependencies are (a) possibly a martingale approach or (b) the Stein-Chen method, which I slipped in at the end though I'm thinking it won't make much sense.

I dropped in my Variance estimate below. I have my fingers crossed that there are no nasty bugs in it this time — also hopefully it isn’t too long. Note that since the variance estimate is close to the expected value, this is suggestive of the Poisson distribution. Unfortunately it takes a lot of machinery to make the jump from suggestive to verifying that the Poisson approximation is ‘good’.

– – – –
Btw — @LL: do you have anything like “spoiler” tags that I could nest the long response below under? Some markdown enabled environments have these. The benefit is that long writeups don’t take up too much whitespace unless a user clicks a button labeled “spoiler” or whatever other label.

– – – –

for the initial problem we have

$r$ different colors (each equally likely)

We are interested in 4×4 blocks having exactly the same color; that is the rare event we want a probability distribution for. The issue is that there are overlaps / dependencies. But first, the easy thing to do is calculate the expected value.

For the initial problem
$n=96^2 = 9216$

$Y = \mathbb I_1 + \mathbb I_2 + … + \mathbb I_{n}$

$\lambda = E\big[Y\big] = E\big[\mathbb I_1\big] + E\big[\mathbb I_2 \big] + … + E\big[\mathbb I_{n}\big] = \sum_{k=1}^n E\big[\mathbb I_{k}\big] = n \cdot r\big(\frac{1}{r}\big)^{16} = n \cdot \big(\frac{1}{r}\big)^{15}$
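Plugging numbers into this formula for $\lambda$ (a quick check of the expected value, with $n = 96^2$ as used above):

```python
n = 96 ** 2  # number of 4x4 block positions used in the derivation above
for r in (3, 5, 6, 7):
    # lambda = n * r * (1/r)^16 = n * (1/r)^15
    lam = n * (1 / r) ** 15
    print(f"r={r}: lambda ~ {lam:.3e}")
```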

We need an estimate of the variance. Note this is not a torus, so there are boundary issues. However, our estimate of $E\big[Y^2\big]$ is composed of

$E\big[Y^2\big]= \big(\sum_{k=1}^n E\big[\mathbb I_{k}^2\big] \big) + \big(\sum_{k=1}^n \sum_{j\neq k}E\big[\mathbb I_{k} \mathbb I_{j}\big] \big) = \big(\sum_{k=1}^n E\big[\mathbb I_{k}\big] \big) + \big(\sum_{k=1}^n \sum_{j\neq k}E\big[\mathbb I_{k} \mathbb I_{j}\big] \big) = \lambda + \big(\sum_{k=1}^n \sum_{j\neq k}E\big[\mathbb I_{k} \mathbb I_{j}\big] \big)$

for $j \neq k$, we have

$E\big[ \mathbb I_{k}\mathbb I_{j}\big]_{\mathbb I_{k}\perp \mathbb I_{j} }\leq E\big[ \mathbb I_{k}\mathbb I_{j}\big]$

which reads: the dependence results in higher values for $E\big[ \mathbb I_{k}\mathbb I_{j}\big]$, and the independent case takes on the minimum value. Hence if we ignore boundary issues and over-estimate the dependencies, that is fine for upper-bounding the variance of $Y$.

In the independent case we have:

$E\big[ \mathbb I_{k}\mathbb I_{j}\big]_{\mathbb I_{k}\perp \mathbb I_{j} } = E\big[ \mathbb I_{k}\big] E\big[\mathbb I_{j}\big] = \Big(\big(\frac{1}{r}\big)^{15}\Big)^2 = \big(\frac{1}{r}\big)^{30}$

As for dependencies, we'll take a crude upper bound. Suppose we pick some center point (to avoid boundary-condition nuisances) and look at the 4×4 block below and to the right of it; let's call that point $k$ and shade that region in.

We can now isolate dependencies to a neighborhood of at most 3 dots down, at most 3 dots up, and at most 3 dots to the right or left. If we include the row and column we are oriented on, this is a $7 \times 7 = 49$ region. Of those 49 points, one is our reference point and 4 more are corner points which have exactly one point in common with our shaded region. Having one point in common means zero covariance (why? note this is but a minor point in computing the bound that follows), and hence we only have $44$ dependencies to worry about.

The 'outer rim' has its corners removed, but otherwise has $5 + 5 + 5 + 5 = 20$ indicator variables associated with it, and each of these has an overlap (dependency) of at most 4 trials in common with our shaded region. Hence we have

$E\big[ \mathbb I_{k}\mathbb I_{j}\big]_{\text{outer rim} } = E\big[ \mathbb I_{k}\big]E\big[\mathbb I_{j}\big \vert\mathbb I_{k}=1 \big]_{\text{outer rim} } \leq \big(\frac{1}{r}\big)^{15}\big(\frac{1}{r}\big)^{11} = \big(\frac{1}{r}\big)^{26}$

We thus have $24$ remaining cells that are more strongly dependent. At most, these have *all* trials in common with our shaded region except $4$, giving us:

$E\big[ \mathbb I_{k}\mathbb I_{j}\big]_{\text{inner area} } = E\big[ \mathbb I_{k}\big]E\big[\mathbb I_{j}\big \vert\mathbb I_{k}=1 \big]_{\text{inner area} } \leq \big(\frac{1}{r}\big)^{15}\big(\frac{1}{r}\big)^{4} = \big(\frac{1}{r}\big)^{19}$

(This is exact for a handful of cases and in general a gross over-estimate)

Now, when estimating the variance, we had

$E\big[Y^2\big]= \lambda + \big(\sum_{k=1}^n \sum_{j\neq k}E\big[\mathbb I_{k} \mathbb I_{j}\big] \big)$

The summation has $2\cdot\binom{n}{2} = n(n-1)$ terms in it. That is, for each of the $n$ indicators there are $(n-1)$ others to compare with when computing $Y$'s second moment.

However, the above analysis tells us that for each of the $n$ indicators, there are at least
$n -45 = (n-1) -44$

terms that have zero covariance and at most $44$ terms that need special covariance handling. Thus we have

$E\big[Y^2\big]= \lambda + \big(\sum_{k=1}^n \sum_{j\neq k}E\big[\mathbb I_{k} \mathbb I_{j}\big] \big) \leq \lambda + n\Big((n-45)\cdot E\big[ \mathbb I_{k}\big] E\big[\mathbb I_{j}\big] + 20 \cdot E\big[ \mathbb I_{k}\mathbb I_{j}\big]_{\text{outer rim} } + 24\cdot E\big[ \mathbb I_{k}\mathbb I_{j}\big]_{\text{inner area} }\Big)$

$E\big[Y^2\big] \leq \lambda + (n^2-45n)\cdot E\big[ \mathbb I_{k}\big]^2 + 20 \cdot n \big(\frac{1}{r}\big)^{26} + 24\cdot n \big(\frac{1}{r}\big)^{19}$

thus

$\text{Var}\big(Y\big) = E\big[Y^2\big] - E\big[Y\big]^2 \leq \Big(\lambda + (n^2-45n)\cdot E\big[ \mathbb I_{k}\big]^2 + 20 \cdot n\big(\frac{1}{r}\big)^{26} + 24\cdot n \big(\frac{1}{r}\big)^{19}\Big) -\Big(n \cdot E\big[\mathbb I_{k}\big]\Big)^2$

$\text{Var}\big(Y\big) \leq \lambda + (n^2-45n)\cdot E\big[ \mathbb I_{k}\big]^2 + 20 \cdot n \big(\frac{1}{r}\big)^{26} + 24\cdot n \big(\frac{1}{r}\big)^{19} - n^2 \cdot E\big[\mathbb I_{k}\big]^2 = \lambda -45n\cdot E\big[ \mathbb I_{k}\big]^2 + 20 \cdot n \big(\frac{1}{r}\big)^{26} + 24\cdot n \big(\frac{1}{r}\big)^{19}$

$\text{Var}\big(Y\big) \leq \lambda + 24\cdot n \big(\frac{1}{r}\big)^{19} +20 \cdot n \big(\frac{1}{r}\big)^{26} -45n\cdot\big(\frac{1}{r}\big)^{30}$
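Numerically, this bound does sit close to $\lambda$. A quick check for the original problem's parameters ($r = 3$, $n = 96^2$; variable names are my own):

```python
n = 96 ** 2
r = 3
lam = n * (1 / r) ** 15
# Variance upper bound from the derivation above.
var_bound = (lam + 24 * n * (1 / r) ** 19
                 + 20 * n * (1 / r) ** 26
                 - 45 * n * (1 / r) ** 30)
print(f"lambda ~ {lam:.3e}, variance bound ~ {var_bound:.3e}")
```

For larger $r$ the extra terms shrink much faster than $\lambda$, so the bound hugs the mean even more tightly.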

We can qualitatively note that the variance estimate we have for $Y$ is close to the mean of $Y$, which is something of a special feature of Poisson random variables.

– – –
Now, using the magic of Stein-Chen, we can compute the total variation distance between $Y$ and a Poisson random variable with parameter $\lambda$, which gives

$d_{TV}(Y, Z) \leq 2 \cdot n \cdot \big(\frac{\lambda}{n}\big)^2 +\Big(\text{Var}\big(Y\big) -\lambda\Big) = 2 \cdot n \cdot \big( \big(\frac{1}{r}\big)^{30}\big) +\Big(24\cdot n \big(\frac{1}{r}\big)^{19} +20 \cdot n \big(\frac{1}{r}\big)^{26} -45n\cdot\big(\frac{1}{r}\big)^{30}\Big)$

$d_{TV}(Y, Z) \leq 2 \cdot n \big(\frac{1}{r}\big)^{30} + 24\cdot n \big(\frac{1}{r}\big)^{19} +20 \cdot n \big(\frac{1}{r}\big)^{26} – 45n\cdot\big(\frac{1}{r}\big)^{30} \leq 46\cdot n \big(\frac{1}{r}\big)^{19}$

For $n = 96^2 \cdot 1,000,000$:

|  | $r=5$ | $r=6$ | $r=7$ |
| --- | --- | --- | --- |
| total variation distance upper bound | $0.02174$ | $0.00068$ | $3.638247 \cdot 10^{-5}$ |
| Poisson estimate for the 1 MM rugs case, $p(0, \lambda_r)$ | $0.73934$ | $0.9805900$ | $0.9980606$ |
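The Poisson row of the table above can be reproduced directly (a quick check; the total-variation row depends on the exact form of the bound used, so only the $p(0, \lambda_r) = e^{-\lambda_r}$ estimates are recomputed here):

```python
from math import exp

n = 96 ** 2 * 1_000_000  # the 1 MM rugs case
for r in (5, 6, 7):
    lam = n * (1 / r) ** 15
    print(f"r={r}: P(no rejected rug) = exp(-lambda) ~ {exp(-lam):.5f}")
```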