{"id":2381,"date":"2018-08-11T13:29:56","date_gmt":"2018-08-11T18:29:56","guid":{"rendered":"https:\/\/laurentlessard.com\/bookproofs\/?p=2381"},"modified":"2018-08-15T18:26:27","modified_gmt":"2018-08-15T23:26:27","slug":"rug-quality-control","status":"publish","type":"post","link":"https:\/\/laurentlessard.com\/bookproofs\/rug-quality-control\/","title":{"rendered":"Rug quality control"},"content":{"rendered":"<p>This <a href=\"https:\/\/fivethirtyeight.com\/features\/where-on-earth-is-the-riddler\/\">Riddler puzzle<\/a> is about rug manufacturing. How likely are we to avoid defects if manufacture the rugs randomly?<\/p>\n<blockquote><p>\nA manufacturer, Riddler Rugs\u2122, produces a random-pattern rug by sewing 1-inch-square pieces of fabric together. The final rugs are 100 inches by 100 inches, and the 1-inch pieces come in three colors: midnight green, silver, and white. The machine randomly picks a 1-inch fabric color for each piece of a rug. Because the manufacturer wants the rugs to look random, it rejects any rug that has a 4-by-4 block of squares that are all the same color. (Its customers don\u2019t have a great sense of the law of large numbers, or of large rugs, for that matter.)<\/p>\n<p>What percentage of rugs would we expect Riddler Rugs\u2122 to reject? How many colors should it use in the rug if it wants to manufacture a million rugs without rejecting any of them?\n<\/p><\/blockquote>\n<p>Here is my solution:<br \/>\n<a href=\"javascript:Solution('soln_riddler_rugs','toggle_riddler_rugs')\" id=\"toggle_riddler_rugs\">[Show Solution]<\/a><\/p>\n<div id=\"soln_riddler_rugs\", style=\"display: none\">\nSince every possible rug is equally likely to occur, the probability of a rug being rejected is equal to the total number of defective rugs divided by the total number of rugs. Let&#8217;s assume the rug is $n\\times n$, there are $d$ distinct color choices, and we&#8217;re looking to avoid $m\\times m$ same-color patches. There are $N = (n-m+1)^2$ different $m\\times m$ patches on the rug. For easier bookkeeping, let&#8217;s number these using a single index $1 \\le i \\le N$. Let&#8217;s define the following event:<br \/>\n\\[<br \/>\nX_{i} = \\left\\{<br \/>\n\\begin{array}{l}\\text{the }i^\\text{th} \\text{ }m\\times m \\text{ patch}\\\\\\text{is all the same color}\\end{array}\\right.<br \/>\n\\]The union of these events $X = \\bigcup_{i=1}^N X_{i}$ describes the case where there is at least one same-color patch on the rug and therefore the rug must be rejected. We seek to compute the probability $\\mathbb{P}(X)$.<\/p>\n<p>The <a href=\"https:\/\/en.wikipedia.org\/wiki\/Inclusion%E2%80%93exclusion_principle\">inclusion-exclusion principle<\/a> allows us to express $X$ in terms of the $X_i$, which will be useful because of the inherent symmetry present in the problem. Specifically, each $X_i$ has the same probability of occurrence! If we define the probabilities:<br \/>\n\\begin{align}<br \/>\nS_1 &#038;= \\sum_{1\\le i \\le N} \\mathbb{P}(X_i) \\\\<br \/>\nS_2 &#038;= \\sum_{1\\le i\\lt j\\le N} \\mathbb{P}(X_i \\cap X_j) \\\\<br \/>\n&#038;\\,\\,\\,\\vdots \\\\<br \/>\nS_k &#038;= \\sum_{1\\le i_1 \\lt\\cdots\\lt i_k\\le n} \\mathbb{P}(X_{i_1} \\cap \\cdots \\cap X_{i_k})<br \/>\n\\end{align}then the inclusion-exclusion principle states that:<br \/>\n\\[<br \/>\n\\mathbb{P}(X) = \\sum_{k=1}^N (-1)^{k-1}S_k<br \/>\n\\]We mentioned that the individual probabilities $\\mathbb{P}(X_i)$ are easy to compute. Indeed, each color occurs with probability $1\/d$ and there are $m^2$ cells in each patch. So the probability that all cells are of a particular color is $1\/d^{m^2}$. Multiplying by $d$ to account for the $d$ possible colors, we obtain:<br \/>\n\\[<br \/>\n\\mathbb{P}(X_i) = \\frac{1}{d^{m^2-1}}\\qquad\\text{for all }i<br \/>\n\\]Therefore, $S_1 = N\/d^{m^2-1}$. Setting $n=100,\\,m=4,\\,d=3$, we find:<br \/>\n\\[<br \/>\nS_1 = 0.065573\\%<br \/>\n\\]To compute $S_2$, we must compute the probability of the intersection of two events, i.e. the probability that two patches are simultaneously mono-colored. There are two possibilities:<br \/>\n\\[<br \/>\n\\mathbb{P}(X_i \\cap X_j) = \\begin{cases}<br \/>\n\\frac{1}{d^{2(m^2-1)}} &#038; \\begin{array}{l}\\text{if the patches don&#8217;t overlap}\\end{array}\\\\<br \/>\n\\frac{1}{d^{c-1}} &#038; \\begin{array}{l}<br \/>\n\\text{if the patches overlap and}\\\\<br \/>\n\\text{together cover }c\\text{ cells in total}<br \/>\n\\end{array}<br \/>\n\\end{cases}<br \/>\n\\]This follows because when patches don&#8217;t overlap, the events are independent, so the probability that both patches are mono-colored is the product of probabilities that each is mono-colored. When the patches overlap, however, they can only both be mono-colored if they are both the <em>same<\/em> color! And here, the probability depends on how much the patches overlap. I wrote a simple script that loops over all pairs of patches, computes the above probabilities, and sums them up. The result was:<br \/>\n\\[<br \/>\nS_2 = 0.0017071\\%<br \/>\n\\]We could continue in this fashion, computing $S_3,S_4,\\dots$ but this is a hopeless endeavor as we must account for the probability that $k$ patches are simultaneously mono-colored&#8230; which will get messy in a hurry. Instead, we&#8217;ll use an additional piece to the inclusion-exclusion principle, called the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Boole%27s_inequality#Bonferroni_inequalities\">Bonferroni inequality<\/a>, which states that as we add and subtract the $S_k$ terms on our way to computing $\\mathbb{P}(X)$, they alternate between providing upper and lower bounds!. This means that we can use $S_1$ and $S_2$ to compute an approximation:<br \/>\n\\[<br \/>\nS_1-S_2 \\le \\mathbb{P}(X) \\le S_1<br \/>\n\\]Substituting the numbers we computed above, we obtain:<\/p>\n<p style=\"text-align: center;\"><span style=\"background-color: #AFC8E6; padding: 25px 10px 25px 10px;\">$\\displaystyle<br \/>\n0.063866\\% \\lt \\mathbb{P}(\\text{rug is rejected}) \\lt 0.065573\\%<br \/>\n$<\/span><\/p>\n<p>It&#8217;s reasonable to expect that $S_k$ will continue to decrease as $k$ increases. If we assume that the $S_k$ will decrease geometrically, that would give us $S_3 \\approx 0.000044\\%$, which would tighten our bound to $0.06387 \\lt \\mathbb{P}(X) \\lessapprox 0.06391\\%$. So if I had to guess a single number, it would be that the probability of rejection is about $0.0639\\%$.<\/p>\n<p>For the second part of the problem, we are asked about how many colors we should use in order to be ensured that we can manufacture one million rugs and have no rejections. Each rug is independently manufactured, so if we call $p$ the probability of one rug being rejected, the probability that no rugs get rejected after one million rugs are made is:<br \/>\n\\[<br \/>\n\\mathbb{P}(\\text{no rejections}) = (1-p)^{1,000,000}<br \/>\n\\]Since we don&#8217;t know the exact value of $p$, we can again compute upper and lower bounds. Since we want a conservative estimate of how many colors to use, it makes sense to <em>under-estimate<\/em> the probability of having no rejections. This amounts to using our <em>upper bound<\/em> for $p$. Here is a plot showing the probability of no rejections in the entire batch, as a function of $d$, the number of colors used.<\/p>\n<p><a href=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs-1024x681.png\" alt=\"\" width=\"840\" height=\"559\" class=\"aligncenter size-large wp-image-2395\" srcset=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs-1024x681.png 1024w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs-300x199.png 300w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs-768x510.png 768w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs.png 1103w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><\/a><\/p>\n<p>Remember that these are <em>lower bounds<\/em> on the probability that no rugs get rejected. The true probability will be higher. Based on this fact, we&#8217;ll likely be safe if we pick $6$ colors.<\/p>\n<p>Here are the probabilities in tabular form, and I also included the corresponding upper bound using the Bonferroni inequality discussed earlier.<\/p>\n<table style=\"width:75%\">\n<tr>\n<th>Colors<\/th>\n<th>lower bound on probability<\/th>\n<th>upper bound on probability<\/th>\n<\/tr>\n<tr>\n<td>$3$<\/td>\n<td>$0$<\/td>\n<td>$3.9\\times 10^{-8}$<\/td>\n<\/tr>\n<tr>\n<td>$4$<\/td>\n<td>$0.01564\\%$<\/td>\n<td>$93.320\\%$<\/td>\n<\/tr>\n<tr>\n<td>$5$<\/td>\n<td>$73.4684\\%$<\/td>\n<td>$99.901\\%$<\/td>\n<\/tr>\n<tr>\n<td>$6$<\/td>\n<td>$98.0188\\%$<\/td>\n<td>$99.996\\%$<\/td>\n<\/tr>\n<tr>\n<td>$7$<\/td>\n<td>$99.802\\%$<\/td>\n<td>$100\\%$<\/td>\n<\/tr>\n<tr>\n<td>$8$<\/td>\n<td>$99.973\\%$<\/td>\n<td>$100\\%$<\/td>\n<\/tr>\n<tr>\n<td>$9$<\/td>\n<td>$99.9954\\%$<\/td>\n<td>$100\\%$<\/td>\n<\/tr>\n<tr>\n<td>$10$<\/td>\n<td>$99.9991\\%$<\/td>\n<td>$100\\%$<\/td>\n<\/tr>\n<\/table>\n<p>The upper bound makes a big jump when we add a fourth color, and as discussed previously, this bound is likely the tighter of the two. So although we argued for 6 colors before, this is probably overly conservative and we could get away with using 5 colors.<\/p>\n<\/div>\n<p><!--\nHere is my solution:\n<a href=\"javascript:Solution('soln_riddler_rugs','toggle_riddler_rugs')\" id=\"toggle_riddler_rugs\">[Show Solution]<\/a>\n\n\n<div id=\"soln_riddler_rugs\", style=\"display: none\">\n\nIn order to compute an exact probability of a rug being rejected, we would have to count the total number of defective rugs and divide it by the total number of rugs. This turns out to be very difficult, so we'll take an approximate approach instead to get a rough estimate of the probability. Let's assume the rug is $n\\times n$, there are $d$ distinct color choices, and we're looking to avoid $m\\times m$ same-color patches. Let's assume the rug has its lower-left corner at the origin and its cells are indexed as $(i,j)$ with $1 \\le i,j \\le n$. Let's define the following indicator random variable:\n\\[\nX_{ij} = \\begin{cases}\n1 & \\begin{array}{l}\\text{if there is a }m\\times m \\text{ same-color patch}\\\\\\text{with its lower-left corner at }(i,j)\\end{array} \\\\\n0 & \\text{otherwise} \\end{cases}\n\\]The sum $X = \\sum_{i,j} X_{ij}$ is the total number of same-color patches. We reject the rug if $X \\ge 1$. Now computing the probability that $X\\ge 1$ is quite challenging, but computing its expectation is straightforward if we use <a href=\"https:\/\/brilliant.org\/wiki\/linearity-of-expectation\/\">linearity of expectation<\/a>. Here is the calculation:\n\\[\n\\mathbb{E}[X] = \\mathbb{E}\\left[ \\sum_{i,j} X_{ij} \\right] = \\sum_{i,j} \\mathbb{E}[X_{ij}] = (n-m+1)^2 \\mathbb{E}[X_{ij}]\n\\]This happens because each expectation is the same (all cells are chosen uniformly at random) and there are $(n-m+1)^2$ different possible lower-left corners for a square patch of size $m\\times m$. To compute $\\mathbb{E}[X_{ij}]$, this is simply the probability that a particular $m\\times m$ patch is mono-colored. Each color occurs with probability $1\/d$ and there are $m^2$ cells. So the probability that all cells are of a particular color is $1\/d^{m^2}$, and we multiply by $d$ to account for the $d$ different colors. Putting everything together,\n\\[\n\\mathbb{E}[X] = \\frac{(n-m+1)^2}{d^{m^2-1}}\n\\]Note that there were no approximations made! Linearity of expectation holds even though the different $X_{ij}$ are correlated! Nevertheless, we haven't actually computed the answer. We computed the expected number of defects in a given rug, not the probability that a rug is defective. In order to compute the actual probability, we would need to know the distribution (which is hard). Instead, we'll use <a href=\"https:\/\/en.wikipedia.org\/wiki\/Markov%27s_inequality\">Markov's inequality<\/a> to approximate the probability. Markov's inequality states that for any nonnegative random variable $X$ and any $a>0$, we have:\n\\[\n\\mathbb{P}(X \\ge a) \\le \\frac{1}{a} \\mathbb{E}[X]\n\\]Setting $a=1$, we obtain the bound:\n\n\n\n<p style=\"text-align: center;\"><span style=\"background-color: #AFC8E6; padding: 25px 10px 25px 10px;\">$\\displaystyle\n\\mathbb{P}(\\text{rug is rejected}) \\le \\frac{(n-m+1)^2}{d^{m^2-1}}\n$<\/span><\/p>\n\n\n\nMarkov's inequality is a tail bound that holds for <em>any<\/em> probability distribution. Therefore, the answer we obtained above is a conservative estimate on the probability of rejection. Plugging in the problem data $n=100$, $m=4$, $d=3$, we find that the probability of a rug being rejected is at most $0.0006557$, or about $0.066\\%$.\n\nFor the second part of the problem, we are asked about how many colors we should use in order to be ensured that we can manufacture one million rugs and have no rejections. Each rug is independently manufactured, so if we call $p$ the probability of one rug being rejected, the probability that no rugs get rejected after one million rugs are made is:\n\\[\n\\mathbb{P}(\\text{no rejections}) = (1-p)^{1,000,000}\n\\]Here is a plot showing the probability of no rejections in the entire batch, as a function of $d$, the number of colors used.\n\n<a href=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs-1024x681.png\" alt=\"\" width=\"840\" height=\"559\" class=\"aligncenter size-large wp-image-2395\" srcset=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs-1024x681.png 1024w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs-300x199.png 300w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs-768x510.png 768w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2018\/08\/Riddler_Rugs.png 1103w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><\/a>\n\nHere are the probabilities in tabular form:\n\n\n\n<table style=\"width:60%\">\n  \n\n<tr>\n    \n\n<th>Colors<\/th>\n\n\n    \n\n<th>Probability<\/th>\n\n\n  <\/tr>\n\n\n  \n\n<tr>\n    \n\n<td>$4$<\/td>\n\n\n    \n\n<td>$0.01564\\%$<\/td>\n\n\n  <\/tr>\n\n\n  \n\n<tr>\n    \n\n<td>$5$<\/td>\n\n\n    \n\n<td>$73.4684\\%$<\/td>\n\n\n  <\/tr>\n\n\n  \n\n<tr>\n    \n\n<td>$6$<\/td>\n\n\n    \n\n<td>$98.0188\\%$<\/td>\n\n\n  <\/tr>\n\n\n  \n\n<tr>\n    \n\n<td>$7$<\/td>\n\n\n    \n\n<td>$99.802\\%$<\/td>\n\n\n  <\/tr>\n\n\n  \n\n<tr>\n    \n\n<td>$8$<\/td>\n\n\n    \n\n<td>$99.973\\%$<\/td>\n\n\n  <\/tr>\n\n\n  \n\n<tr>\n    \n\n<td>$9$<\/td>\n\n\n    \n\n<td>$99.9954\\%$<\/td>\n\n\n  <\/tr>\n\n\n  \n\n<tr>\n    \n\n<td>$10$<\/td>\n\n\n    \n\n<td>$99.9991\\%$<\/td>\n\n\n  <\/tr>\n\n\n<\/table>\n\n\n\nRemember that the $p$ we used in these calculations (the probability of a rug being rejected) was an <em>upper bound<\/em>, which means the true value is smaller. In computing the probability of having no defective rugs, we're using $(1-p)$, which means the probabilities in the table above are <em>lower bounds<\/em> (the true probabilities are higher!). Given that the Markov inequality is typically quite conservative, we'll probably be safe if we pick $6$ colors, but we can pick $7$ or $8$ if we're paranoid.\n--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This Riddler puzzle is about rug manufacturing. How likely are we to avoid defects if manufacture the rugs randomly? A manufacturer, Riddler Rugs\u2122, produces a random-pattern rug by sewing 1-inch-square pieces of fabric together. The final rugs are 100 inches by 100 inches, and the 1-inch pieces come in three colors: midnight green, silver, and &hellip; <a href=\"https:\/\/laurentlessard.com\/bookproofs\/rug-quality-control\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Rug quality control&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":2395,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[7],"tags":[18,8,2,34],"class_list":["post-2381","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-riddler","tag-linearity-of-expectation","tag-probability","tag-riddler","tag-statistics"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/2381","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/comments?post=2381"}],"version-history":[{"count":16,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/2381\/revisions"}],"predecessor-version":[{"id":2402,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/2381\/revisions\/2402"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/media\/2395"}],"wp:attachment":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/media?parent=2381"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/categories?post=2381"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/tags?post=2381"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}