This Riddler problem is about gerrymandering. How to redraw borders to sway a vote one way or another.

Below is a rough approximation of Colorado’s voter preferences, based on county-level results from the 2012 presidential election, in a 14-by-10 grid. Colorado has seven districts, so each would have 20 voters in this model. In each district, the party with the most votes will win. The districts must be non-overlapping and contiguous (that is, each square in a district must share an edge with at least one other square in the district). What is the most districts that the Red Party could win? What about the Blue Party? (Assume ties within a district are considered wins for the party of your choice.)

Two boards are provided, a test 5×5 grid and the larger 14×10 grid:

Here is a first solution, using some simple logic and intuition:

[Show Solution]

- Red wins each district that contains at least 3 red votes. With 16 total red votes, if we can distribute them 3+3+3+3+4 then Red wins every district. Clearly Red can’t do any better than this!
- Blue wins each district that contains at least 3 blue votes. With 9 total blue votes, we can win at most 3 districts, which happens if we distribute the votes 3+3+3+0+0.

The next question is: are these best-case scenarios achievable using contiguous partitions? It turns out they are! Here are partitions that achieve the most possible Red wins and most possible Blue wins:

Likewise, we can examine the larger case and proceed in a similar manner. This time, there are 51 blue votes and 89 red votes, and we must make 7 districts of 20 votes each.

- Red wins each district that contains at least 10 red votes. With 89 total red votes, there are enough to put 10 in each of the 7 districts and have plenty left over. If we can do this, then Red wins all 7 districts.
- Blue wins each district that contains at least 10 blue votes. With only 51 total blue votes, we can win at most 5 districts and we’ll have one blue vote left over.

It turns out that in this case, partitions achieving these best-case scenarios do exist, and it’s relatively easy to find one using trial-and-error. Here are examples of winning partitions (click to enlarge):

And here is a computational approach, using integer programming:

[Show Solution]

One way to solve this problem numerically is to formulate it as an integer programming problem. The benefit is that it will always find an optimal solution eventually, but the downside is that it may take awhile. My code solved the 5-by-5 case in a matter of seconds, but the 14-by-10 case computed for over 12 hours with no solution, so I gave up. Of course, it may be possible to solve the larger case using this method, but it would require either a more efficient model or a faster computer!

The idea is to think of the grid cells as nodes of a directed graph. We place an edge between two nodes (one in each direction) if the cells are adjacent. Number the cells $1,2,\dots,C$ and number the edges $1,2,\dots,E$. Define the adjacency matrix $A \in \{0,1\}^{C\times C}$, the incidence matrix $B\in \{0,1\}^{C\times E}$, and the vector $v\in\{0,1\}^C$ as follows:

\begin{align}

A_{ij} &= \begin{cases}

1&\text{if there is an edge from cell }i\text{ to cell }j\\

0&\text{otherwise}

\end{cases}\\[2mm]

B_{ij} &= \begin{cases}

1&\text{if edge }j\text{ starts at cell }i\\

-1&\text{if edge }j\text{ ends at cell }i\\

0&\text{otherwise}

\end{cases}\\[2mm]

v_i &= \begin{cases}

1&\text{if cell }i\text{ is blue}\\

0&\text{if cell }i\text{ is red}

\end{cases}

\end{align}Suppose our districts are numbered $1,2,\dots,D$. The first set of variables in our model are $x\in\{0,1\}^{C\times D}$ and $w\in\{0,1\}^D$ and they represent:

\begin{align}

x_{ij} &= \begin{cases}

1&\text{if cell }i\text{ belongs to district }j\\

0&\text{otherwise}

\end{cases}\\[2mm]

w_j &= \begin{cases}

1&\text{if district }j\text{ is a winner}\\

0&\text{otherwise}

\end{cases}

\end{align}We then have several intuitive constraints which we can encode algebraically as linear constraint for our model:

- Each cell belongs to exactly one district:

\[\sum_{j=1}^D x_{ij} = 1\quad\text{ for }i=1,\dots,C\] - Each district contains exactly $S$ cells:

\[\sum_{i=1}^C x_{ij} = S\quad\text{ for }j=1,\dots,D\] - A district wins if and only if there are at least $W$ votes:

\[W w_j \le \sum_{i=1}^C x_{ij} v_i \le (S+1)w_j + (W-1)\quad\text{ for }j=1,\dots,D\]

Unfortunately, these constraints alone don’t force the districts to be contiguous. For this, we must make use of the graph structure. The idea is to set it up as a flow problem. Imagine two new nodes, a universal source and a universal sink. There are edges from the source to every cell, and edges from every cell to the sink. Then imagine a flow that propagates along the edges. The source provides $S$ units of flow (at most one unit of flow per source edge), the sink must accept $S$ units of flow (and it must all occur on a single sink edge), and each cell in the graph conserves flow. These combined constraints ensure that selected cells are path-connected in the graph, which means they are contiguous in the grid. The variables we’ll use are $f \in \mathbb{Z}^{E\times D}$ and $f^\text{out} \in \{0,1\}^{C\times D}$, defined as:

\begin{align}

f_{ij} &= \begin{cases}

1&\text{if edge }i\text{ is used in the flow for district }j\\

0&\text{otherwise}

\end{cases}\\[2mm]

f^\text{out}_{ij} &= \begin{cases}

1&\text{if cell }i\text{ contains the flow to the sink for district }j\\

0&\text{otherwise}

\end{cases}

\end{align}We then have the constraints:

- Each edge has a flow capacity of $S$:

\[

0 \le f_{kj} \le S\quad\text{ for }k=1,\dots,E\text{ and }j=1,\dots,D

\] - At most one sink edge is used to return the flow:

\[

\sum_{i=1}^C f^\text{out}_{ij} = 1\quad\text{ for }j=1,\dots,D

\] - Flow is conserved at each cell (including flow involving source and sink)

\[

\sum_{k=1}^E B_{ik}f_{kj} + S f_{ij} = x_{ij}\quad\text{ for }i=1,\dots,C\text{ and }j=1,\dots,D

\] - If an edge is used, the associated cells must be selected

\[

x_{ij} \ge \alpha \sum_{k=1}^E |B_{ik}| f_{kj}\quad\text{ for }i=1,\dots,C\text{ and }j=1,\dots,D

\]

In the final constraint, $\alpha$ is a constant that must be chosen sufficiently small to be less than 1 for any amount of flow. Since there are at most 6 active edge (4 internal, 1 source, 1 sink), it suffices to choose $\alpha < \frac{1}{5S+1}$.

Finally, our objective function is to minimize (or maximize) the total number of winning districts. In other words, $w_1+\dots+w_D$. That’s it!

I used IJulia, JuMP, and Gurobi to solve the integer programs, and on my laptop it took on the order of seconds for the 5×5 case. Unfortunately, the 14×10 case ran for over 12 hours without terminating. So it seems my approach might not be viable for large versions of this problem. The solution my code found for the 5×5 was:

If you’re interested in seeing my code in full detail, you can view/download the notebook here.

Since ties can be counted to your advantage, you don’t need two different solutions. The best solution for the party with fewer squares is automatically the best for the other party as well since it is to the advantage of both parties to go for maximum ties. Your solution for ‘most blue wins’ also works for ‘most red wins’. Right?

I’m not sure I understand your comment — maximizing the number of districts where blue wins is different from maximizing the number of districts where red wins… Of course, the problems are similar in the sense that we could just swap red and blue and one problem becomes the other… but still for a particular voting map, you definitely get different results depending on whether you maximize red or blue.

I also tried integer programming for this, but I handled the contiguity constraint differently. In each district, I assigned one random county as the “seat” of that district. I introduced the decision variables y_ij to assign the seat of each district (it equals 1 if cell i is the seat of district j). Contiguity was added by requiring each cell to have at least one neighbor that was closer to the district seat. This required me to use quadratic constraints (there were y*x terms), but Gurobi handles that easily and was able to solve the 5-by-5 in less than a second.

Unfortunately, I couldn’t get an answer for the 14-by-10 either, despite running for several hours.

I wish I had tried this before, but I just now tried selecting 7 random cells to be district seats, and it can find an answer in around 5 seconds for most guesses. For bad choices, it keeps running, quickly finding district lines that win 4/7 seats, but spending what I assume would be hours or days proving that those lines are optimal given that choice of seats.

If I had a more rigorous way of pre-selecting the district seats, I would have a pretty robust solver for this problem.