Optimize table layout

algorithms, optimization

I have a set of cells arranged in a table. I need to minimize the table's height by adjusting column widths.

Each cell has a fixed area that must not be encroached upon as its width and height are adjusted. In other words, given a final row height $h_i$, a final column width $w_j$, and an initial cell area $a_{ij} \in A$, this must hold true:

$$ \forall a_{ij} \in A : a_{ij} \leq h_i \cdot w_j $$

Since it's a table, each cell in a column has the same width and each cell in a row has the same height. Additionally, each row has the same width which is a chosen parameter $W$ (the width of the table). Thus:

$$ W = \sum_j w_j $$

…and the table will have this overall height:

$$ H = \sum_i h_i $$

So given $A$ (and knowing its dimensions), I need to compute all $w_j$ such that $H$ is minimized.
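To make the objective concrete, here is a minimal sketch in Asymptote (the language also used in the answer's code below; the function name is invented for illustration) that computes the height forced by a given choice of column widths:

real tableHeight(real[][] a, real[] w)
{
    real H = 0;
    for(int i = 0; i < a.length; ++i)
    {
        real h = 0; // row i must be tall enough for its largest demand
        for(int j = 0; j < w.length; ++j)
            h = max(h, a[i][j]/w[j]); // enforces a_ij <= h_i * w_j
        H += h;
    }
    return H;
}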

Minimum height of two-column table

Consider a two-column table with cell areas like the ones below. For simplicity, the table has a total width of 1. $p$ is the width of the first column; $1-p$ is the width of the second column; and column widths cannot be zero (so $0 < p < 1$):

  p  1-p
|<->|<->|

+---+---+
| a | b |
+---+---+
| c | d |
+---+---+
| e | f |
+---+---+
|...etc |

The height of the first row will be:
$$ \cases{
p \leq \frac{a}{a+b} : \frac{a}{p}
\\ p > \frac{a}{a+b} : \frac{b}{1-p}
} $$

…and of the second:
$$ \cases{
p \leq \frac{c}{c+d} : \frac{c}{p}
\\ p > \frac{c}{c+d} : \frac{d}{1-p}
} $$

…and so on. Notice how the left cell's area is considered (with one denominator) when $p$ is small enough; otherwise the right cell's area is used (with a different denominator).

Let's suppose that for a given $p$ these cells' areas are used: $(a, d, e, \ldots)$. This would be the table's height:
$$ \frac{a}{p} + \frac{d}{1-p} + \frac{e}{p} + \ldots $$

Let's take a moment to generalize this. Add up all the areas chosen from the left column and call that sum $l$; likewise let $r$ be the sum of all areas chosen from the right column. Thus:
$$ H = \frac{l}{p} + \frac{r}{1-p} $$

Now we want to minimize the table's height by finding the best $p$. So take the derivative and set it to zero:
$$ 0 = \frac{d}{dp} H = \frac{r}{(1-p)^2} - \frac{l}{p^2} $$
Multiplying through by $p^2(1-p)^2$ and expanding:
$$ 0 = r \cdot p^2 - l \cdot (1-p)^2 = (r-l) \cdot p^2 + 2l \cdot p - l $$

Here are the solutions to this quadratic equation:
$$ p = \cases{
l \neq r : \frac{-2l \pm \sqrt{4l^2 + 4l(r-l)}}{2(r-l)}
\\ l = r : 0.5
} $$
The discriminant is $4lr$, so the root lying in $(0,1)$ simplifies to $p = \frac{\sqrt{l}}{\sqrt{l}+\sqrt{r}}$ (which also covers the $l = r$ case).

Plug each of the solved $p$ back into $H$ to figure out which is best.

So now all you have to do is decide, for a given range of $p$, which cells contribute to $l$ and which cells contribute to $r$, and then use the above equations. The best table height from all ranges of $p$ is the global minimum table height.

I say "for a given range of $p$" because for every row we know the range of $p$ for which the left cell should be considered versus the right cell. For example, we know that cell $a$ should be added to $l$ when $p \leq \frac{a}{a + b}$. That means the first row contributes two possible ranges of $p$ that need to be checked (and $\frac{a}{a + b}$ is the boundary). The second row contributes another two possible ranges (with the boundary at $\frac{c}{c + d}$). And so on. In each range different cell areas are contributing to $l$ and the rest are contributing to $r$.

In other words, if there are $x$ table rows then there are $x$ boundary values, which split $(0,1)$ into at most $x+1$ ranges, and therefore up to $x+1$ different equations for $H$ that you need to solve to find the minimum height of a two-column table.
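Here is a hedged sketch of that search in Asymptote (the names are made up for illustration): sort the boundaries, and in each range accumulate $l$ and $r$, take the closed-form $p$ clamped to the range, and keep the best height.

real bestTwoColumnHeight(real[] a, real[] b)
{
    real[] cuts = {0, 1};
    for(int i = 0; i < a.length; ++i)
        cuts.push(a[i]/(a[i] + b[i])); // boundary contributed by row i
    cuts = sort(cuts);
    real best = inf;
    for(int k = 0; k + 1 < cuts.length; ++k)
    {
        real mid = (cuts[k] + cuts[k+1])/2;
        real l = 0, r = 0; // areas charged to the left/right column
        for(int i = 0; i < a.length; ++i)
            if(mid <= a[i]/(a[i] + b[i])) l += a[i]; else r += b[i];
        // closed-form minimizer of l/p + r/(1-p), clamped to this range
        real p = sqrt(l)/(sqrt(l) + sqrt(r));
        p = min(max(p, cuts[k]), cuts[k+1]);
        if(p > 0 && p < 1)
            best = min(best, l/p + r/(1 - p));
    }
    return best;
}

With $x$ rows this checks at most $x+1$ ranges at $O(x)$ work each, which is where the O(rows^2) figure in the second false start below comes from.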

But I do not know how to generalize this to more than two columns.

False starts

#1

Here's an algorithm which at first glance might seem to do the trick; a code sketch of its core steps follows the list. But it only works for certain table configurations. For example, it does not work when the diagonal cells begin as "king" cells (a term explained below).

  1. Lay out the table so that all rows are tightly stacked (meaning no row exists in which every cell in that row has elbow room). At this point it doesn't matter how wide the table is. As a consequence, some columns will be too wide
  2. Select the first column
  3. For every cell in the column, calculate the amount the column can be shrunk, $\Delta w_{ij} = w_j - a_{ij} / h_i$, such that this cell is left with no elbow room
  4. Find the minimum $\Delta w_{ij}$ of the column (if it is zero, the column already contains a "king" cell and cannot shrink)
  5. Shrink the column by that amount
  6. Select the next column and repeat from #3 until all columns have been adjusted
  7. Scale the table to the desired width, preserving relative column proportions
  8. Recalculate row heights based on the final column widths
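For concreteness, here is the promised sketch of steps 3 through 5 for a single column (a hypothetical helper; it assumes row heights h[] from the tight initial layout and relies on Asymptote passing arrays by reference):

void shrinkColumn(real[][] a, real[] h, real[] w, int j)
{
    real dw = inf;
    for(int i = 0; i < h.length; ++i)
        dw = min(dw, w[j] - a[i][j]/h[i]); // elbow room of cell (i,j)
    if(dw > 0) w[j] -= dw; // zero slack: a "king" cell is already present
}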

This comes from the intuition that when a table's rows all have minimum height, each row will have at least one "king" cell: a cell with no elbow room, whose row must grow taller if its column is collapsed further. Therefore the table has to get taller if any "king" cell's column is shrunk. But that only covers columns in which a "king" cell is present. The goal of this algorithm is to get "king" cells into all columns.

Once there's a "king" cell in each row and in each column, one would think that no column can be shrunk without a net increase in table height: an increase in one row's height could not be compensated by a decrease in another row's height, because all rows would already have minimum height.

But that's an incorrect intuition. While it may be true that no single column can be shrunk in isolation, it may still be possible to alter the widths of several columns together in such a way that the total table height is reduced.

Regardless, I do believe it's the case that the optimal column widths are still optimal when scaled together. So I believe steps 7 and 8 are valid.

To illustrate why this algorithm does not work, consider this 2×2 table:

+---+---+
| a |   |
+---+---+
|   | b |
+---+---+

In this case, the table has two empty cells on one diagonal and two populated cells on the other diagonal. Thus these two cells are guaranteed to be king cells, and the algorithm will traverse the columns without altering anything. In other words, the original column arrangement (whatever that happens to be) is the final column arrangement. The algorithm does nothing but push the problem of optimizing table layout elsewhere.

In this specific case it's possible to demonstrate that the ideal ratio of first column width to second column width is $\sqrt{a} : \sqrt{b}$. Yet this isn't the ideal ratio for all tables. So the problem remains unsolved in general.
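For the record, that ratio drops out of the two-column analysis above with $l = a$ and $r = b$:
$$ H(p) = \frac{a}{p} + \frac{b}{1-p}, \qquad p = \frac{\sqrt{a}}{\sqrt{a}+\sqrt{b}} \implies w_1 : w_2 = \sqrt{a} : \sqrt{b} $$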

#2

Given that the optimal column distribution for a two-column table may be found in O(rows^2) time (see above), I was hopeful for an easy way to append columns. But this doesn't appear to be feasible.

To illustrate this, consider this optimal table (roughly at scale):

+-+-------------+
|1|             |
+-+-------------+
| |             |
| |             |
| |     169     |
| |             |
| |             |
+-+-------------+

Since it's optimal, the row heights are in the ratio $\sqrt{1} : \sqrt{169} = 1 : 13$, making the first row $1/14 \approx 7.1\%$ of the table height.

What happens when we append the following column to it?

+-----+
| 1e6 |
+-----+
|  0  |
+-----+

169 is peanuts compared to 1e6. And what are we going to do: place it in a row that's only 7.1% of the total table height while the other 92.9% goes to the cell with 169? Of course not! We'll want to give the second column proportionally more width so that its cell gets shorter and the 1e6 can grow taller and skinnier. And as the 1e6 grows taller we can give the first column proportionally less width (so that the height of the 1 matches the height of the 1e6).

In other words, appending a column requires laying out the entire table again. Meaning to lay out a three-column table you need to already know how to lay out a three-column table. That doesn't really get us anywhere. And for the general case I think that would work out to O(rows^2 * columns!) time complexity.

Best Answer

I tried to implement Rahul's suggestion to view it as a convex optimization problem. The results are mixed. I can easily do small tables like 30 by 30, but a 300 by 300 table can be done only to about 1% precision if you are willing to wait one minute, and getting beyond that will take an eternity. That is primarily due to the inefficiency of the solution finder I'm using (which is more or less just cycling over the variables and optimizing over certain subsets of them; I wish I could find a better way, or at least accelerate convergence somewhat). Nevertheless it is a good exercise in convex programming, so I'll post the details here. The algorithm can be modified to take into account "natural" restrictions of the kind $w_j\ge W_j$ or $h_i\ge H_i$ (width/height should not be too small), and the modification performs at pretty much the same rate as far as I can tell from simulations, but I'll restrict myself to the original question here.

Let $w_j$ be the unknown widths and $a_{ij}$ be the known areas. We want to minimize $\sum_i\max_j \frac{a_{ij}}{w_j}$. It is useful to consider the dual problem. I will spare you the general theory of duality and will just note that $$ \max_j \frac{a_{ij}}{w_j}=\max\left\{\sum_j b_{ij}\frac{a_{ij}}{w_j}:b_{ij}\ge 0, \sum_j b_{ij}=1\right\} $$ so if we consider all admissible vectors $w=(w_1,\dots,w_n)$ (non-negative entries, total sum $1$) and all admissible matrices $b=(b_{ij})$ (non-negative entries, all row sums equal to $1$), we can write our problem as that of finding $$ \min_w\max_b \sum_{i,j} b_{ij}\frac{a_{ij}}{w_j}\,. $$ The dual problem to that is finding $$ \max_b \min_w\sum_{i,j} b_{ij}\frac{a_{ij}}{w_j}\,. $$ The inner $\min_w$ here is easy to find: if we denote $S_j=\sum_i b_{ij}a_{ij}$, then it is just $(\sum_j \sqrt{S_j})^2$, with the unique optimal $w_j$ proportional to $\sqrt{S_j}$.
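That inner minimum is cheap to evaluate (it follows from Cauchy-Schwarz, with equality when $w_j \propto \sqrt{S_j}$). A small sketch, assuming `S` has already been computed from an admissible $b$:

real dualLowerBound(real[] S) { return sum(map(sqrt, S))^2; }

real[] optimalWidths(real[] S)
{
    real[] w = map(sqrt, S); // w_j proportional to sqrt(S_j)
    return w/sum(w);         // normalize so the widths sum to 1
}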

There are two things one should understand about duality. The first is that every admissible matrix $b$ (computed or just pulled out of thin air) can serve as a certificate of the impossibility of doing better than a certain number in the original problem, i.e., the minimax is never less than the maximin. This is pretty trivial: just use the given $b$ to estimate the minimax from below. The second is that the true value of the minimax is actually the same as the true value of the maximin (under some mild assumptions that certainly hold in our case). This is a somewhat non-trivial statement.
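In symbols, the trivial direction says that for any fixed admissible $\bar b$,
$$ \min_w\max_b \sum_{i,j} b_{ij}\frac{a_{ij}}{w_j} \;\ge\; \min_w \sum_{i,j} \bar b_{ij}\frac{a_{ij}}{w_j} \;=\; \Big(\sum_j \sqrt{S_j}\Big)^2, \qquad S_j=\sum_i \bar b_{ij}a_{ij}\,. $$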

Together these two observations allow one to use the following strategy. We shall try to solve the dual problem. For every approximation $b$ to the solution, we will look at the easily computable lower bound $(\sum_j\sqrt{S_j})^2$ it produces and at the corresponding minimizer $w$. For that $w$ we can easily compute the original function $\sum_i\max_j\frac{a_{ij}}{w_j}$. If its value is reasonably close to the lower bound, we know that we should look no further.

Now, of course, the question is how to maximize $\sum_j\sqrt S_j$ under our constraints on $b$. It doesn't look like an attractive problem because the number of unknowns increased from $n$ to $mn$. Still, one can notice that if we fix all rows of $b$ except, say, the $i$'th one, then the optimization of the $i$'th row is rather straightforward. Indeed, the corresponding problem is of the following kind:

**Find $\max\sum_j\sqrt{a_jb_j+c_j}$ where $a_j,c_j\ge 0$ are given and $b_j\ge 0$ are the unknowns subject to the constraint $\sum_j b_j=1$.** Using the standard Lagrange multiplier mumbo-jumbo, we conclude that the optimal $b_j$ must satisfy the equations $\frac{a_{j}}{\sqrt{a_jb_j+c_j}}=\lambda$ whenever $b_j>0$ and the inequalities $\frac{a_{j}}{\sqrt{a_jb_j+c_j}}\le \lambda$ whenever $b_j=0$. Thus, the optimizer is just a vector $b_j=\max(\Lambda a_{j}-\frac{c_j}{a_j},0)$ with an unknown $\Lambda=\frac 1{\lambda^2}>0$ that should be found from the constraint $\sum_j b_j=1$. This is a one-variable equation for the root of a monotone function, so it can be easily solved in various ways.
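For instance, a bulletproof (if slower) alternative to the Newton-style update used in the code below is to bracket and bisect that monotone root; a hypothetical sketch, assuming all $a_j>0$:

real[] optimalRow(real[] A, real[] C)
{
    // g is nondecreasing in L; we seek g(L) = 1
    real g(real L)
    {
        real s = 0;
        for(int j = 0; j < A.length; ++j)
            s += max(L*A[j] - C[j]/A[j], 0);
        return s;
    }
    real lo = 0, hi = 1;
    while(g(hi) < 1) hi *= 2;      // bracket the root from above
    for(int it = 0; it < 60; ++it) // bisect
    {
        real mid = (lo + hi)/2;
        if(g(mid) < 1) lo = mid; else hi = mid;
    }
    real[] B;
    for(int j = 0; j < A.length; ++j)
        B.push(max(hi*A[j] - C[j]/A[j], 0));
    return B;
}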

Thus, we can optimize each row of $b$, with the other rows fixed, rather quickly. The natural idea is then just to cycle over the rows, optimizing each one in turn. It takes about 20 full cycles to get the lower bound and the value of the function within 1% of each other on a random matrix (structured matrices seem to do even better) up to sizes of 300 by 300 at least.

This is the description. The code (in Asymptote) is below.

srand(seconds());

int m=50, n=55; // table size: m rows, n columns

real[][] a, b; // a = cell areas; b = admissible matrix (row sums 1)
for(int i=0;i<m;++i)
{
    a[i]=new real[]; b[i]=new real[];
    for(int j=0; j<n; ++j)
    {
        a[i][j]=unitrand();
        b[i][j]=1/n;
    }
    //a[i][rand()%n]=2;
    a[i]*=unitrand();
}

real[] p, S; // p = column widths; S[j] = sum_i b[i][j]*a[i][j]

for(int k=0;k<101;++k) // full cycles over the rows of b
{
    for(int j=0;j<n;++j)
    {
        real s=0;
        for(int i=0;i<m;++i)
            s+=a[i][j]*b[i][j];
        S[j]=s;
        p[j]=sqrt(S[j]);
    }
    if(k%10==0)
    {
        write("*** Iteration "+string(k)+" ***");
        write(sum(map(sqrt,S))^2); // dual lower bound (sum_j sqrt(S_j))^2
    }

    p/=sum(p); // normalize: widths proportional to sqrt(S_j), total 1

    real s=0; // primal value: sum_i max_j a[i][j]/p[j]
    for(int i=0;i<m;++i)
    {
        real M=0; 
        for(int j=0;j<n;++j)
        {
            real h=a[i][j]/p[j];
            if(h>M)
                M=h;
        }
        s+=M;
    }
    if(k%10==0)
        write(s);
    //pause();

    for(int i=0;i<m;++i) // re-optimize row i of b, other rows fixed
    {
        real[] A,V,C,B; // V[j] = S[j] minus row i's term; C[j] = V[j]/A[j]
        for(int j=0;j<n;++j)
        {
            A[j]=a[i][j];
            V[j]=S[j]-a[i][j]*b[i][j];
            C[j]=V[j]/a[i][j];
        }
        real aa=(sum(C)+1)/sum(A); // initial guess for Lambda
        real da=1;
        while(da>1/10^10) // Newton iteration: drive sum(B) to 1
        {
            for(int j=0;j<n;++j)
            {
                B[j]=aa*A[j]-C[j];
                if(B[j]<0)
                {
                    A[j]=0;
                    B[j]=0;
                }
            }
            da=sum(B)-1; aa-=da/sum(A); 
        }
        for(int j=0;j<n;++j)
        {
            b[i][j]=B[j];
            S[j]=V[j]+a[i][j]*B[j];
        }
    }
}

write("************");

pause();