Connection Between Two Lexicographic Rules in Simplex Method

Tags: linear-programming, optimization, simplex-method

In my study of the simplex method, I have come across two formulations of the lexicographic pivoting rule, and I am currently having trouble understanding how they are connected.

  1. In Section 3.4 of Introduction to Linear Optimization by Bertsimas and Tsitsiklis, the lexicographic rule is applied to the simplex tableau $B^{-1}[b\mid A]$ of the standard form problem
    $$
    \begin{align}
    \min&\quad c^\top x \\
    \text{s.t.}&\quad Ax=b \\
    &\quad x\ge 0.
    \end{align}
    $$
  2. Exercises 3.15 and 3.16 of the book present a second formulation of the lexicographic rule, which works on $B^{-1}[b\mid I]$. This approach is equivalent to solving the $\epsilon$-perturbed problem. The same formulation is also used in Chvátal's Linear Programming and Vanderbei's Linear Programming.

My question: How are the two formulations connected?

My understanding: In the proofs of finite termination of the simplex method with the lexicographic rule, a crucial assumption is that the initial basis matrix $B$ is such that all rows of $B^{-1}[b\mid A]$ (respectively, $B^{-1}[b\mid I]$) are lexicographically positive. Under this assumption, the termination proofs share the same spirit: (1) the lexicographic positivity of the rows is maintained in each iteration, and (2) a special quantity determined completely by the basis strictly increases lexicographically, so each basis is visited at most once.

While this assumption does not hold in general, Chvátal's book and Vanderbei's book get around it by working with the special form
$$
\min\quad c^\top x\quad
\text{s.t.}\quad Ax\le b,\quad x\ge0
$$

with $b\ge0$. Adding slack variables $s$ turns this into
$$
\min\quad c^\top x\quad
\text{s.t.}\quad Ax+s=b,\quad x,s\ge0,
$$

and choosing $s$ as the basic variables yields a very simple initial basis matrix, namely $B=I$, all of whose rows are lexicographically positive; together with $b\ge0$, this makes every row of $B^{-1}[b\mid I]$ lexicographically positive. In contrast, Bertsimas and Tsitsiklis take a different approach: by reordering the variables, we can always assume that the first $m$ columns of $B^{-1}A$ form the identity matrix, so every row of $B^{-1}[b\mid A]$ is lexicographically positive.
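As a quick numerical sanity check (a minimal sketch with made-up data; the matrix `A` and vector `b` are arbitrary), the slack basis gives $B=I$, so $B^{-1}[b\mid I]=[b\mid I]$, and with $b\ge0$ every row is indeed lexicographically positive:

```python
import numpy as np

def lex_positive(row, tol=1e-12):
    """A row is lexicographically positive if its first nonzero entry is positive."""
    for entry in row:
        if abs(entry) > tol:
            return entry > 0
    return False  # an all-zero row is not lexicographically positive

# Made-up data for Ax <= b with b >= 0.
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([4.0, 6.0])
m = len(b)

# Slack basis: B = I, so B^{-1}[b | I] = [b | I].
tableau = np.column_stack([b, np.eye(m)])
assert all(lex_positive(row) for row in tableau)
```

Note that even a row with $b_i = 0$ passes the check, because the first nonzero entry in that row is then the $1$ coming from the identity block.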

Is my understanding correct? Any other insights would be highly appreciated.

Best Answer

In the special case where we turn $Ax \le b$ into $Ax + s = b$, there are two equivalent formulations of the lexicographic pivoting rule.

  1. Change $b$ to $b+\epsilon$, where $1 \gg \epsilon_1 \gg \epsilon_2 \gg \dots \gg \epsilon_m > 0$.
  2. Make sure that the columns of the slack variables are first in our matrix: in equational form, we have $$[I \mid A] \begin{bmatrix}s \\ x\end{bmatrix} = b.$$ Then, only allow bases $B$ such that $B^{-1}[b \mid I \mid A]$ is lexicographically positive: in each row, the first nonzero entry is positive.

These are equivalent, because in $B^{-1}[b \mid I \mid A]$, the first $m+1$ entries of each row are exactly the coefficients of $1, \epsilon_1, \epsilon_2, \dots, \epsilon_m$ in $B^{-1}(b+\epsilon)$.

Note that we will never need to look beyond the first $m+1$ entries, because they cannot all be $0$: that could only happen if $B^{-1}I$ had an all-zero row, but that would make it singular, even though it has inverse $B$. So a rule that talks about lexicographic positivity of $B^{-1}[b \mid I]$ is equivalent to a rule that references the entire matrix $B^{-1}[b \mid I \mid A]$.


The same relationship holds more subtly when working with $Ax=b$, except here, we have to wait until we've found a starting feasible basis. Suppose we start the simplex method in basis $B_1$, where $(B_1)^{-1}b \ge 0$. Then the following two strategies are equivalent:

  1. Change $(B_1)^{-1}b$ to $(B_1)^{-1}b+\epsilon$, where $1 \gg \epsilon_1 \gg \epsilon_2 \gg \dots \gg \epsilon_m > 0$. (Or in other words, perturb the original system $Ax=b$ to $Ax = b + B_1 \epsilon$.)
  2. Reorder the columns of $A$ to put $B_1$ first: $A = [B_1 \mid N_1]$. Then, only allow bases $B$ such that the rows of $B^{-1}[b \mid A]$ are lexicographically positive. Note that this holds for $B = B_1$.

These are equivalent, because in $B^{-1}[b \mid A]$, the first $m+1$ entries of each row are the rows of $B^{-1}[b \mid B_1]$, which are exactly the coefficients of $1, \epsilon_1, \dots, \epsilon_m$ in $B^{-1}(b + B_1 \epsilon)$.

Note that we will, again, never need to look beyond the first $m+1$ entries: that could only happen if $B^{-1}B_1$ had an all-zero row, which would make it singular, but it has an inverse $(B_1)^{-1}B$.
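The same correspondence for the $Ax=b$ case can be checked with a small numerical sketch (made-up data; $B_1$ is an arbitrary invertible starting basis and the later basis $B$ is chosen by hand). Since $B^{-1}(b + B_1\epsilon) = B^{-1}b + (B^{-1}B_1)\epsilon$, with the columns reordered as $A = [B_1 \mid N_1]$ the first $m+1$ columns of $B^{-1}[b \mid A]$ carry exactly the perturbation coefficients:

```python
import numpy as np

# Made-up data: A = [B1 | N1] with B1 an invertible starting basis.
B1 = np.array([[2.0, 0.0],
               [1.0, 1.0]])
N1 = np.array([[1.0, 3.0],
               [2.0, 1.0]])
A = np.column_stack([B1, N1])
b = np.array([4.0, 5.0])
m = 2

# Some later basis B: columns 1 and 2 of A (one column from B1, one from N1).
B = A[:, [1, 2]]
Binv = np.linalg.inv(B)

tableau = Binv @ np.column_stack([b, A])   # B^{-1}[b | A]

# B^{-1}(b + B1 eps) = B^{-1}b + (B^{-1}B1) eps, so the first m+1
# columns of B^{-1}[b | A] are B^{-1}[b | B1], the perturbation coefficients.
assert np.allclose(tableau[:, 0], Binv @ b)       # coefficients of 1
assert np.allclose(tableau[:, 1:m+1], Binv @ B1)  # coefficients of eps_j
```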


I do not see a justification for lexicographic pivoting in the following form: take a system $Ax=b$ with its columns in whatever order we got them in, and only allow bases $B$ such that the rows of $B^{-1}[b \mid A]$ are lexicographically positive.

(It's possible that it works, but in my mind only the perturbation method has an intuitive justification for why it works, and the lexicographic positivity method is justified by being equivalent to the perturbation method. However, that equivalence only holds if we're careful about the column ordering, as above.)

However, this is not an issue, since we can never even get started with the simplex method unless we have an initial feasible basis $B_1$ to use.