Analytically solve linear program with single linear equality constraint (+ bounded requirement)

Tags: lagrange-multiplier, linear-programming, optimization

Consider the following simple linear program with one equality constraint and a simple set of inequalities bounding the variables:
\begin{align}
\max_{x_1,\dots,x_K} & \sum_{k=1}^K a_kx_k \\
\text{subject to:} \; & \sum_{k=1}^K p_kx_k = b \\
& x_k \in [0,1] \; \forall \; k
\end{align}

My goal is to characterize the set of $\{(x^*_1,\dots,x^*_K)\}$ that achieve the maximum of this program, and I am struggling to do this. In case it helps, $a_k$ are all distinct and $\sum p_k =1, p_k \geq 0$ (all of these constants are known). I'm not interested in numerically solving this, and was wondering if there is a way to analytically identify the set that achieves the max.

Attempt:
I remember the Lagrangian method from my first year of college, which I believe involves considering
$$\mathcal{L}(x_1,\dots,x_K,\lambda) = \sum_{k=1}^K a_kx_k - \lambda\left(\sum_{k=1}^K p_kx_k - b\right)$$
but setting the partial derivative with respect to each $x_k$ to zero simply gives
$$a_k - \lambda p_k = 0 \; \forall \; k$$
which implies that
$$\lambda = \frac{a_k}{p_k}$$
and I don't see how that can hold for all $k$, so I must be doing something wrong?

Could someone please advise on how I could go about analytically characterizing the set $\{(x^*_1,\dots,x^*_K)\}$ that achieve the max in this linear objective function/single linear equality constraint setting? I can do this for simple examples, but don't understand how to generalize.

Best Answer

Presumably $0 \le b \le \sum_i p_i$ (otherwise there is no feasible solution).

We may assume the indices $1, \ldots, K$ are sorted in order of decreasing $a_k/p_k$ (where this is taken as $+\infty$ if $p_k = 0$ and $a_k > 0$, and $-\infty$ if $p_k = 0$ and $a_k < 0$). The point is that if you think of $p_k$ as the cost per unit of variable $x_k$ and $a_k$ as the return per unit, $a_k/p_k$ is the return per unit spent on $x_k$. The optimal solution is to spend as much as possible on the items that give you the best return per unit spent. Thus if $\sum_{i=1}^{k-1} p_i \le b < \sum_{i=1}^{k} p_i$, you take $x_i = 1$ for $i \le k-1$, $x_i = 0$ for $i > k$, and $x_k = \left(b - \sum_{i=1}^{k-1} p_i\right)/p_k$.
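The greedy rule above can be sketched in a few lines of Python. This is only an illustration under the answer's assumptions ($p_k \ge 0$ and $0 \le b \le \sum_i p_i$); the function name `solve_bounded_lp` and the example numbers are made up for this sketch and do not come from the question.

```python
from math import inf


def solve_bounded_lp(a, p, b):
    """Greedy sketch for: max sum(a*x)  s.t.  sum(p*x) = b,  0 <= x <= 1.

    Assumes p[k] >= 0 for all k and 0 <= b <= sum(p) (hypothetical helper).
    """
    def ratio(k):
        if p[k] > 0:
            return a[k] / p[k]
        # Zero-cost items: rank first if they help the objective, last otherwise
        return inf if a[k] > 0 else -inf

    # Indices sorted by decreasing return per unit spent
    order = sorted(range(len(a)), key=ratio, reverse=True)
    x = [0.0] * len(a)
    budget = b
    for k in order:
        if p[k] == 0:
            # Free items do not touch the budget
            x[k] = 1.0 if a[k] > 0 else 0.0
            continue
        # Spend as much of the remaining budget as the bound x[k] <= 1 allows
        x[k] = max(0.0, min(1.0, budget / p[k]))
        budget -= p[k] * x[k]
    return x


x = solve_bounded_lp([3, 1, 2], [0.5, 0.25, 0.25], 0.6)
```

For $a = (3, 1, 2)$, $p = (0.5, 0.25, 0.25)$ and $b = 0.6$, the ratios are $6, 4, 8$, so the sketch sets $x_3 = 1$ first, then $x_1 = 0.35/0.5 = 0.7$, and leaves $x_2 = 0$ (up to floating-point rounding).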

EDIT: The problem with your "lagrangian method" is that it doesn't take into account the bounds $0 \le x_i \le 1$. If you take those bounds into account, you essentially have the dual linear programming problem.

The dual linear programming problem here is
\begin{align}
\min_{y,\,\xi_1,\dots,\xi_K} \; & b y + \sum_{i=1}^K \xi_i \\
\text{subject to:} \; & p_i y + \xi_i \ge a_i \; \forall \; i \\
& \xi_i \ge 0 \; \forall \; i
\end{align}
where $y$ is unrestricted in sign because it is the multiplier of the equality constraint. The optimal solution should have $\xi_i = 0$ for $i \ge k$ and $p_i y + \xi_i = a_i$ for $i \le k$, thus $y = a_k/p_k$. If you show that this is a feasible solution of the dual problem and that it satisfies complementary slackness with my solution of the original problem, you can conclude that both solutions are optimal.
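As a sketch of that verification, assume for simplicity that every $p_i > 0$ (items with $p_i = 0$ need the separate treatment described above) and keep the ordering $a_1/p_1 \ge \dots \ge a_K/p_K$. Take $y = a_k/p_k$ and $\xi_i = \max(0,\, a_i - p_i y)$. Then

\begin{align}
i < k: \quad & \xi_i = p_i\left(\frac{a_i}{p_i} - \frac{a_k}{p_k}\right) \ge 0, \quad p_i y + \xi_i = a_i, \quad x_i = 1 \\
i = k: \quad & \xi_k = 0, \quad p_k y = a_k, \quad 0 \le x_k < 1 \\
i > k: \quad & \xi_i = 0, \quad p_i y = p_i \frac{a_k}{p_k} \ge a_i, \quad x_i = 0
\end{align}

so the dual constraints hold, $\xi_i > 0$ only where $x_i = 1$, and $x_i > 0$ only where $p_i y + \xi_i = a_i$, which is exactly complementary slackness. The two objective values also agree:

$$ b y + \sum_{i<k} \left(a_i - p_i y\right) = \sum_{i<k} a_i + y\left(b - \sum_{i<k} p_i\right) = \sum_{i<k} a_i + a_k x_k $$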

Related Solutions

Minimizing the Sum of Quadratic Form with Equality Constraint

The Projection onto the Constraint

Let's define a vector $ z = \left[ {x}_{1}^{T}, {x}_{2}^{T}, \ldots, {x}_{K}^{T} \right]^{T} $, where each $ {x}_{i} \in \mathbb{R}^{n} $.
The constraint is equivalent to:

$$ S z = \boldsymbol{1}_{n}, \; S = \underbrace{\left[ {I}_{n}, {I}_{n}, \ldots, {I}_{n} \right]}_{K \text{ times}} $$

Basically, the matrix $ S $ sums the $ i $-th elements of all the $ x $ vectors.

Now, given a vector $ y \in \mathbb{R}^{nK} $, its projection onto the set $ \mathcal{S} = \left\{ x \mid S x = \boldsymbol{1}_{n} \right\} $ is given by:

$$ \arg \min_{x \in \mathcal{S}} \frac{1}{2} \left\| x - y \right\|_{2}^{2} = y - {S}^{T} \left( S {S}^{T} \right)^{-1} \left( S y - \boldsymbol{1}_{n} \right) $$

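Due to the special structure of $ S $ (namely $ S {S}^{T} = K {I}_{n} $), one can see that, writing $ y = \left[ {y}_{1}^{T}, \ldots, {y}_{K}^{T} \right]^{T} $, the $ i $-th block of the projection is:

$$ {y}_{i} - \frac{\sum_{j = 1}^{K} {y}_{j} - \boldsymbol{1}_{n}}{K}, \: i = 1, 2, \ldots, K $$

Namely, spread the deviation equally over all $ K $ vectors.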

Here is the code:

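%% Solution by Projected Gradient Descent
% tA, mC, paramLambda, stepSize, numIterations and the projection handle
% hProjEquality are assumed to be defined earlier (see the full script).

mX = zeros([numRows, numCols]);

for ii = 1:numIterations
    % Gradient step per column on x' * A * x + lambda * c' * x
    % (the factor 2 * A * x assumes each A is symmetric)
    for jj = 1:numCols
        mX(:, jj) = mX(:, jj) - (stepSize * ((2 * tA(:, :, jj) * mX(:, jj)) + (paramLambda * mC(:, jj))));
    end
    % Project the iterate back onto the equality constraint
    mX = hProjEquality(mX);
end

% Evaluate the objective at the final iterate
objVal = 0;
for ii = 1:numCols
    objVal = objVal + (mX(:, ii).' * tA(:, :, ii) * mX(:, ii)) + (paramLambda * mC(:, ii).' * mX(:, ii));
end

disp(' ');
disp('Projected Gradient Solution Summary');
disp(['The Optimal Value Is Given By - ', num2str(objVal)]);
disp(['The Optimal Argument Is Given By - [ ', num2str(mX(:).'), ' ]']);
disp(' ');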

The full code is in my Mathematics Q2199546 GitHub Repository (Specifically in Q2199546.m).
The code is validated using CVX.
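For readers without MATLAB, the projection formula above can be sketched in plain Python. This is not the repository's `hProjEquality`; the name `project_equality` and the list-of-lists layout (one inner list per vector) are assumptions made for this illustration.

```python
def project_equality(xs):
    """Euclidean projection of K vectors (each of length n) onto
    {x_1, ..., x_K : x_1 + ... + x_K = all-ones vector}.

    Hypothetical helper: xs is a list of K lists of floats.
    """
    K = len(xs)
    n = len(xs[0])
    # Deviation of the element-wise sum from the all-ones vector
    dev = [sum(x[i] for x in xs) - 1.0 for i in range(n)]
    # Spread the deviation equally over the K vectors
    return [[x[i] - dev[i] / K for i in range(n)] for x in xs]


proj = project_equality([[1.0, 2.0], [3.0, 0.0], [0.5, 0.5]])
```

After the call, the vectors in `proj` sum element-wise to the all-ones vector, and projecting `proj` again leaves it unchanged (up to rounding), as expected of a projection.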

The advantage of adding $\log$ Barrier to solve a Linear program

No, the solutions are not the same. On p. 499, the book is merely presenting an example of a self-concordant function.
When you ask about algorithm complexity for problem (1), which algorithm do you have in mind? At this point in the book's development, it is not clear how to solve problem (1) efficiently. The book will present a strategy that utilizes log-barrier functions in chapter 11.
No, the problems are not equivalent.

You should read Chapter 11 (Interior-point methods), and specifically Section 11.2, which presents a strategy to minimize $f_0(x)$ subject to the constraints $f_i(x) \leq 0$, $Ax = b$ by solving a sequence of subproblems of the form
\begin{align}
\text{minimize} & \quad f_0(x) + \sum_{i=1}^m - (1/t) \log(-f_i(x)) \\
\text{subject to} & \quad Ax = b
\end{align}
with increasing values of $t$. Newton's method is used to solve each subproblem. As $t$ increases, the solution of the subproblem becomes a better approximation to the solution of the original problem. That is how log-barrier functions are used to solve problems with inequality constraints.
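To make the strategy concrete, here is a minimal sketch on a one-variable linear program, minimize $c x$ subject to $0 \le x \le 1$, with no equality constraint. The function name `barrier_box_lp`, the starting point, and all parameter values are assumptions chosen for illustration. Each subproblem is multiplied by $t$ (which does not change its minimizer) and solved by Newton's method with a backtracking line search.

```python
import math


def barrier_box_lp(c, t_max=1e6, mu=10.0, tol=1e-10, max_newton=100):
    """Log-barrier sketch for: minimize c*x subject to 0 <= x <= 1."""
    def phi(x, t):
        # t * f_0(x) plus the log barrier for -x <= 0 and x - 1 <= 0;
        # this is the subproblem objective multiplied by t (same minimizer)
        return t * c * x - math.log(x) - math.log(1.0 - x)

    x, t = 0.5, 1.0  # strictly feasible starting point
    while t <= t_max:
        for _ in range(max_newton):
            grad = t * c - 1.0 / x + 1.0 / (1.0 - x)
            hess = 1.0 / x**2 + 1.0 / (1.0 - x)**2
            # Stop when half the squared Newton decrement is below tol
            if grad * grad / hess < 2.0 * tol:
                break
            step = -grad / hess
            # Backtrack until the new point is strictly feasible
            # and phi has decreased sufficiently
            while (not (0.0 < x + step < 1.0)
                   or phi(x + step, t) > phi(x, t) + 0.25 * step * grad):
                step *= 0.5
            x += step
        t *= mu  # tighten the approximation and warm-start from x
    return x


x_min = barrier_box_lp(1.0)   # true minimizer is x = 0
x_max = barrier_box_lp(-1.0)  # true minimizer is x = 1
```

With $m = 2$ inequality constraints, the minimizer of each subproblem is at most $m/t$ suboptimal, so the returned points lie within roughly $2 \times 10^{-6}$ of the true minimizers in objective value for the default `t_max`.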
