[Math] How to rigorously show that maximum of linear functions is piecewise linear

combinatoricsconvex-analysisoptimizationpiecewise-continuity

Note: This is an extremely basic question, which is why the fact I can't figure out the answer easily concerns me. This problem is from Boyd and Vandenberghe.

Consider $f: \mathbb{R} \to \mathbb{R}$, $$f : x \mapsto \max_{1 \le i \le n} (a_i x + b_i ) $$
Assume that $a_1 \le a_2 \le \dots \le a_n$, and that for each $k$ there exists an $x_k \in \mathbb{R}$ such that $$f (x_k) = a_k x_k + b_k \,, $$
(i.e. none of the constraints are redundant).

Question: How does one show that $f$ is piecewise linear by giving an explicit construction of $f$ including the breakpoints where $f$ equals each of the $a_k x_k + b_k$'s?

Progress so far: As far as I can tell, this is essentially equivalent to a very special case of the vertex enumeration problem — we are given the constraints defining a polyhedron in $\mathbb{R}^2$ (namely the epigraph of $f$) and we want to find the vertices (the breakpoints of $f$).

Here is one important observation:

Claim: $a_1 \le a_2 \le \dots \le a_n$ with strict inequality, i.e. $a_1 < a_2 < \dots < a_n$.

Proof of claim: Assume by means of contradiction that $a_i = a_j$ for some $i \not=j$. Then without loss of generality $b_i \ge b_j$, so that for all $x \in \mathbb{R}$, $$a_i x + b_i \ge a_i x + b_j = a_j x + b_j \,, $$ which means that the $j$'th constraint is redundant, contradiction. $\square$

In the solution manual (p.55 of the pdf), the following claim is made (without proof) that the breakpoints are (for $1 \le i \le n – 1$):
$$ \frac{b_{i+1} – b_i}{a_i – a_{i+1}} \,, $$
which, being the points of intersection of $a_i x + b_i$ and $a_{i+1}x + b_{i+1}$ corresponds to the segments of the piecewise linear function being ordered left-to-right in terms of increasing slope (which makes sense intuitively and visually since the resulting function, as the maximum of convex functions, has to be convex).

I was able to prove most of this in the case that $n=3$ (see community wiki answer below), although even then some of the combinatorial issues are sufficiently unclear to me that I am not sure how to rigorously generalize to arbitrary $n \ge 3$.

Update: I figured out the answer; please see below.

Best Answer

First note that $a_1 < a_2 < \dots < a_n$ (with strict inequality). Assume by means of contradiction that were not the case, i.e. for some $i \not=j$, $a_i = a_j$. Further, assume without loss of generality that $b_i < b_j$, then we would have the following for all $x \in \mathbb{R}$ as a result of the assumption $a_i = a_j$:

$$ a_i x + b_i < a_i x + b_j = a_j x + b_j \,, $$

such that the $i$th function would be redundant, contradicting the hypotheses.

For notational purposes, define for all $i \in [n]$:

$$ f_i := a_i x + b_i \quad \implies f:= \max_{i \in [n]} f_i \,. $$

The hypotheses guarantee us that for each $i \in [n]$, there exists a $x_i \in \mathbb{R}$ such that $f(x_i) = f_i (x_i)$, or in other words the maximum of the $i$ linear functions at $x_i$ is $f_i$. We can write this as the following equations (note the three cases: $i = 1, 1 < i < n, i = n$):

$$\begin{array}{lcccl} \exists x_1 \in \mathbb{R} : & a_1 x_1 + b_1 > a_2 x_1 + b_2 \,, & a_1 x_1 + b_1 > a_3 x_1 + b_3 \,, & \dots & a_1 x_1 + b_1 > a_n x_1 + b_n \\ \vdots & \vdots & \vdots & & \vdots \\ \exists x_i \in \mathbb{R}: & a_i x_i + b_i > a_1 x_i + b_1 \,, & a_i x_i + b_i > a_2 x_i + b_2 \,, & \dots & a_i x_i + b_i > a_n x_i + b_n \\ \vdots & \vdots & \vdots & & \vdots \\ \exists x_n \in \mathbb{R}: & a_n x_n + b_n > a_1 x_n + b_1 \,, & a_n x_n + b_n > a_2 x_n + b_2 \,, & \dots & a_n x_n + b_n > a_{n-1} x_n + b_{n-1} \,. \end{array}$$

Note that the $i$th row of the above inequalities hold for any point such that $f(x) = f_i$, possibly with strict inequality replaced by non-strict inequality at the intersection of two distinct lines (a cutpoint). The inequalities can be rewritten as follows:

$$ \begin{array}{rccccl} (\!a_1\! -\! a_2\!) x_1 > b_2\! -\! b_1 , & \dots & \dots & \dots & \dots & (\!a_1\! -\! a_n\!) x_1 > b_n\! -\! b_1 \\ \vdots & & & & & \vdots \\ (\!a_i\! -\! a_1\!) x_i > b_1 \!-\! b_i , & \dots & (\!a_i\! -\! a_{i-1}\!) x_i > b_{i-1} \!-\! b_i, & (\!a_i\! -\! a_{i+1}\!) x_i > b_{i+1}\! -\! b_i , & \dots & (\!a_i\! -\! a_n\!) x_i > b_n\! -\! b_i \\ \vdots & & & & & \vdots \\ (\!a_n\! -\! a_1\!) x_n > b_1\! -\! b_n , & \dots & \dots & \dots & \dots & (\!a_n\! -\! a_{n-1}\! ) x_n > b_{n-1} \!- \!b_n . \end{array}$$

Using the observation that $a_1 < \dots < a_n$, it follows that for $i < j$, $a_i - a_j < 0$ and $a_j - a_i > 0$. Therefore we can rewrite the above inequalities yet further as follows:

$$\begin{array}{rccccl} x_1 < \frac{ b_2\! -\! b_1}{a_1\! -\! a_2 } , & \dots & \dots & \dots & \dots & x_1 < \frac{b_n\! -\! b_1 }{a_1\! -\! a_n} \\ \vdots & & & & & \vdots \\ x_i > \frac{b_1 \!-\! b_i}{a_i\! -\! a_1} , & \dots & x_i > \frac{b_{i-1} \!-\! b_i}{a_i\! -\! a_{i-1}}, & x_i < \frac{b_{i+1}\! -\! b_i}{a_i\! -\! a_{i+1}} , & \dots & x_i < \frac{b_n\! -\! b_i}{a_i\! -\! a_n} \\ \vdots & & & & & \vdots \\ x_n >\frac{ b_1\! -\! b_n}{a_n\! -\! a_1} , & \dots & \dots & \dots & \dots & x_n > \frac{b_{n-1} \!- \!b_n}{a_n\! -\! a_{n-1}} . \end{array}$$

We rewrite the above slightly so that the denominators are always of the form $a_j - a_i $ for $i< j$ (thus positive), and the numerators are of the form $ b_i - b_j $ for $i < j$, so as to better notice a crucial pattern:

$$\begin{array}{rccccl} x_1 < \frac{ b_1\! -\! b_2}{a_2\! -\! a_1 } , & \dots & \dots & \dots & \dots & x_1 < \frac{b_1\! -\! b_n }{a_n\! -\! a_1} \\ \vdots & & & & & \vdots \\ x_i > \frac{b_i \!-\! b_1}{a_1\! -\! a_i} , & \dots & x_i > \frac{b_i \!-\! b_{i-1}}{a_{i-1}\! -\! a_i}, & x_i < \frac{b_{i+1}\! -\! b_i}{a_i\! -\! a_{i+1}} , & \dots & x_i < \frac{b_n\! -\! b_i}{a_i\! -\! a_n} \\ \vdots & & & & & \vdots \\ x_n >\frac{ b_1\! -\! b_n}{a_n\! -\! a_1} , & \dots & \dots & \dots & \dots & x_n > \frac{b_{n-1} \!- \!b_n}{a_n\! -\! a_{n-1}} . \end{array}$$

Specifically, given $i < j$, the above inequalities show us that:

$$ f_i \ge f_j \iff x \le \frac{b_i - b_j}{a_j - a_i} \,, \quad \quad \quad f_j \ge f_i \iff x \ge \frac{b_i - b_j}{a_j - a_i } \,. $$

Therefore, for each $i,j \in [n]$, $i < j$, we define the constant:

$$ c_{ij} := \frac{b_i - b_j}{ a_j - a_i } \quad \text{with the property that }\quad f_i \ge f_j \iff x \le c_{ij} , \, f_j \ge f_i \iff x \ge c_{ij} \,. $$

For each $1<i <n$, we have in particular that, since there exists an $x_i$ with $\max f_k (x_i) = f_i(x_i)$:

$$ x_i > c_{i-1, i} \,, \quad x_i < c_{i, i+1} \quad \implies \quad c_{i-1,i} < c_{i, i+1} \,. $$

Chaining all of these results together, we get by transitivity that:

$$ c_{12} < c_{23} < \dots < c_{i-1,i} < c_{i, i+1} < \dots < c_{n-1, n} \,. $$

We claim subsequently that for $1 < i <n$, $\max_k f_k(x) = f_i (x)$ if and only if $ c_{i-1,i} \le x \le c_{i,i+1} $, that $\max_k f_k (x) = f_1 (x)$ if and only if $x \le c_{12}$, and $\max_k f_k(x) = f_n(x)$ if and only if $x \ge c_{n-1,n}$.

However, as was mentioned already above, the statement that $\max_k f_k (x) = f_i(x)$ corresponds to $n-1$ inequality constraints. But we claim above that only $2$ inequality constraints are necessary for $1 < i < n$, and only one inequality constraint for $i=1$ or $i=n$. Therefore, in order for the claim to be true, we must show that all of the $c_{i, i+k}$ for $k \ge 2$ are redundant for determining the value of $f$, i.e. the maximum of the $f_i$.

Note: the following arguments are much easier to follow if one draws a picture. (That's how I made them, by first drawing a picture, and then figuring out the formal logic based on the picture.)

We begin by showing that for any $1 \le i \le n-2$,

$$ c_{i, i+1} \le c_{i,i+2} \le c_{i+1, i+2} \,. \quad \quad \tag{*} $$

From this it follows that each of the $c_{i, i+2}$ are redundant for determining the value of $f = \max_k f_k$, since $\max_k f_k (x) = f_{i+1}$ if and only if $c_{i,i+1} \le x \le c_{i+1, i+2}$. By (*) this holds for $c_{i, i+2}$, which means that the relative ``position'' of $f_i$ and $f_{i+2}$ changes (from $f_i \ge f_{i+2}$ to $f_{i+2} \ge f_i$) inside a segment of the real line where the largest function is $f_{i+1}$. Thus this change in relative position makes no difference in determining the value of $f$, and we can ignore the inequality constraints defined w.r.t. the $c_{i,i+2}$ when finding the value of $f$.

Assume by means of contradiction that $c_{i, i+2} < c_{i, i+1} < c_{i+1, i+2}$. Then it follows that

$$ f_i(c_{i,i+2}) > f_{i+1}(c_{i,i+2}) \,, \quad f_{i+1}(c_{i,i+2}) > f_{i+2}(c_{i,i+2}) \quad \quad \implies \quad \quad f_i(c_{i,i+2}) > f_{i+2}( c_{i,i+2}) \,. $$

However, this is an immediate contradiction, since by definition $f_i(c_{i,i+2}) = f_{i+2}(c_{i,i+2})$.

Now assume by means of contradiction that $c_{i, i+1} < c_{i+1, i+2} < c_{i, i+2}$. Then it follows that

$$ f_{i+1}(c_{i,i+2}) > f_i(c_{i,i+2}) \,, \quad f_{i+2}(c_{i,i+2}) > f_{i+1}(c_{i,i+2}) \quad \quad \implies \quad \quad f_{i+2}( c_{i,i+2}) > f_i(c_{i,i+2}) \,, $$

which is again a contradiction, since by definition $f_i(c_{i,i+2}) = f_{i+2}(c_{i,i+2})$.

Since neither $c_{i, i+2} < c_{i, i+1} < c_{i+1, i+2}$ nor $c_{i, i+1} < c_{i+1, i+2} < c_{i, i+2}$ is true, we must conclude the desired result, namely that $c_{i,i+1} \le c_{i, i+2} \le c_{i+1, i+2}$.

The above can be considered a base case for $k=2$. The general claim will follow when we show that, for $k\ge 2$, the following holds for all of the $c_{i, i+k}$:

$$ c_{i,i+1} \le c_{i, i+k} \le c_{i+k-1, i+k} \,. \tag{**} $$

The equation (**) allows us to conclude that the change in relative position between $f_i$ and $f_{i+k}$ (from $f_i \ge f_{i+k}$ to $f_{i+k} \ge f_i$) occurs inside a portion of the real line where $f_{i+1} \ge f_i$ and $f_{i+k-1} \ge f_{i+k}$; i.e. where neither $f_i$ nor $f_{i+k}$ is the largest function. This again has the effect that their change in relative position has no impact on the value of $f$, and therefore the inequality constraints with respect to the $c_{i,i+k}$ for $k \ge 2$ can be ignored. Thus only the $c_{i,i+1}$ matter for determining the value of $f$, and it follows that the claim above is true, namely that for $1 < i <n$, $\max_k f_k(x) = f_i (x)$ if and only if $ c_{i-1,i} \le x \le c_{i,i+1} $, that $\max_k f_k (x) = f_1 (x)$ if and only if $x \le c_{12}$, and $\max_k f_k(x) = f_n(x)$ if and only if $x \ge c_{n-1,n}$.

As was hinted above, we prove the case for general $k \ge 2$ by induction, using $k=2$ as the base case. So assume by means of induction (i.e. the following is our induction hypothesis) that we have already shown that, for a fixed $k$, the following is true for all $1 \le i \le n-k$:

$$ c_{i,i+1} \le c_{i, i+k} \le c_{i+k-1, i+k} \,. $$

Then we seek to show that the following is true for all $1 \le i \le n-k -1$:

$$ c_{i,i+1} \le c_{i,i+k+1} \le c_{i+k, i+k+1} \,. $$

Let $1 \le i \le n-k-1$ be given. Assume by means of contradiction that $c_{i, i+k+1} < c_{i,i+1} < c_{i+k, i+k+1}$. Hence:

$$ f_{i+k}(c_{i, i+k+1}) > f_{i+k+1}(c_{i, i+k+1}) \,. $$

By the induction hypothesis, we have that $c_{i, i+1} \le c_{i, i+k}$ -- thus $c_{i, i+k+1} < c_{i, i+k}$. Then we have that

$$ f_i(c_{i, i+k+1}) > f_{i+k}(c_{i, i+k+1}) \,, \quad \quad \implies \quad \quad f_i(c_{i, i+k+1}) > f_{i+k+1}(c_{i, i+k+1}) \,, $$

which is a contradiction, since by definition $f_i(c_{i, i+k+1}) = f_{i+k+1}(c_{i, i+k+1})$.

Now assume by means of contradiction that $c_{i+1} < c_{i+k, i+k+1} < c_{i, i+k+1}$. Then we have that

$$ f_{i+k+1} (c_{i, i+k+1}) > f_{i+k}(c_{i, i+k+1}) \,. $$

By the induction hypothesis, we have $c_{i, i+k} \le c_{i+k-1, i+k} < c_{i+k, i+k+1}$; therefore $c_{i, i+k} < c_{i, i+k+1}$. Thus:

$$ f_{i+k}(c_{i, i+k+1}) > f_i(c_{i, i+k+1}) \quad \quad \implies \quad \quad f_{i+k+1}(c_{i, i+k+1}) > f_i(c_{i, i+k+1}) \,, $$

which is again a contradiction, since by definition $f_i(c_{i, i+k+1}) = f_{i+k+1}$.

Thus, we have proven what we wanted to show, namely:

$$ c_{i,i+1} \le c_{i, i+k+1} \le c_{i+k, i+k+1} \,. $$

This conclusion allows us to write $f$ explicitly as a piecewise linear function:

$$ f: x \mapsto \begin{cases} a_1 x + b_1 & x \le \frac{b_1 - b_2}{a_2 - a_1} \\ a_i x + b_i & \frac{b_{i-1} - b_i}{a_i - a_{i-1}} \le x \le \frac{b_i - b_{i+1}}{a_{i+1} - a_i} \quad \forall 1 \le i \le n-1 \\ a_nx + b_n & x \ge \frac{b_{n-1} - b_n}{a_n - a_{n-1}} \end{cases} \,.$$

Best Answer

Related Solutions

[Math] Show that multivariate function is convex

[Math] Why is the Fenchel conjugate of a piecewise linear function piecewise linear

Related Question