Why is this function convex

real-analysis

In the paper Dissimilarity in Graph-Based Semi-Supervised Classification, there are a few things I could not understand.
Given that $x_1, x_2,…, x_n$ are the vector representation of $n$ items, $f : X \rightarrow \mathbb{R}$ is the discriminant function, and $w_{ij}$ are all non-negative,

why is $ \frac{1}{2} \sum_{i, j = 1}^n w_{ij} (f(x_i)- f(x_j))^2$ convex with respect to $f$?
Also, why does changing any $w_{ij}$ to a negative value make it non-convex?

I read on wikipedia that

"A twice-differentiable function of a single variable is convex if and
only if its second derivative is nonnegative on its entire domain"

But I don't understand: the derivative should be taken with respect to what?

Best Answer

Call the sum $S(f)$. Let $0<b<1$, and let $f$ and $g$ be two functions mapping $X$ to $\Bbb R$. Then
$$
S(bf+(1-b)g)=b^2S(f)+(1-b)^2S(g)+b(1-b)\sum_{i,j}w_{i,j}[f(x_i)-f(x_j)]\cdot[g(x_i)-g(x_j)].
$$
By Cauchy-Schwarz,
$$
\sum_{i,j}w_{i,j}[f(x_i)-f(x_j)]\cdot[g(x_i)-g(x_j)]\le 2\sqrt{S(f)S(g)}.
$$
(The non-negativity of $w_{i,j}$ is used here: writing $w_{i,j} = \sqrt{w_{i,j}}\cdot \sqrt{w_{i,j}}$, Cauchy-Schwarz applies to the vectors with entries $\sqrt{w_{i,j}}\,[f(x_i)-f(x_j)]$ and $\sqrt{w_{i,j}}\,[g(x_i)-g(x_j)]$.) Therefore
$$
\eqalign{
S(bf+(1-b)g) &\le b^2S(f)+(1-b)^2S(g)+2b(1-b)\sqrt{S(f)S(g)}\cr
&= \left[b\sqrt{S(f)}+(1-b)\sqrt{S(g)}\right]^2\cr
&\le bS(f)+(1-b)S(g),
}
$$
the final inequality because the square function is convex. This shows that $f\mapsto S(f)$ is convex. (Intuitively, $f\mapsto [f(x_i)-f(x_j)]^2$ is convex because the square is convex; and then $S$ is convex because it is a non-negative linear combination of such functions of $f$.)
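Since $f$ enters $S$ only through its values at the $n$ points, you can identify $f$ with the vector $(f(x_1),\dots,f(x_n))\in\mathbb{R}^n$ and check the convexity inequality $S(bf+(1-b)g)\le bS(f)+(1-b)S(g)$ numerically. A minimal sketch (the dimension $n=5$ and the weight value $-10$ are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

def S(w, f):
    """S(f) = (1/2) * sum_{i,j} w_ij * (f(x_i) - f(x_j))^2, with f a vector of values."""
    diff = f[:, None] - f[None, :]
    return 0.5 * np.sum(w * diff**2)

# Non-negative symmetric weights: the convexity inequality holds for random f, g, b.
w = rng.random((n, n))
w = (w + w.T) / 2
for _ in range(1000):
    f, g = rng.normal(size=n), rng.normal(size=n)
    b = rng.random()
    assert S(w, b*f + (1-b)*g) <= b*S(w, f) + (1-b)*S(w, g) + 1e-9

# Make one weight strongly negative: convexity fails.
w_neg = w.copy()
w_neg[0, 1] = w_neg[1, 0] = -10.0

# Take f the indicator of the first point and g = 0; then S(f) < 0,
# so the midpoint value b^2 * S(f) exceeds the chord value b * S(f).
e0 = np.zeros(n)
e0[0] = 1.0
b = 0.5
lhs = S(w_neg, b * e0)       # S(b*f + (1-b)*g) with g = 0
rhs = b * S(w_neg, e0)       # b*S(f) + (1-b)*S(g), since S(0) = 0
print(lhs > rhs)             # True: the convexity inequality is violated
```

The violation with a negative weight is exactly the quadratic picture: restricted to the line $t\mapsto t\,e_0$, the sum becomes $t^2 S(e_0)$ with $S(e_0)<0$, a concave parabola.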
