Linear Algebra – Derivative of Inner Product

derivatives, inner-products, linear algebra, vectors

Suppose the inner product of a vector $\mathbf{x}$ with itself can be expressed as

$$\langle \mathbf{x}, \mathbf{x}\rangle_G = \mathbf{x}^T G\mathbf{x}$$

where $G$ is a symmetric matrix. If I differentiate this inner product with respect to $\mathbf{x}$, I should get a vector, since this is the derivative of a scalar function with respect to a vector (https://en.wikipedia.org/wiki/Matrix_calculus#Scalar-by-vector).

However, the following formula (from http://www.cs.huji.ac.il/~csip/tirgul3_derivatives.pdf) gives a row vector rather than a column vector:

$$\frac{\mathrm{d}}{\mathrm{d} \mathbf{x}} (\mathbf{x}^TG\mathbf{x}) = 2\mathbf{x}^T G$$

Why do I get this contradiction?

Best Answer

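For a smooth map $f:\mathbb{R}^n\to\mathbb{R}^m$, the differential is a map $df:\mathbb{R}^n\to\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$: at each point $x$, $df(x)$ is a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$.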

Saying that $f$ is differentiable at $x$ means that there is a linear map $df(x)$ such that $$ f(x+h)=f(x)+df(x)\cdot h+o(\|h\|) $$ as $h\to 0$.

In your case, $f(x)=\langle x,x \rangle_G$ and $m=1$, so the differential at $x$, $df(x)$, lies in $\mathcal{L}(\mathbb{R}^n,\mathbb{R})$. It is a linear form.

Let's be more explicit: \begin{align*} f(x+h) &= \langle x+h,x+h \rangle_G \\ &= \underbrace{\langle x,x \rangle_G}_{f(x)} + \underbrace{2\langle x,h \rangle_G }_{df(x)\cdot h}+ \underbrace{\langle h,h \rangle_G}_{\in o(\|h\|)} \end{align*}
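
To spell out the middle step, here is a sketch that uses only bilinearity and the symmetry of $G$ assumed in the question:

\begin{align*}
\langle x+h,x+h \rangle_G &= (x+h)^T G\,(x+h) \\
&= x^T G x + x^T G h + h^T G x + h^T G h \\
&= x^T G x + 2\,x^T G h + h^T G h,
\end{align*}

since $h^T G x = (h^T G x)^T = x^T G^T h = x^T G h$ when $G^T = G$. The last term is in $o(\|h\|)$ because $|h^T G h| \le \|G\|\,\|h\|^2$, where $\|G\|$ denotes the operator norm of $G$.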

Hence your differential is given by $$ df(x)\cdot h = 2\langle x,h \rangle_G = (2x^TG)h $$ where $2x^TG=\left(\partial_{x_1} f,\dots,\partial_{x_n} f\right)$ is your "row" vector.
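
As a quick numerical sanity check of the expansion above, the following sketch measures the remainder $f(x+h)-f(x)-(2x^TG)h$ divided by $\|h\|$ for shrinking $h$. The matrix `G`, the point `x`, the unit vector `direction`, and the helper names `quad_form`, `differential`, and `remainder_ratio` are all made up for illustration; any symmetric `G` should behave the same way.

```python
def quad_form(G, x):
    """Return x^T G x, with G given as a list of rows."""
    n = len(x)
    return sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n))


def differential(G, x, h):
    """Return (2 x^T G) h, i.e. df(x) applied to h, assuming G is symmetric."""
    n = len(x)
    return 2.0 * sum(x[i] * G[i][j] * h[j] for i in range(n) for j in range(n))


# Hypothetical example data: any symmetric G and any x would do.
G = [[2.0, 1.0], [1.0, 3.0]]
x = [1.0, -2.0]
direction = [0.6, 0.8]  # unit vector, so ||t * direction|| = t


def remainder_ratio(t):
    """Return (f(x+h) - f(x) - df(x).h) / ||h|| for h = t * direction."""
    h = [t * d for d in direction]
    x_plus_h = [xi + hi for xi, hi in zip(x, h)]
    remainder = quad_form(G, x_plus_h) - quad_form(G, x) - differential(G, x, h)
    return remainder / t


for t in (1e-1, 1e-2, 1e-3):
    print(t, remainder_ratio(t))
```

For this data the remainder is exactly $h^TGh = 3.6\,t^2$, so the printed ratio should shrink roughly like $3.6\,t$, which is the behaviour that $o(\|h\|)$ requires.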

Note that, because $m=1$, you can also use a vector $\nabla f(x)$ to represent $df(x)$ using the canonical scalar product. This vector is by definition the gradient of $f$:

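$$ df(x)\cdot h = \langle \nabla f(x),h \rangle = \langle 2Gx,h \rangle $$ where $\nabla f(x)=2Gx=\left(\begin{array}{c}\partial_{x_1} f \\ \vdots \\ \partial_{x_n} f\end{array}\right)$. This is your "column" vector. Since $G$ is symmetric, $2Gx=(2x^TG)^T$: the row and the column hold the same partial derivatives, so there is no contradiction, only two conventions for writing the same linear form.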
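
To see that the entries of $2Gx$ really are the partial derivatives, here is a minimal sketch comparing the formula against central finite differences. The example `G` and `x`, the step `eps`, and the helper names `gradient_formula` and `gradient_numeric` are assumptions chosen for illustration, not anything from the linked notes.

```python
def quad_form(G, x):
    """Return x^T G x, with G given as a list of rows."""
    n = len(x)
    return sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n))


def gradient_formula(G, x):
    """Return the components of 2 G x (valid as the gradient for symmetric G)."""
    n = len(x)
    return [2.0 * sum(G[i][j] * x[j] for j in range(n)) for i in range(n)]


def gradient_numeric(G, x, eps=1e-6):
    """Approximate each partial derivative of x^T G x by a central difference."""
    grad = []
    for k in range(len(x)):
        up = list(x)
        down = list(x)
        up[k] += eps
        down[k] -= eps
        grad.append((quad_form(G, up) - quad_form(G, down)) / (2.0 * eps))
    return grad


# Hypothetical example data: any symmetric G and any x would do.
G = [[2.0, 1.0], [1.0, 3.0]]
x = [1.0, -2.0]

print(gradient_formula(G, x))
print(gradient_numeric(G, x))
```

Both lists should agree up to rounding. Whether you then write those numbers as a row ($2x^TG$) or as a column ($2Gx$) is purely a layout convention.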