[Math] Product rule for vector-valued functions

derivatives, linear algebra, matrices, matrix-calculus, multivariable-calculus

I'm trying to wrap my head around how to apply the product rule to vector-valued or matrix-valued functions.

Specifically, I'm trying to work through how to apply the product rule to $$x^TAx = f(x)g(x)$$
where $f(x) = x^T$, $g(x)=Ax$, $x\in\mathbb{R}^N$, and $A\in \mathbb{R}^{N\times N}$.

I know that $\nabla_x x^TAx = (A + A^T)x$ or $x^T(A + A^T)$, depending on the layout. I'm just using this as an example to see whether I can get the same result with the product rule.
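
A quick numerical sanity check of this identity, as a minimal sketch in plain Python. The matrix `A`, the vector `x`, and the helper names `quad_form`, `numeric_gradient`, and `closed_form_gradient` are made up for illustration, and the gradient is approximated by central differences rather than computed symbolically.

```python
# Finite-difference check that the gradient of x^T A x has the entries of (A + A^T) x.

def quad_form(A, x):
    """Return the scalar x^T A x for a square matrix A given as a list of rows."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def numeric_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient of a scalar function."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_plus[k] += h
        x_minus = list(x)
        x_minus[k] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad

def closed_form_gradient(A, x):
    """Return the entries of (A + A^T) x as a list."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) for i in range(n)]

# Arbitrary example values; A is deliberately non-symmetric.
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [0.5, 0.0, 2.0]]
x = [1.0, -2.0, 0.5]

numeric = numeric_gradient(lambda v: quad_form(A, v), x)
exact = closed_form_gradient(A, x)

# Largest entrywise disagreement between the two gradients.
max_error = max(abs(a - b) for a, b in zip(numeric, exact))
print(max_error < 1e-6)
```

Because a vector and its transpose hold the same numbers, this check says nothing about the row-versus-column layout question; it only confirms the entries.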

This question explains it for scalar-valued functions as $$f(x)\nabla_x g(x)+g(x)\nabla_x f(x).$$

However, the dimensions don't work out when I plug the values into the formula above. As Travis wrote in the comment below, we should have:

$$ \nabla_x(x^TAx) = (\nabla_x x^T)Ax + x^T\nabla_x(Ax) $$

However, that still leaves at least an $x$ in the first term and an $x^T$ in the second. I don't see how those can conform, or how they combine into $(A + A^T)x$ or $x^T(A + A^T)$.

This question is essentially asking the same thing, but the answer doesn't really involve the product rule above. I figure there must be some general formula to apply, as with scalar-valued functions.

Am I writing the product rule correctly in this case? Is there something I'm missing or doing incorrectly?

EDIT:

Building off of Algebraic Pavel's answer… I think the problem is that you have to formulate the functions $f(x)$ and $g(x)$ so they're in the same space.

That is, for $f,g:\mathbb{R}^N\rightarrow \mathbb{R}^M$, the product rule is:

$$\nabla_x (f(x)^Tg(x)) = f(x)^T\nabla_x g(x) + g(x)^T \nabla_x f(x)$$

So in the example above, if we let $f(x) = x$, $g(x)=Ax$, then the formula holds.

As another example, consider $$Axx^T$$ and let $f(x) = x^T A^T$ and $g(x) = x^T$. We have both $f,g:\mathbb{R}^{N\times 1} \rightarrow \mathbb{R}^{1\times N}$ and

$$\nabla_x (f(x)^Tg(x)) = \nabla_x (Axx^T) = Ax + xA^T$$

which holds. Notice that if we chose $f(x) = Ax$ instead of $f(x) = (Ax)^T$, the rule would fall apart.

I still don't know whether this holds in all instances, though. Any counterexamples?

Best Answer

It all depends on the conventions you use. Working out the derivative of the product component by component shows that, in this case, the product rule reads $$ \tag{1} \nabla_x[f(x)^Tg(x)]=f(x)^T\nabla_xg(x)+g(x)^T\nabla_x f(x). $$ So with $f(x):=x$ and $g(x):=Ax$, we have $$ \nabla_x(x^TAx)=x^TA+x^TA^T=x^T(A+A^T). $$
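
To spell out the substitution step, assuming the Jacobian convention $\nabla_x f=(\partial f_i/\partial x_j)_{ij}$ used at the end of this answer: $f(x)=x$ has Jacobian $I$ and $g(x)=Ax$ has Jacobian $A$, so (1) becomes $$ \nabla_x(x^TAx)=x^T\,\nabla_x(Ax)+(Ax)^T\,\nabla_x x=x^TA+(Ax)^TI=x^TA+x^TA^T. $$ Both terms are $1\times N$ rows, which is why the dimensions conform in this form of the rule.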


If $f,g:\mathbb{R}^n\to\mathbb{R}^m$, then $$ \frac{\partial}{\partial x_j}f^Tg= \frac{\partial}{\partial x_j}\sum_{i=1}^mf_ig_i= \sum_{i=1}^m\left(f_i\frac{\partial g_i}{\partial x_j}+g_i\frac{\partial f_i}{\partial x_j}\right). $$ So defining $$ \nabla_x f=\left(\frac{\partial f_i}{\partial x_j}\right)_{ij} $$ gives (1).
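
Formula (1) can also be checked numerically for functions that are not linear. The following is only a sketch in plain Python: the maps `f` and `g` (from $\mathbb{R}^2$ to $\mathbb{R}^3$), the point `x0`, and the helper names are arbitrary choices for illustration, and the Jacobians are approximated by central differences.

```python
import math

def f(x):
    """Hypothetical example map R^2 -> R^3."""
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]

def g(x):
    """Hypothetical example map R^2 -> R^3."""
    return [x[0] + x[1], x[0] ** 2, math.cos(x[1])]

def jacobian(func, x, h=1e-6):
    """Central-difference Jacobian with J[i][j] = d func_i / d x_j (m rows, n columns)."""
    m = len(func(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        x_plus = list(x)
        x_plus[j] += h
        x_minus = list(x)
        x_minus[j] -= h
        f_plus, f_minus = func(x_plus), func(x_minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J

def row_times_matrix(v, M):
    """Return the row vector v^T M as a list."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def inner(x):
    """Return the scalar f(x)^T g(x)."""
    return sum(a * b for a, b in zip(f(x), g(x)))

x0 = [0.7, -1.3]

# Left side of (1): derivative of the scalar f^T g, taken as a 1 x n Jacobian row.
lhs = jacobian(lambda v: [inner(v)], x0)[0]

# Right side of (1): f^T (Jacobian of g) + g^T (Jacobian of f).
rhs = [a + b for a, b in zip(row_times_matrix(f(x0), jacobian(g, x0)),
                             row_times_matrix(g(x0), jacobian(f, x0)))]

# Largest entrywise disagreement between the two sides.
gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(gap < 1e-6)
```

The Jacobians here are stored as $m\times n$ arrays with entries $\partial f_i/\partial x_j$, so $f^T\nabla_x g$ is a $1\times n$ row, matching the row layout $x^T(A+A^T)$ above.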