[Math] Derivative of a matrix: Outer product chain rule

calculus, matrices, normed-spaces

I ran into a seemingly simple matrix calculus question that I can't seem to find the solution to.

Suppose I have the following matrices: $X_{(t \times n)}, V_{(n \times m)}$, and $\Phi_{(t\times m)} = f(XV)$ for some differentiable function $f$, which is applied element-wise to the argument $XV$.

I would like to calculate $\frac{\partial}{\partial V} \|1^T\Phi\|_2^2$, which I expanded to the outer product (hopefully correctly) as $\frac{\partial}{\partial V} 1^T \Phi\Phi^T 1 = \frac{\partial}{\partial V} 1^T f(XV) f(XV)^T 1$.

The Matrix Cookbook states that $\frac{d}{dx} \|x\|_2^2 = \frac{d}{dx} \|x^Tx\|_2 = 2x$. However, I'm not 100% certain I can use this in my case.

So far I have that $\frac{\partial}{\partial V} 1^T f(XV) f(XV)^T 1 = 2X^T[f(XV) \circ f^\prime(XV)]$, but my gradient checker (gradest in Matlab) says this is incorrect. I've been stuck on this all day. Can anyone help?

I'm trying to figure out a vectorized solution (not involving for loop summations) since this piece of code will be called iteratively for optimization.

Edit: I've confirmed that $\frac{d}{d\Phi} \|1^T \Phi \|_2^2 = 2 \cdot 1 1^T \Phi$.
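The identity in the edit can be sanity-checked numerically. The following is a minimal sketch in NumPy (not the Matlab gradest setup mentioned above); the helper names `loss_phi`, `grad_phi`, and `numeric_grad`, the matrix sizes, and the random seed are all illustrative assumptions.

```python
import numpy as np

def loss_phi(Phi):
    """L(Phi) = ||1^T Phi||_2^2, the squared norm of the column sums."""
    col_sums = Phi.sum(axis=0)
    return float(col_sums @ col_sums)

def grad_phi(Phi):
    """Closed form 2 * 1 1^T Phi: every row equals twice the column sums."""
    t = Phi.shape[0]
    return 2.0 * np.ones((t, t)) @ Phi

def numeric_grad(fn, A, eps=1e-6):
    """Central finite differences, one entry at a time (for checking only)."""
    G = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        E = np.zeros_like(A)
        E[idx] = eps
        G[idx] = (fn(A + E) - fn(A - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)          # arbitrary seed
Phi = rng.standard_normal((5, 3))       # arbitrary sizes t = 5, m = 3
max_err_phi = np.max(np.abs(grad_phi(Phi) - numeric_grad(loss_phi, Phi)))
print("max abs difference:", max_err_phi)
```

The loop in `numeric_grad` is only there for verification; the closed form itself needs no loops.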

Best Answer

Let $Y=1^T\Phi$. The problem is then to find the derivative of the function $\,L=\|Y\|_F^2$.

Better yet, using the Frobenius product $A:B=\operatorname{tr}(A^TB)$, the function can be written as $\,L=Y:Y$.

Start by taking the differential, writing $\Phi'=f'(XV)$ for the element-wise derivative so that $d\Phi=\Phi'\circ d(XV)$:

$$\eqalign{ dL &= 2\,Y:dY \cr &= 2\,1^T\Phi:1^Td\Phi \cr &= 2\,11^T\Phi:d\Phi \cr &= 2\,11^T\Phi:\Phi'\circ d(XV) \cr &= 2\,(11^T\Phi)\circ\Phi':d(XV) \cr &= 2\,(11^T\Phi)\circ\Phi':X\,dV \cr &= 2\,X^T[(11^T\Phi)\circ\Phi']:dV \cr }$$

Since $dL = \big(\frac {\partial L} {\partial V}\big):dV\,\,$ the derivative must be

$$\eqalign{ \frac {\partial L} {\partial V} &= 2\,X^T[(11^T\Phi)\circ\Phi'] \cr }$$

This is the same result as @legomygrego, but with the step-by-step details. The only property which might be new to some readers is the mutual commutativity of the Frobenius and Hadamard products:

$$\eqalign{ A:B &= B:A \cr A\circ B &= B\circ A \cr A\circ B:C &= A:B\circ C \cr }$$
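As a rough numerical check of the final formula, here is a sketch in NumPy that compares it against central finite differences. The choice $f=\tanh$, the matrix sizes, the seed, and all function names are assumptions made for illustration; the question does not specify $f$. The sketch also evaluates the expression $2X^T[f(XV)\circ f'(XV)]$ from the question, to show how far it is from the finite-difference gradient.

```python
import numpy as np

def f(Z):
    return np.tanh(Z)                   # assumed element-wise function

def f_prime(Z):
    return 1.0 - np.tanh(Z) ** 2        # derivative of tanh

def loss(V, X):
    """L(V) = ||1^T f(XV)||_2^2."""
    col_sums = f(X @ V).sum(axis=0)
    return float(col_sums @ col_sums)

def grad_V(V, X):
    """2 X^T [(1 1^T Phi) o Phi'] without forming the t x t matrix 1 1^T."""
    Z = X @ V
    col_sums = f(Z).sum(axis=0, keepdims=True)   # 1^T Phi, shape (1, m)
    # Broadcasting col_sums over the rows reproduces (1 1^T Phi).
    return 2.0 * X.T @ (col_sums * f_prime(Z))

def grad_V_from_question(V, X):
    """The candidate from the question: 2 X^T [f(XV) o f'(XV)]."""
    Z = X @ V
    return 2.0 * X.T @ (f(Z) * f_prime(Z))

def numeric_grad(fn, A, eps=1e-6):
    """Central finite differences, one entry at a time (for checking only)."""
    G = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        E = np.zeros_like(A)
        E[idx] = eps
        G[idx] = (fn(A + E) - fn(A - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)          # arbitrary seed
t, n, m = 6, 4, 3                       # arbitrary sizes
X = rng.standard_normal((t, n))
V = rng.standard_normal((n, m))

G_num = numeric_grad(lambda W: loss(W, X), V)
err_answer = np.max(np.abs(grad_V(V, X) - G_num))
err_question = np.max(np.abs(grad_V_from_question(V, X) - G_num))
print("answer formula, max abs difference:  ", err_answer)
print("question formula, max abs difference:", err_question)
```

Since $11^T\Phi$ just repeats the row vector of column sums $1^T\Phi$ in every row, `grad_V` uses broadcasting rather than building the $t\times t$ matrix. This keeps the computation fully vectorized, which matters if it is called inside an optimization loop.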