Derivative of scalar function with respect to matrix with vectors involved

matrix-calculus

I want to find the derivative of a scalar function with respect to a matrix $A$:

$
E=\|f(Ax)\|^2
$

Here $f(Ax)$ is a vector, say the color of the pixel at position $Ax$. How can I do that, given that I can compute the derivative of $f$ with respect to its argument, $\partial f/\partial x$?

I know that I can rewrite the function $E$ as

$
E=f(Ax)^Tf(Ax)
$

or

$
E=\operatorname{tr}(f(Ax)f(Ax)^T)
$

Is there a way to find the derivative using matrix-vector operations, i.e. without computing derivatives with respect to individual matrix elements? Is there a general analog of the chain rule, say for $g(f(Ax))$, where $g$ is scalar-valued and $f$ is vector-valued?

Best Answer

For convenience, define the variables
$$\eqalign{ y &= Ax, \quad f = f(y), \quad J = \frac{\partial f}{\partial y} \cr }$$
Then write the function in terms of the Frobenius inner product, denoted by a colon ($A:B = \operatorname{tr}(A^TB)$), and take its differential, treating $x$ as constant so that $dy = dA\,x$:
$$\eqalign{ E &= f:f \cr dE &= 2f:df \cr &= 2f:J\,dy \cr &= 2f:J\,dA\,x \cr &= 2J^Tfx^T:dA \cr }$$
The last step moves $J$ and $x$ to the other side of the product using the cyclic property of the trace. Since $dE = \frac{\partial E}{\partial A}:dA$, the gradient must be
$$\eqalign{ \frac{\partial E}{\partial A} &= 2\,J^Tfx^T \cr }$$
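A quick way to gain confidence in the result $2\,J^Tfx^T$ is to compare it against central finite differences. The sketch below is only an illustration under stated assumptions: the choice $f(y) = \tanh(Wy)$, the matrix `W`, the dimensions, and all function names are hypothetical and do not come from the question.

```python
import numpy as np

def f(y, W):
    # Hypothetical smooth vector-valued function, chosen only for illustration.
    return np.tanh(W @ y)

def jacobian_f(y, W):
    # J = df/dy for f(y) = tanh(W y), i.e. diag(1 - tanh^2(W y)) @ W.
    return (1.0 - np.tanh(W @ y) ** 2)[:, None] * W

def energy(A, x, W):
    # E = ||f(Ax)||^2
    fy = f(A @ x, W)
    return fy @ fy

def grad_energy(A, x, W):
    # Closed form from the derivation: dE/dA = 2 J^T f x^T.
    y = A @ x
    return 2.0 * np.outer(jacobian_f(y, W).T @ f(y, W), x)

def numeric_grad(A, x, W, eps=1e-6):
    # Central finite differences, one matrix entry at a time.
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            dA = np.zeros_like(A)
            dA[i, j] = eps
            G[i, j] = (energy(A + dA, x, W) - energy(A - dA, x, W)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # the matrix being differentiated against
x = rng.standard_normal(4)        # fixed vector
W = rng.standard_normal((2, 3))   # parameter of the hypothetical f

# Largest entrywise gap between the closed form and the numeric estimate.
max_err = np.max(np.abs(grad_energy(A, x, W) - numeric_grad(A, x, W)))
print(max_err)
```

If the formula is right, the printed gap should be tiny compared with the gradient entries. The same steps should carry over to a general scalar $g(f(Ax))$: replacing $2f$ with $\partial g/\partial f$ in the differential gives $\frac{\partial g}{\partial A} = J^T\frac{\partial g}{\partial f}x^T$.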
