Matrices – Applying the chain rule on vectors and matrices

chain rule, derivatives, matrices, real numbers, vectors

I need to find $\frac{dy}{dx}$ for the following:

$y = \|A^Tx - b\|_2^2$, where $A \in \mathbb{R}^{3\times 3}$, $b \in \mathbb{R}^{3\times 1}$, $x \in \mathbb{R}^{3\times 1}$, $y \in \mathbb{R}$, and $\|\cdot\|_2$ is the Euclidean norm, so that, for example, $\|z\|_2^2 = z^Tz$ for $z \in \mathbb{R}^{3\times 1}$. I'm familiar with the chain rule, but I've never used it in this way. Also, I'm not sure what $\mathbb{R}^{3\times 3}$ represents or how I can use it with the chain rule.

Best Answer

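Define the vector $$z=A^Tx-b$$ and write the function in terms of it. Then find its differential and gradient, using $dz = A^T\,dx$ (since $b$ is constant): $$\eqalign{ y &= z^Tz \cr dy &= 2z^Tdz = 2z^T(A^Tdx) = (2Az)^Tdx \cr \frac{\partial y}{\partial x} &= 2Az = 2A(A^Tx-b) \cr }$$ Since $y$ is a scalar and $x$ is a vector, the derivative $\frac{dy}{dx}$ you asked for is this gradient, a $3\times 1$ vector. The symbol ${\mathbb R}^{m\times n}$ denotes the set of real matrices with $m$ rows and $n$ columns, so $A \in {\mathbb R}^{3\times 3}$ simply means that $A$ is a $3\times 3$ matrix of real numbers.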
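If you want to convince yourself of the result, a quick numerical check can help. The sketch below is only an illustration and not part of the derivation: it compares the formula $2A(A^Tx-b)$ with a central finite-difference approximation of the gradient. The values of `A`, `x`, `b`, the step size `h`, and all helper names are arbitrary choices made for this example.

```python
# Sanity check (illustrative values only): compare the analytic gradient
# 2 A (A^T x - b) with a central finite-difference estimate.

def transpose(M):
    return [list(row) for row in zip(*M)]

def matvec(M, v):
    # Matrix-vector product M v for a list-of-rows matrix.
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

def residual(A, x, b):
    # z = A^T x - b
    return [zi - bi for zi, bi in zip(matvec(transpose(A), x), b)]

def y(A, x, b):
    # y = ||A^T x - b||_2^2 = z^T z
    return sum(zi * zi for zi in residual(A, x, b))

def grad_analytic(A, x, b):
    # dy/dx = 2 A z
    return [2.0 * g for g in matvec(A, residual(A, x, b))]

def grad_numeric(A, x, b, h=1e-6):
    # Central differences, one coordinate of x at a time.
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((y(A, xp, b) - y(A, xm, b)) / (2.0 * h))
    return g

# Arbitrary test data, chosen for this example.
A = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0], [3.0, 0.0, 2.0]]
x = [0.5, -1.0, 2.0]
b = [1.0, 0.0, -2.0]

ga = grad_analytic(A, x, b)
gn = grad_numeric(A, x, b)
max_err = max(abs(p - q) for p, q in zip(ga, gn))
print("analytic:", ga)
print("numeric: ", gn)
print("max abs difference:", max_err)
```

For this data, $z = (5.5, 0, 7)^T$, so the analytic gradient is $2Az = (11, -14, 61)^T$, and the finite-difference estimate should agree with it up to rounding error.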
