[Math] Chain rule for vector valued function

calculus, derivatives, vectors

I am stuck on a seemingly simple application of the chain rule. Consider this vector-valued function:

$f(\alpha,\beta)=\begin{pmatrix}\alpha^{2} \\ -\beta\end{pmatrix}$

Now I need to compute the following derivative:

$$\frac{\partial}{\partial \alpha}(f\circ f)'$$

According to the chain rule, that should be something like:

$f'(f)\cdot f'=\begin{pmatrix}2f \\ -1\end{pmatrix}\cdot\begin{pmatrix}2 \\ -1\end{pmatrix}$

However, this results in multiplying two column vectors together, which is not possible.

Where am I making the mistake? Any help would be appreciated.


Update (more complicated case):

I have functions that map 2D points to other 2D points, where a point is defined as:

$\mathbb{x}=\begin{pmatrix}x \\ y\end{pmatrix}$

Now there is one linear mapping:

$P(\mathbb{x}|a,b,c,d)=\begin{pmatrix}a & b \\ c & d\end{pmatrix}\mathbb{x}$

And one non-linear:

$L(\mathbb{x}|\kappa)=\begin{pmatrix}x^{2}\kappa \\ xy\kappa\end{pmatrix}$

Now, how do I compute the gradient of the function:

$p=L(P\mathbb{x})$ ?

The gradient should be:

$\nabla p=\begin{pmatrix}\frac{\partial}{\partial a} \\ \frac{\partial}{\partial b} \\ \vdots \\ \frac{\partial}{\partial \kappa} \end{pmatrix}$

Now a single element of the gradient should be computable using the chain rule, e.g.:

$\frac{\partial}{\partial a}p=\frac{\partial}{\partial a}L(P\mathbb{x})\cdot \frac{\partial}{\partial a}P\mathbb{x}$

The first factor of the product becomes a 2-vector, and so does the second…

Furthermore, if I go for total derivatives, since there are only 2 coordinates (x,y) and 5 variables, the Jacobians would be $2\times 5$ and therefore cannot be multiplied together.

Best Answer

I will work out the example you mention in this third version of your question.

$P$ is a function $P:\mathbb{R}^6\to\mathbb{R}^2$ defined by

$$P\begin{pmatrix} x\\y\\a\\b\\c\\d \end{pmatrix} =\begin{pmatrix}a&b\\c&d\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix} ax+by\\cx+dy \end{pmatrix}$$

$L$ is a function $L:\mathbb{R}^3\to\mathbb{R}^2$ defined by

$$L\begin{pmatrix} \kappa\\x\\y \end{pmatrix}=\begin{pmatrix} x^2\kappa\\xy\kappa \end{pmatrix}$$

The output of $P$ lies in $\mathbb{R}^2$ while $L$ takes its input from $\mathbb{R}^3$, so $L$ and $P$ cannot be composed. However, for any given value of $\kappa$ you can form the function $L_\kappa:\mathbb{R}^2\to\mathbb{R}^2$; for example, if $\kappa=3$, then the function $L_3$ is defined by

$$L_3\begin{pmatrix} x\\y\end{pmatrix}= \begin{pmatrix} 3x^2\\ 3xy \end{pmatrix}$$

Then $P$ can be composed with any choice of these functions to form $(L_\kappa\circ P):\mathbb{R}^6\to\mathbb{R}^2$.
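Writing the composite out explicitly may make this more concrete. Substituting the two components of $P$ into $L_\kappa$, with the inputs ordered as $(x,y,a,b,c,d)$ as above, gives

$$(L_\kappa\circ P)\begin{pmatrix} x\\y\\a\\b\\c\\d \end{pmatrix}=L_\kappa\begin{pmatrix} ax+by\\cx+dy \end{pmatrix}=\begin{pmatrix} \kappa(ax+by)^2\\ \kappa(ax+by)(cx+dy) \end{pmatrix}$$

Each of the two components depends on all six inputs, so the derivative of the composite is a $2\times 6$ matrix.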


Here are the derivatives: (I am using the $0$ subscripts to encourage you to think of them all as fixed numbers.)

$$P'\begin{pmatrix} x_0\\y_0\\a_0\\b_0\\c_0\\d_0 \end{pmatrix} =\begin{pmatrix} a_0 & b_0 & x_0 & y_0 & 0 & 0\\ c_0 & d_0 & 0 & 0 & x_0 & y_0 \end{pmatrix}$$

For any given $\kappa_0$,

$$L_{\kappa_0}'\begin{pmatrix} x_0\\ y_0 \end{pmatrix}=\begin{pmatrix} 2\kappa_0x_0 & 0\\ \kappa_0y_0 & \kappa_0x_0 \end{pmatrix}$$

Therefore

$$\begin{align*}
(L_{\kappa_0}\circ P)'\begin{pmatrix} x_0\\y_0\\a_0\\b_0\\c_0\\d_0 \end{pmatrix}
&=L_{\kappa_0}'\left(P\begin{pmatrix} x_0\\y_0\\a_0\\b_0\\c_0\\d_0 \end{pmatrix}\right)\cdot P'\begin{pmatrix} x_0\\y_0\\a_0\\b_0\\c_0\\d_0 \end{pmatrix}\\[2ex]
&=L_{\kappa_0}'\begin{pmatrix} a_0x_0+b_0y_0\\c_0x_0+d_0y_0 \end{pmatrix}\cdot P'\begin{pmatrix} x_0\\y_0\\a_0\\b_0\\c_0\\d_0 \end{pmatrix}\\[2ex]
&=\underbrace{\begin{pmatrix} 2\kappa_0(a_0x_0+b_0y_0) & 0\\ \kappa_0(c_0x_0+d_0y_0) & \kappa_0(a_0x_0+b_0y_0) \end{pmatrix}}_{\text{a }2\times 2\text{ matrix}}\underbrace{\begin{pmatrix} a_0 & b_0 & x_0 & y_0 & 0 & 0\\ c_0 & d_0 & 0 & 0 & x_0 & y_0 \end{pmatrix}}_{\text{a }2\times 6\text{ matrix}}
\end{align*}$$
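If you want to convince yourself that the shapes and entries work out, here is a small numerical sanity check in plain Python. It is only a sketch: the helper names (`jac_P`, `jac_L`, `matmul`, `numeric_jacobian`) and the sample point are my own choices for illustration, and the check compares the $2\times 2$ times $2\times 6$ product against central finite differences of $L_{\kappa_0}\circ P$.

```python
def P(v):
    """Linear map with its parameters treated as inputs: v = (x, y, a, b, c, d)."""
    x, y, a, b, c, d = v
    return [a * x + b * y, c * x + d * y]


def L(kappa, u):
    """Non-linear map L_kappa for a fixed kappa: u = (x, y)."""
    x, y = u
    return [kappa * x * x, kappa * x * y]


def jac_P(v):
    """2x6 Jacobian of P with respect to (x, y, a, b, c, d)."""
    x, y, a, b, c, d = v
    return [[a, b, x, y, 0.0, 0.0],
            [c, d, 0.0, 0.0, x, y]]


def jac_L(kappa, u):
    """2x2 Jacobian of L_kappa with respect to (x, y)."""
    x, y = u
    return [[2 * kappa * x, 0.0],
            [kappa * y, kappa * x]]


def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def chain_rule_jacobian(kappa, v):
    """Chain rule: L_kappa'(P(v)) times P'(v), a (2x2)(2x6) = 2x6 matrix."""
    return matmul(jac_L(kappa, P(v)), jac_P(v))


def numeric_jacobian(kappa, v, h=1e-6):
    """Central finite differences of the composite, one input at a time."""
    columns = []
    for j in range(len(v)):
        plus, minus = list(v), list(v)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = L(kappa, P(plus)), L(kappa, P(minus))
        columns.append([(f_plus[i] - f_minus[i]) / (2 * h) for i in range(2)])
    # The loop above collects columns; transpose into 2 rows of 6 entries.
    return [[columns[j][i] for j in range(len(v))] for i in range(2)]


kappa0 = 3.0
v0 = [1.0, 2.0, 0.5, -1.0, 2.0, 0.25]  # arbitrary sample (x, y, a, b, c, d)

J_chain = chain_rule_jacobian(kappa0, v0)
J_num = numeric_jacobian(kappa0, v0)
max_err = max(abs(J_chain[i][j] - J_num[i][j])
              for i in range(2) for j in range(6))
print("largest difference between the two Jacobians:", max_err)
```

The third column of the result is the pair of partial derivatives with respect to $a$, which is the single gradient element asked about in the question; the other parameters are read off their columns in the same way.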
