[Math] Gradient of a scalar function acting on a vector function

multivariable-calculus

If I have a function of a vector that is constructed from a scalar function acting on a scalar-valued function of that vector, what is its gradient?

$$\psi(x)=\phi(f(x))$$
where
$$x\in\mathbb{R}^n,\quad f:\mathbb{R}^n\rightarrow\mathbb{R},\quad \phi:\mathbb{R}\rightarrow\mathbb{R}$$

Is the following correct?
$$\nabla\psi(x)=\nabla(\phi(f(x)))=\frac{d\phi}{df}\cdot{\nabla}f(x)$$
where ${\nabla}f(x)=\left[\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2},\ldots,\frac{\partial f}{\partial x_n}\right]^T$

Best Answer

Note that $d\phi(y)$ is just multiplication by the scalar $\phi'(y)$. The chain rule $$d\psi(x)=d\phi\bigl(f(x)\bigr)\circ df(x)$$ therefore implies $$\nabla\psi(x)\cdot X=d\psi(x).X=d\phi\bigl(f(x)\bigr).\bigl(df(x).X\bigr) =\phi'\bigl(f(x)\bigr) \bigl(\nabla f(x)\cdot X\bigr)\ .$$ Since this is true for all $X\in{\mathbb R}^n$ it follows that $$\nabla\psi(x)=\phi'\bigl(f(x)\bigr)\nabla f(x)\ .$$
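
As a sanity check, the identity $\nabla\psi(x)=\phi'\bigl(f(x)\bigr)\nabla f(x)$ can be compared against a finite-difference gradient. The sketch below is only an illustration: the choices $f(x)=\sum_i x_i^2$ and $\phi=\sin$, the test point `x0`, and all function names are assumptions made for the example, not part of the question.

```python
import math

# Hypothetical example functions (not from the question):
# f: R^n -> R is the squared Euclidean norm, phi: R -> R is sin.
def f(x):
    return sum(xi * xi for xi in x)

def grad_f(x):
    # Gradient of the squared norm: 2x
    return [2.0 * xi for xi in x]

def phi(y):
    return math.sin(y)

def dphi(y):
    # Ordinary derivative of phi
    return math.cos(y)

def psi(x):
    return phi(f(x))

def grad_psi_chain(x):
    # Chain rule: grad psi(x) = phi'(f(x)) * grad f(x)
    scale = dphi(f(x))
    return [scale * g for g in grad_f(x)]

def grad_psi_numeric(x, h=1e-6):
    # Central finite differences, one coordinate at a time
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((psi(xp) - psi(xm)) / (2.0 * h))
    return grad

x0 = [0.3, -1.2, 0.7]  # arbitrary test point
analytic = grad_psi_chain(x0)
numeric = grad_psi_numeric(x0)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print("max abs difference:", max_err)
```

The two gradients should agree up to finite-difference error, which is consistent with the scalar factor $\phi'\bigl(f(x)\bigr)$ simply rescaling $\nabla f(x)$.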
