Multivariable Calculus – Deriving the Gradient of x^TAx

multivariable-calculus

I was trying to take the gradient of $x^TAx$, i.e. $\nabla_x(x^TAx)$.

I did have one idea of how to do this, which was to express $x^TAx$ as a double summation and then take the partial derivative with respect to each $x_i$. However, that seemed a little ugly to me, so I was wondering if there is a different, maybe more clever or cleaner, way to derive it.

The kind of idea I was thinking to apply was maybe the product rule of gradients:

$$\nabla_x(uF) = \nabla_xuF + u\nabla_xF$$

where $u$ is a scalar function and $F$ is a vector field. (I got the above from an OCW video.)

However, I was not 100% sure how to apply it. The issue is that if I let $u = x^T$ and $F = Ax$, then $F$ does correspond to a vector field, since it is a valid vector. However, $u = x^T$ does not correspond to a scalar function, because it's a vector. I guess maybe I am not using the most general version of the product rule? Is there a more general version of it that yields a nicer derivation of $\nabla_x(x^TAx)$?

Best Answer

Write the quadratic form in index notation and differentiate with respect to a single component $x_k$:

$$J(x) = x^TAx = \sum_{i,j} A_{ij}x_ix_j \implies \dfrac{\partial J(x)}{\partial x_k} = \sum_{i,j} A_{ij} \dfrac{\partial (x_ix_j)}{\partial x_k}$$

We have

$$\dfrac{\partial (x_ix_j)}{\partial x_k} = \delta_{ik}x_j + \delta_{jk}x_i$$

Hence,

\begin{align} \dfrac{\partial J(x)}{\partial x_k} & = \sum_{i,j} A_{ij} \left(\delta_{ik}x_j + \delta_{jk}x_i\right) = \sum_{j}A_{kj}x_j + \sum_{i}A_{ik}x_i = (Ax)_k + (A^Tx)_k \end{align}

Hence,

$$\nabla_x J(x) = \left(A+A^T\right)x$$
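As a sanity check on the result above, here is a minimal numerical sketch that compares $(A+A^T)x$ against a central-difference approximation of the gradient. It assumes NumPy is available; the function names (`quadratic_form`, `analytic_gradient`, `numerical_gradient`), the step size `h`, and the random test data are illustrative choices, not part of the derivation.

```python
import numpy as np

def quadratic_form(A, x):
    """J(x) = x^T A x."""
    return x @ A @ x

def analytic_gradient(A, x):
    """Gradient from the derivation above: (A + A^T) x."""
    return (A + A.T) @ x

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = h  # perturb only the k-th coordinate
        grad[k] = (f(x + e) - f(x - e)) / (2 * h)
    return grad

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))  # generic matrix, not assumed symmetric
x = rng.standard_normal(4)

g_exact = analytic_gradient(A, x)
g_fd = numerical_gradient(lambda v: quadratic_form(A, v), x)

# Largest componentwise disagreement; should be at round-off level.
print(np.max(np.abs(g_exact - g_fd)))
```

Because $J$ is quadratic, the central difference has no truncation error, so any disagreement comes from floating-point round-off only. When $A$ is symmetric, $A + A^T = 2A$ and the formula reduces to the familiar $2Ax$.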