Ordinary Least Squared derivation using Index Notation

The following function in $\mathbf{x}$ is given:
$$f(\mathbf{x}) = \frac{1}{2}\left\lVert \mathbf{Ax – b} \right\rVert_2^2$$

I want to calculate $\frac{\mathrm{d}f(\mathbf{x})}{\mathrm{d}\mathbf{x}}$ using index notation, where the conversion from linear algebra notation should happen immediately (i.e. no expansion of the square using $\left\lVert \mathbf{x} \right\rVert_2^2 = \mathbf{x}^\intercal\mathbf{x}$). Therefore the above is rewritten as $$ f(\mathbf{x}) = \frac{1}{2} (A_{ij}x_j – b_i)^2. $$ Using the chain rule, I immediately arrive at $$ \frac{\mathrm{d}f(\mathbf{x})}{\mathrm{d}\mathrm{x}_l} = 2\left(\frac{1}{2}(A_{ij}x_j-b_i)A_{ij}\delta_{jl}\right) = A_{il}^2x_l-A_{il}b_i, $$ or, converted to LA notation $$\frac{\mathrm{d}f(\mathbf{x})}{\mathrm{d}\mathbf{x}} = \mathbf{A}\mathbf{A}\mathbf{x}-\mathbf{A}\mathbf{b} $$ which is of course the wrong solution. I am fairly confident with matrix calculus using matrix notation, but I cant wrap my head around where the needed transpose would come from using the index notation (one can see that the result would be correct if the latter $A_{ij}$ in the above were $A_{ji}$, i.e. the transpose).

Ordinary Least Squared derivation using Index Notation

Best Answer

Related Question

Best Answer

Related Solutions

[Math] prove $\frac{1}{2}\mathbf{\nabla (u \cdot u) = u \times (\nabla \times u ) + (u \cdot \nabla)u}$ using index notation

[Math] Derivative of transpose of inverse of matrix with respect to matrix

Related Question