[Math] How to derive the Levenberg–Marquardt algorithm with matrix calculus

calculus, linear algebra, numerical methods, optimization

According to the Wikipedia article http://en.wikipedia.org/wiki/Levenberg_Marquardt:

$S(\boldsymbol\beta+\boldsymbol\delta) \approx \|\mathbf{y} - \mathbf{f}(\boldsymbol\beta) - \mathbf{J}\boldsymbol\delta\|^2$

Taking the derivative with respect to $\boldsymbol\delta$ and setting the result to zero gives:

$(\mathbf{J}^T\mathbf{J})\boldsymbol\delta = \mathbf{J}^T[\mathbf{y} - \mathbf{f}(\boldsymbol\beta)]$

My attempt to derive the equation:

$\|\mathbf{y} - \mathbf{f}(\boldsymbol\beta) - \mathbf{J}\boldsymbol\delta\|^2
= (\mathbf{y} - \mathbf{f}(\boldsymbol\beta) - \mathbf{J}\boldsymbol\delta)^T(\mathbf{y} - \mathbf{f}(\boldsymbol\beta) - \mathbf{J}\boldsymbol\delta)$

Using the product rule:

$\frac{\partial \|\mathbf{y} - \mathbf{f}(\boldsymbol\beta) - \mathbf{J}\boldsymbol\delta\|^2}{\partial \boldsymbol\delta} = (-\mathbf{J}^T)(\mathbf{y} - \mathbf{f}(\boldsymbol\beta) - \mathbf{J}\boldsymbol\delta) + (\mathbf{y} - \mathbf{f}(\boldsymbol\beta) - \mathbf{J}\boldsymbol\delta)^T(-\mathbf{J})$

The dimensions of the two terms don't match, so I believe something is wrong with my differentiation. A transpose seems to be missing, but I'm not sure what would introduce a transpose in the differentiation.

Best Answer

I wrote this tutorial article about linear and nonlinear least-squares methods. It explains the problem in matrix and vector terms, and I tried to make it easy to learn from.
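In outline, the standard matrix-calculus derivation goes as follows. Write $\mathbf{e} = \mathbf{y} - \mathbf{f}(\boldsymbol\beta)$, expand the square, and use the fact that $\mathbf{e}^T\mathbf{J}\boldsymbol\delta$ is a scalar and therefore equals its own transpose $\boldsymbol\delta^T\mathbf{J}^T\mathbf{e}$:

$\|\mathbf{e} - \mathbf{J}\boldsymbol\delta\|^2 = \mathbf{e}^T\mathbf{e} - 2\mathbf{e}^T\mathbf{J}\boldsymbol\delta + \boldsymbol\delta^T\mathbf{J}^T\mathbf{J}\boldsymbol\delta$

In the denominator-layout convention, where $\frac{\partial (\mathbf{a}^T\boldsymbol\delta)}{\partial \boldsymbol\delta} = \mathbf{a}$ and $\frac{\partial (\boldsymbol\delta^T\mathbf{A}\boldsymbol\delta)}{\partial \boldsymbol\delta} = 2\mathbf{A}\boldsymbol\delta$ for symmetric $\mathbf{A}$, this gives

$\frac{\partial \|\mathbf{e} - \mathbf{J}\boldsymbol\delta\|^2}{\partial \boldsymbol\delta} = -2\mathbf{J}^T\mathbf{e} + 2\mathbf{J}^T\mathbf{J}\boldsymbol\delta$

and setting this to zero yields $(\mathbf{J}^T\mathbf{J})\boldsymbol\delta = \mathbf{J}^T\mathbf{e}$. This also explains the missing transpose in your attempt: the two product-rule terms are the same scalar derivative written in two different layouts. Since a scalar equals its own transpose, $(\mathbf{e} - \mathbf{J}\boldsymbol\delta)^T(-\mathbf{J})$ is just the transpose of $(-\mathbf{J}^T)(\mathbf{e} - \mathbf{J}\boldsymbol\delta)$; transposing it before adding makes the dimensions agree and gives $-2\mathbf{J}^T(\mathbf{e} - \mathbf{J}\boldsymbol\delta)$.

As a quick numerical sanity check, here is a minimal sketch in Python with NumPy (the random test problem, the variable names, and the damping value $\lambda = 10^{-3}$ are illustrative assumptions, not from the question). It compares the analytic gradient $-2\mathbf{J}^T(\mathbf{e} - \mathbf{J}\boldsymbol\delta)$ against a central finite-difference gradient, then solves the damped normal equations $(\mathbf{J}^T\mathbf{J} + \lambda\mathbf{I})\boldsymbol\delta = \mathbf{J}^T\mathbf{e}$ for a single Levenberg–Marquardt step:

```python
import numpy as np

# Illustrative random test problem (not from the question):
# m residuals, n parameters, e = y - f(beta), J = Jacobian of f at beta.
rng = np.random.default_rng(0)
m, n = 6, 3
J = rng.normal(size=(m, n))
e = rng.normal(size=m)
delta = rng.normal(size=n)

def S(d):
    """Objective S(delta) = ||e - J @ delta||^2."""
    r = e - J @ d
    return r @ r

# Analytic gradient from the derivation above: dS/ddelta = -2 J^T (e - J delta)
grad_analytic = -2.0 * J.T @ (e - J @ delta)

# Central finite-difference gradient for comparison
eps = 1e-6
I = np.eye(n)
grad_fd = np.array([(S(delta + eps * I[i]) - S(delta - eps * I[i])) / (2 * eps)
                    for i in range(n)])
print(np.allclose(grad_analytic, grad_fd, atol=1e-5))  # True

# One damped (Levenberg-Marquardt) step: (J^T J + lam I) delta = J^T e;
# lam -> 0 recovers the Gauss-Newton normal equations derived above.
lam = 1e-3
step = np.linalg.solve(J.T @ J + lam * I, J.T @ e)
print(step)
```

With $\lambda = 0$ this reduces to the Gauss–Newton step $(\mathbf{J}^T\mathbf{J})\boldsymbol\delta = \mathbf{J}^T\mathbf{e}$; Levenberg–Marquardt adds the $\lambda\mathbf{I}$ (or $\lambda\,\mathrm{diag}(\mathbf{J}^T\mathbf{J})$) damping term to interpolate between Gauss–Newton and gradient descent.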