Chain rule and inverse in matrix calculus

calculus, linear algebra, matrices

I am having trouble understanding the derivation of some seemingly simple matrix derivatives and am wondering if there is an intuitive (perhaps geometric) explanation. I am reasonably well-versed in multivariate calculus and linear algebra, but am not comfortable with tensor math.

The function I am interested in is $f(t)=\mathbf{B}^T(\mathbf{X}+t\mathbf{Y})^{-1}\mathbf{A}$, where $t$ is a scalar, and $\mathbf{A},\mathbf{B},\mathbf{X},\mathbf{Y}$ are matrices with conformant dimensions.

On page 24 of the pdf of the appendix on matrix calculus in the book by Jon Dattorro (page 600 of the book), I find the formula for the first derivative of $f(t)$:

$$\frac{df}{dt}=-\mathbf{B}^T(\mathbf{X}+t\mathbf{Y})^{-1}\mathbf{Y}(\mathbf{X}+t\mathbf{Y})^{-1}\mathbf{A}$$

This sort of makes sense to me from my knowledge of the calculus of functions of a single variable: if you have $g(t)=a(x+ty)^{-1}b=ab(x+ty)^{-1}$, then $\frac{dg}{dt}=-ab(x+ty)^{-2}y=-a(x+ty)^{-1}y(x+ty)^{-1}b$ (from the chain rule and the power rule). That is, there is a clear similarity in the form.

What I don't understand is why the matrix equation for $\frac{df}{dt}$ looks the way it does. Is it due to the non-commutativity of matrix multiplication? But how exactly does that come into this problem? I've found the chain rule for matrix-valued functions in the same pdf on page 8 (eq 1749), but I am not sure how to apply it here. Maybe I don't understand something about the calculus of single-variable functions.

I guess I am asking if there is a way to derive the equation for $\frac{df}{dt}$ "from first principles" without using tensors.

Best Answer

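I think this follows more quickly from the product rule. The derivative of $t \mapsto X + tY$ is $Y$. Differentiating the identity $(X+tY)^{-1}(X+tY) = I$ gives $$0 = \frac{d}{dt}I = \frac{d}{dt} \left[(X+tY)^{-1}(X+tY)\right] = \left[\frac{d}{dt}(X+tY)^{-1}\right](X+tY) + (X+tY)^{-1}Y,$$ and multiplying on the right by $(X+tY)^{-1}$ yields $$\frac{d}{dt} (X+tY)^{-1} = -(X+tY)^{-1} Y (X+tY)^{-1}.$$ This is where non-commutativity enters: the product rule preserves the order of the factors, so $Y$ stays sandwiched between the two inverses, and they can be merged into $(X+tY)^{-2}$ only if $Y$ commutes with $X+tY$. Since $B^T$ and $A$ are constant, multiplying on the left by $B^T$ and on the right by $A$ gives the stated formula for $\frac{df}{dt}$.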
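If it helps to see the result numerically, here is a minimal sketch that compares the formula against a central finite difference. Everything in it is an assumption made for illustration: the random matrices, their sizes, the shift by $10I$ (added only to keep $X+tY$ comfortably invertible near the test point), and the helper names `f`, `df_formula`, and `df_commuted`. It also evaluates the "scalar-style" expression $-B^T(X+tY)^{-2}YA$, which is what you would get if the factors were allowed to commute.

```python
import numpy as np

# Illustrative data only: sizes and values are arbitrary assumptions.
rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n)) + 10 * np.eye(n)  # shift keeps X + tY invertible near t0
Y = rng.standard_normal((n, n))
A = rng.standard_normal((n, 2))
B = rng.standard_normal((n, 3))


def f(t):
    """f(t) = B^T (X + tY)^{-1} A."""
    return B.T @ np.linalg.inv(X + t * Y) @ A


def df_formula(t):
    """Derivative from the product-rule argument: Y sits between the inverses."""
    M_inv = np.linalg.inv(X + t * Y)
    return -B.T @ M_inv @ Y @ M_inv @ A


def df_commuted(t):
    """Scalar-style guess that (incorrectly) merges the two inverses."""
    M_inv = np.linalg.inv(X + t * Y)
    return -B.T @ M_inv @ M_inv @ Y @ A


t0, h = 0.3, 1e-6
df_numeric = (f(t0 + h) - f(t0 - h)) / (2 * h)  # central difference

err_formula = np.max(np.abs(df_numeric - df_formula(t0)))
err_commuted = np.max(np.abs(df_numeric - df_commuted(t0)))
print("max error, sandwiched formula:", err_formula)
print("max error, commuted guess:    ", err_commuted)
```

The first error should sit at the level of finite-difference noise, while the second should be visibly larger, because a generic $Y$ does not commute with $(X+tY)^{-1}$.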
