Derivative of a scalar-valued function of a matrix

calculus, matrices, matrix-calculus

Consider a scalar-valued function of a matrix:
$$s = g(\mathbf{T})$$ where $\mathbf{T}$ is a matrix.
Now suppose $\mathbf{T}$ is itself a function of a scalar variable $t$:
$$s = g(\mathbf{T}(t))$$
The goal is to find the derivative of $s$ with respect to $t$.
I approached this problem using the chain rule:
$$\dot s=\frac{\partial g}{\partial t}=\frac{\partial g}{\partial \mathbf{T}}\cdot\frac{\partial \mathbf{T}}{\partial t}$$
The problem is, the above expression results in a matrix, whereas I am expecting a scalar. Where am I wrong?

Best Answer

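The chain rule here is $$\frac{ds}{dt}=\frac{\partial s}{\partial T_{ij}}\frac{dT_{ij}}{dt},$$ with summation over the repeated indices $i$ and $j$ implied. While both factors are matrices, your mistake was thinking they're "dotted" in a way that forms a matrix, e.g. by matrix multiplication. Because both indices are contracted, the result is a scalar of the form $\operatorname{Tr}(A^TB)$, with $A=\partial s/\partial\mathbf{T}$ (the matrix whose $(i,j)$ entry is $\partial s/\partial T_{ij}$) and $B=d\mathbf{T}/dt$.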
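As a quick numerical sanity check of the contraction above, here is a minimal NumPy sketch. The particular choices $g(\mathbf{T})=\sum_{ij}T_{ij}^2$ and the $2\times 2$ matrix $\mathbf{T}(t)$ are assumptions made purely for illustration (they are not from the question), as are the helper names `T`, `dT_dt`, `g`, and `grad_g`. It compares $\operatorname{Tr}(A^TB)$, the elementwise sum $\sum_{ij}A_{ij}B_{ij}$, and a central finite difference of $s(t)$.

```python
import numpy as np

# Hypothetical example matrix T(t); any differentiable choice would do.
def T(t):
    return np.array([[t, t**2], [np.sin(t), 1.0]])

# Entrywise derivative dT/dt of the matrix above.
def dT_dt(t):
    return np.array([[1.0, 2 * t], [np.cos(t), 0.0]])

# Hypothetical scalar function g(T) = sum_ij T_ij^2.
def g(M):
    return np.sum(M**2)

# Gradient of g: the matrix with entries dg/dT_ij = 2 T_ij.
def grad_g(M):
    return 2 * M

t0 = 0.7
A = grad_g(T(t0))   # ds/dT evaluated at T(t0)
B = dT_dt(t0)       # dT/dt evaluated at t0

chain = np.trace(A.T @ B)   # Tr(A^T B): both indices contracted
elementwise = np.sum(A * B) # same contraction written as sum_ij A_ij B_ij

# Central finite difference of s(t) = g(T(t)) for comparison.
h = 1e-6
numeric = (g(T(t0 + h)) - g(T(t0 - h))) / (2 * h)

print(chain, elementwise, numeric)
```

All three numbers should agree up to finite-difference error, whereas the plain matrix product `A @ B` would be a $2\times 2$ matrix, which is the symptom described in the question.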