[Math] Partial Derivative of Gaussian function: Matrix differentiation

derivatives, matrix-calculus, partial-derivative

I am interested in the partial derivative of the following term with respect to $x_1$:

$$\mathbf g = \begin{bmatrix}x_1-k_1 ,& x_2-k_2, & x_3-k_3 \end{bmatrix} \begin{bmatrix}s_{11}& s_{12} & s_{13} \\ s_{21}& s_{22} & s_{23} \\s_{31}& s_{32} & s_{33} \end{bmatrix}^{-1} \begin{bmatrix}x_1-k_1 \\ x_2-k_2 \\ x_3-k_3 \end{bmatrix}$$
My real problem is to differentiate the Gaussian function with respect to $\mathbf x$, where the function is defined as
$$ f = \exp\left(-\frac{1}{2}({\mathbf x}-{\boldsymbol\mu})^\mathrm{T}{\boldsymbol\Sigma}^{-1}({\mathbf x}-{\boldsymbol\mu})\right)$$

$\Sigma$ is a symmetric positive definite matrix (it must be invertible, since $\Sigma^{-1}$ appears in the exponent).

I am stuck on the derivative of the term inside $\exp(\cdot)$. For $\mathbf{x}\in \mathbb{R}^2$, I could work it out as follows:

$$\begin{align}
\mathbf g &= \begin{bmatrix}x_1-k_1 ,& x_2-k_2 \end{bmatrix}\begin{bmatrix}\lambda_{11}& \lambda_{12} \\ \lambda_{21}& \lambda_{22}\end{bmatrix}\begin{bmatrix}x_1-k_1 \\ x_2-k_2\end{bmatrix} \\
& = \lambda_{11}(x_1-k_1)^2+(\lambda_{12}+\lambda_{21})(x_1-k_1)( x_2-k_2)+\lambda_{22}(x_2-k_2)^2 \\
\frac{\partial \mathbf g }{\partial x_1}
& = 2\lambda_{11}(x_1-k_1)+(\lambda_{12}+\lambda_{21})(x_2-k_2)
\end{align}
$$

where $\begin{bmatrix}\lambda_{11}& \lambda_{12} \\ \lambda_{21}& \lambda_{22}\end{bmatrix} = \begin{bmatrix}s_{11}& s_{12}\\ s_{21}& s_{22} \end{bmatrix}^{-1}$.

But as the dimension increases, this becomes complicated. Can someone show me how to differentiate such matrix expressions directly? I am looking for something like the following, and before anything else I would like to know whether it is even possible:

$$\frac{\partial \mathbf g}{\partial x_1} = \frac{\partial (\mathbf x - \mathbf k)^{\rm T}}{\partial x_1}\,\Sigma^{-1}(\mathbf x - \mathbf k) + (\mathbf x - \mathbf k)^{\rm T}\Sigma^{-1}\,\frac{\partial (\mathbf x - \mathbf k)}{\partial x_1}$$

Best Answer

For convenience, let
$$\eqalign{ y &= x-k \cr M &= M^T = \Sigma^{-1} \cr }$$

Write the function in terms of these variables and the Frobenius inner product, denoted by a colon, $A:B = {\rm tr}(A^TB)$, and find its differential. Since $k$ is constant, $dy = dx$.
$$\eqalign{ g &= M:yy^T \cr dg &= M:(dy\,y^T+y\,dy^T) \cr &= (M+M^T)y:dy \cr &= 2\,My:dy \cr &= 2\,\Sigma^{-1}(x-k):dx \cr }$$

Since $dg=\big(\frac{\partial g}{\partial x}:dx\big)$, the gradient is
$$\eqalign{ \frac{\partial g}{\partial x} &= 2\,\Sigma^{-1}(x-k) \cr }$$

To find the derivative with respect to $x_1$, take the dot product of the gradient with the first standard basis vector $e_1$:
$$\eqalign{ \frac{\partial g}{\partial x_1} &= e_1^T \,\frac{\partial g}{\partial x} \cr &= 2\,e_1^T\,\Sigma^{-1}(x-k) \cr }$$
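The gradient $2\,\Sigma^{-1}(x-k)$ can be sanity-checked numerically. The sketch below is only an illustration, not part of the original answer. It compares the closed form against a central finite difference for a $3\times 3$ case. By the chain rule, it also checks the gradient of the Gaussian itself, $\partial f/\partial x = -f\,\Sigma^{-1}(x-\mu)$. The matrix `S`, the vectors `k` and `x`, and all helper names (`solve`, `grad_g`, `grad_f`, `central_diff`) are made-up assumptions. `S` is assumed symmetric, as the derivation requires.

```python
import math

def solve(A, b):
    """Solve A v = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [[float(a) for a in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    v = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * v[j] for j in range(i + 1, n))
        v[i] = (aug[i][n] - s) / aug[i][i]
    return v

def g(x, k, S):
    """Quadratic form (x-k)^T S^{-1} (x-k)."""
    y = [xi - ki for xi, ki in zip(x, k)]
    return sum(yi * vi for yi, vi in zip(y, solve(S, y)))

def grad_g(x, k, S):
    """Closed-form gradient 2 S^{-1} (x-k); valid only for symmetric S."""
    y = [xi - ki for xi, ki in zip(x, k)]
    return [2.0 * vi for vi in solve(S, y)]

def f(x, mu, S):
    """Unnormalised Gaussian exp(-g/2)."""
    return math.exp(-0.5 * g(x, mu, S))

def grad_f(x, mu, S):
    """Chain rule: df/dx = -f * S^{-1} (x-mu) = -(f/2) * grad_g."""
    fx = f(x, mu, S)
    return [-0.5 * fx * gi for gi in grad_g(x, mu, S)]

def central_diff(func, x, i, h=1e-6):
    """Central finite-difference estimate of d func / d x_i."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (func(xp) - func(xm)) / (2.0 * h)

# Hypothetical test data: S is symmetric and diagonally dominant,
# hence positive definite.
S = [[2.0, 0.3, 0.1],
     [0.3, 1.5, 0.2],
     [0.1, 0.2, 1.0]]
k = [0.5, -1.0, 2.0]
x = [1.0, 0.0, 1.5]

analytic = grad_g(x, k, S)[0]          # e_1^T * 2 S^{-1} (x-k)
numeric = central_diff(lambda z: g(z, k, S), x, 0)
print("dg/dx1 analytic:", analytic, "numeric:", numeric)
```

Solving $\Sigma v = x-k$ rather than forming $\Sigma^{-1}$ explicitly is a deliberate choice here. With NumPy available, `numpy.linalg.solve` would play the role of the hand-written `solve`.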
