Implicitly Differentiating a System of Matrix Equations

implicit-differentiation, implicit-function-theorem, matrix-equations, matrix-calculus

Setup

Let $\mathbf a$ be an arbitrary $m\times 1$ vector, $\mathbf B$ be an arbitrary $n\times m$ matrix, with $m>n$, and $\mathbf C$ be a symmetric $m\times m$ matrix. The scalar $\lambda$ is also given.

The $n\times 1$ vector $\mathbf h$, the $m\times m$ matrix $\mathbf \Gamma$, and the scalars $w$ and $s$ satisfy the following system of equations:
\begin{align*}
\mathbf h = & \;\left(\mathbf B \,{\mathbf \Gamma}\,\mathbf B'\right)^{-1}\,\mathbf B\,{\mathbf\Gamma}\,\mathbf{a} ,\\[2ex]
\mathbf {\Gamma} =&\;\mathbf I_m +w\,\mathbf C,\\[2ex]
w=&\;\frac{1}{1-\lambda s},\\[2ex]
s=&\ \left(\mathbf h'\,\mathbf B-\mathbf a' \right)\mathbf {C}\left(\mathbf B^{\prime}\,\mathbf h -\mathbf a \right).
\end{align*}


Question

I want to find

$$\frac{\partial w}{\partial \lambda}=\mathbf?$$

The problem is that changing $\lambda$ changes $w$, which changes $\mathbf \Gamma$, and so on. So I have to somehow use the Implicit Function Theorem. For systems of scalar variables, I am used to totally differentiating the system, but I am not sure how to deal with the matrix equations in this case. Any help would be very welcome.

Best Answer

This is a pretty straightforward application of the chain rule. Given

$$\begin{aligned} 0 &= f(\lambda, w(\lambda), s(\lambda)) = (1-\lambda s) w - 1 \\[2ex] \implies 0 &= \frac{{\rm d\,} f(\lambda, w(\lambda), s(\lambda))}{{\rm d\,} \lambda} = \frac{\partial f}{\partial \lambda} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial \lambda} + \frac{\partial f}{\partial s}\frac{\partial s}{\partial {\mathbf h}}\frac{\partial {\mathbf h}}{\partial w}\frac{\partial w}{\partial \lambda} \\[2ex] &\iff \Big(\frac{\partial f}{\partial w} + \frac{\partial f}{\partial s}\frac{\partial s}{\partial {\mathbf h}}\frac{\partial {\mathbf h}}{\partial w}\Big)\frac{\partial w}{\partial \lambda} = -\frac{\partial f}{\partial \lambda} \end{aligned}$$

So your derivative can be found by solving the linear system with $-\frac{\partial f}{\partial \lambda} = sw$, $\frac{\partial f}{\partial w} = 1-\lambda s$, $\frac{\partial f}{\partial s}=-\lambda w$ and $\frac{\partial s}{\partial {\mathbf h}} = 2{\mathbf B}{\mathbf C}({\mathbf B}'{\mathbf h}-{\mathbf a})$ (using the symmetry of $\mathbf C$). For $\frac{\partial {\mathbf h}}{\partial w}$, we could compute it directly using $\partial X^{-1}= -X^{-1} (\partial X)X^{-1}$, but we can also just apply the implicit function theorem a second time:

$$\begin{aligned} 0 &= g(w, {\mathbf h}(w)) = \left(\mathbf B (\mathbb I + w \mathbf C) \mathbf B'\right) \mathbf h - \mathbf B(\mathbb I + w \mathbf C)\mathbf{a} \\[2ex] \implies 0 &= \frac{{\rm d\,} g(w, {\mathbf h}(w))}{{\rm d\,} w} = \frac{\partial g}{\partial w} + \frac{\partial g}{\partial {\mathbf h}}\frac{\partial {\mathbf h}}{\partial w} \\[2ex] &\iff \frac{\partial g}{\partial {\mathbf h}}\frac{\partial {\mathbf h}}{\partial w} = -\frac{\partial g}{\partial w} \end{aligned}$$

where $\frac{\partial g}{\partial w} = {\mathbf B}{\mathbf C}{\mathbf B}'{\mathbf h} - {\mathbf B}{\mathbf C}{\mathbf a}$ and $\frac{\partial g}{\partial {\mathbf h}} = {\mathbf B}{\mathbf \Gamma} {\mathbf B}'$. So in total your derivative is given by the solution of the nested linear system

$$\begin{aligned} (1)&&&{\mathbf B}{\mathbf \Gamma} {\mathbf B}' \tfrac{\partial \mathbf h}{\partial w} = -{\mathbf B}{\mathbf C}{\mathbf B}'\mathbf h + {\mathbf B}{\mathbf C}\mathbf a \\ (2)&&&\big(1-\lambda s + 2\lambda w (\mathbf a'- {\mathbf h}'{\mathbf B}){\mathbf C}{\mathbf B}'\tfrac{\partial \mathbf h}{\partial w}\big)\tfrac{\partial w}{\partial \lambda} = sw \end{aligned}$$
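As a cross-check on (1), here is a sketch of the direct route via $\partial X^{-1}= -X^{-1} (\partial X)X^{-1}$. The shorthand $\mathbf X = \mathbf B\,\mathbf\Gamma\,\mathbf B'$ is introduced here only for brevity, and the computation uses $\frac{\partial \mathbf\Gamma}{\partial w} = \mathbf C$ together with $\mathbf X^{-1}\mathbf B\,\mathbf\Gamma\,\mathbf a = \mathbf h$:

$$\begin{aligned} \frac{\partial \mathbf h}{\partial w} &= -\mathbf X^{-1}\left(\mathbf B\mathbf C\mathbf B'\right)\mathbf X^{-1}\mathbf B\,\mathbf\Gamma\,\mathbf a + \mathbf X^{-1}\mathbf B\mathbf C\mathbf a \\[2ex] &= \mathbf X^{-1}\left(-\mathbf B\mathbf C\mathbf B'\mathbf h + \mathbf B\mathbf C\mathbf a\right) \end{aligned}$$

Multiplying both sides by $\mathbf X = \mathbf B\,\mathbf\Gamma\,\mathbf B'$ recovers (1), so the two routes agree.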

If you want to, you can plug everything back in and try to simplify it. As a final remark, you may wonder why $\frac{\partial s}{\partial {\mathbf h}}$ appears transposed in the formula. This is due to how the tensor contraction between $\frac{\partial s}{\partial {\mathbf h}}$ and $\frac{\partial {\mathbf h}}{\partial w}$ operates: both are $n\times 1$ vectors, so one of them has to enter as a row. You can either write it out or note that the result needs to be a scalar again.
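If you would like to sanity-check (1) and (2) numerically, the following NumPy sketch may help. It is only an illustration under stated assumptions: the dimensions, the random data, and the helper names (`solve_h`, `s_of_w`, `lam_of_w`, `dw_dlam`) are made up for this example, and $\mathbf C$ is taken to be positive definite (stronger than the symmetry assumed in the question) so that $s>0$. To avoid solving the nonlinear system, it picks $w$ first and backs out the matching $\lambda = (w-1)/(ws)$, then compares the formula for $\partial w/\partial\lambda$ with the reciprocal of a central finite difference of $\lambda(w)$.

```python
import numpy as np


def solve_h(w, a, B, C):
    """Return h(w) = (B Gamma B')^{-1} B Gamma a with Gamma = I + w C."""
    Gamma = np.eye(C.shape[0]) + w * C
    return np.linalg.solve(B @ Gamma @ B.T, B @ Gamma @ a)


def s_of_w(w, a, B, C):
    """Return s = (B'h - a)' C (B'h - a) evaluated at h(w)."""
    r = B.T @ solve_h(w, a, B, C) - a
    return r @ C @ r


def lam_of_w(w, a, B, C):
    """Invert w = 1 / (1 - lam * s) for lam, so (lam, w) solves the system."""
    return (w - 1.0) / (w * s_of_w(w, a, B, C))


def dw_dlam(w, a, B, C):
    """Evaluate dw/dlam from the nested linear system (1)-(2)."""
    h = solve_h(w, a, B, C)
    r = B.T @ h - a                      # residual B'h - a
    s = r @ C @ r
    lam = (w - 1.0) / (w * s)
    Gamma = np.eye(C.shape[0]) + w * C
    # (1): B Gamma B' dh/dw = -B C B' h + B C a = -B C r
    dh_dw = np.linalg.solve(B @ Gamma @ B.T, -B @ C @ r)
    # (2): (1 - lam s + 2 lam w (a' - h'B) C B' dh/dw) dw/dlam = s w
    denom = 1.0 - lam * s - 2.0 * lam * w * (r @ C @ B.T @ dh_dw)
    return s * w / denom


# Illustrative data (assumption: sizes and seed are arbitrary choices).
rng = np.random.default_rng(0)
n, m = 2, 4
a = rng.normal(size=m)
B = rng.normal(size=(n, m))
M = rng.normal(size=(m, m))
C = M @ M.T / m + np.eye(m)              # symmetric positive definite

w0, eps = 1.5, 1e-6
dlam_dw = (lam_of_w(w0 + eps, a, B, C) - lam_of_w(w0 - eps, a, B, C)) / (2 * eps)
finite_diff = 1.0 / dlam_dw              # dw/dlam by the inverse function rule
analytic = dw_dlam(w0, a, B, C)
print(analytic, finite_diff)
```

The two printed numbers should agree to several digits. Parametrizing by $w$ rather than $\lambda$ is just a convenience for the check; if you need the derivative at a given $\lambda$, you would first have to solve the system for $w$ (for example with a scalar root-finder) and then call `dw_dlam`.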
