We know that the inverse of the Fisher information is of the form:
$$F^{-1} = \left[
\begin{array}{cc}
A^{-1}+A^{-1}B(F/A)^{-1}CA^{-1}& ... \\
... & ...
\end{array}\right]$$
where $(F/A) = D - CA^{-1}B$ is the Schur complement of the block $A$ in $F$.
Let's show that the diagonal elements of $A^{-1}+A^{-1}B(F/A)^{-1}CA^{-1}$ are at least as large as those of $A^{-1}$, which is equivalent to proving that the diagonal elements of the matrix $R = A^{-1}B(F/A)^{-1}CA^{-1}$ are nonnegative.
Let $p$ denote the dimension of the upper-left block.
We'll just prove that for every vector $e_i$ of the canonical basis of $\mathbb{R}^p$,
$$e_i^T R e_i \geq 0.$$ Indeed, since $e_i^T R e_i = R_{ii}$, this gives exactly what we want.
First note that since $F$ is the Fisher information matrix, it is symmetric and positive definite. It follows that:
- $A$ is positive definite,
- $B^T = C$,
- the Schur complements of $F$, namely $F/A$ and $F/D$, are symmetric positive definite.
Since $F/A$ is symmetric positive definite, its inverse $(F/A)^{-1}$ is as well, and therefore there exists a symmetric positive definite matrix $Q$ (for instance the symmetric square root of $(F/A)^{-1}$) such that $(F/A)^{-1} = Q^T Q$.
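As a side note, here is a minimal numerical sketch of one concrete choice of $Q$ (the symmetric square root), with a small illustrative matrix standing in for $(F/A)^{-1}$; none of this is needed for the proof itself.

```python
import numpy as np

S_inv = np.array([[2.0, 0.5],
                  [0.5, 1.0]])        # stand-in for (F/A)^{-1}, SPD by assumption
w, V = np.linalg.eigh(S_inv)          # eigendecomposition of a symmetric matrix
Q = V @ np.diag(np.sqrt(w)) @ V.T     # symmetric square root, so Q = Q^T and Q Q = S_inv
print(np.allclose(S_inv, Q.T @ Q))    # expected: True
```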
Using this together with $B = C^T$, we can write $R$ as
$$R = A^{-1} C^T Q^T QCA^{-1} = \left(Q C A^{-1}\right)^T\left(QCA^{-1}\right)$$
Therefore $$e_i^T R e_i = (QCA^{-1}e_i)^T(QCA^{-1}e_i) = \lVert QCA^{-1}e_i\rVert_2^2 \geq 0. $$
Hence the result.
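If you want to convince yourself numerically, here is a short sketch that draws a random symmetric positive definite matrix playing the role of $F$ and compares the diagonal of the upper-left block of $F^{-1}$ with the diagonal of $A^{-1}$ (the sizes and seed are arbitrary).

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 3                          # total size and size of the upper-left block

M = rng.standard_normal((n, n))
F = M @ M.T + n * np.eye(n)          # random symmetric positive definite "F"

A = F[:p, :p]
top_left_of_F_inv = np.linalg.inv(F)[:p, :p]

# each diagonal entry of the upper-left block of F^{-1} dominates that of A^{-1}
print(np.diag(top_left_of_F_inv) >= np.diag(np.linalg.inv(A)))   # expected: all True
```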
Hope this is useful.
Additional note: proof that $F^{-1} = \left[
\begin{array}{cc}
A^{-1}+A^{-1}B(F/A)^{-1}CA^{-1}& ... \\
... & ...
\end{array}\right]$
Write $F$ as $$F = \left[
\begin{array}{cc}
A & B \\
C & D
\end{array}\right] =
\left[
\begin{array}{cc}
I_p & 0 \\
CA^{-1} & I_q
\end{array}\right]
\left[
\begin{array}{cc}
A & 0 \\
0 & D - C A^{-1} B
\end{array}\right]
\left[
\begin{array}{cc}
I_p & A^{-1}B \\
0 & I_q
\end{array}\right]$$
It follows that
$$
F^{-1} =
\left[
\begin{array}{cc}
I_p & A^{-1}B \\
0 & I_q
\end{array}\right]^{-1}
\left[
\begin{array}{cc}
A & 0 \\
0 & D - C A^{-1} B
\end{array}\right]^{-1}
\left[
\begin{array}{cc}
I_p & 0 \\
CA^{-1} & I_q
\end{array}\right]^{-1}.
$$
It's easy to check that
$$\left[
\begin{array}{cc}
I_p & 0 \\
CA^{-1} & I_q
\end{array}\right]^{-1}=\left[
\begin{array}{cc}
I_p & 0 \\
-CA^{-1} & I_q
\end{array}\right]$$
and
$$\left[
\begin{array}{cc}
I_p & A^{-1}B \\
0 & I_q
\end{array}\right]^{-1}=\left[
\begin{array}{cc}
I_p & -A^{-1}B \\
0 & I_q
\end{array}\right]$$
and
$$\left[
\begin{array}{cc}
A & 0 \\
0 & D - CA^{-1}B
\end{array}\right]^{-1}=\left[
\begin{array}{cc}
A^{-1} & 0 \\
0 & (D - CA^{-1}B)^{-1}
\end{array}\right].$$
Therefore
$$
F^{-1} =
\left[
\begin{array}{cc}
I_p & -A^{-1}B \\
0 & I_q
\end{array}\right]
\left[
\begin{array}{cc}
A^{-1} & 0 \\
0 & (D - C A^{-1} B)^{-1}
\end{array}\right]
\left[
\begin{array}{cc}
I_p & 0 \\
-CA^{-1} & I_q
\end{array}\right] =
\left[
\begin{array}{cc}
A^{-1} + A^{-1}B(F/A)^{-1}CA^{-1} & -A^{-1}B(F/A)^{-1}\\
-(F/A)^{-1}CA^{-1} & (F/A)^{-1}
\end{array}\right]
$$
where $F/A = D - CA^{-1}B$ .
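As a quick sanity check (not a substitute for the derivation), the sketch below builds a random symmetric positive definite $F$, forms $A$, $B$, $C$, $D$ and the Schur complement $F/A$, and compares the resulting blocks with the blocks of $F^{-1}$ computed directly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 2
M = rng.standard_normal((n, n))
F = M @ M.T + n * np.eye(n)          # random symmetric positive definite "F"

A, B = F[:p, :p], F[:p, p:]
C, D = F[p:, :p], F[p:, p:]

A_inv = np.linalg.inv(A)
S = D - C @ A_inv @ B                # Schur complement F/A
S_inv = np.linalg.inv(S)

F_inv = np.linalg.inv(F)
print(np.allclose(F_inv[:p, :p], A_inv + A_inv @ B @ S_inv @ C @ A_inv))  # True
print(np.allclose(F_inv[:p, p:], -A_inv @ B @ S_inv))                     # True
print(np.allclose(F_inv[p:, p:], S_inv))                                  # True
```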
No. You can't even calculate $E[1/X]$ from $E[X]$ in general for a single random variable, let alone its (co)variance.
Example: let $X$ be a Rademacher RV, so its variance is $1$ and the variance of $1/X$ is also $1$ (because $1/X=X$ for this particular distribution).
Now let $Y$ be an RV with possible values $2$ and $-2$, i.e. a slight variation of the Rademacher RV, and let $P(Y=2)=p$. The variance of $Y$ is
$$\operatorname{var}(Y)=E[Y^2]-E[Y]^2=4-(2p-2(1-p))^2$$
This is equal to $1$ if $p=(2+\sqrt 3)/4$. Now, calculating $\operatorname{var}(1/Y)$ for that same $p$:
$$\operatorname{var}(1/Y)=E[1/Y^2]-E[1/Y]^2=\frac{1}{4}-\left(\frac{p}{2}-\frac{1-p}{2}\right)^2=\frac{1}{16}\neq 1.$$
So both random variables have the same variance, but their reciprocals do not, which means you can't actually determine $\operatorname{var}(1/Y)$ from $\operatorname{var}(Y)$.
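For what it's worth, here is a tiny numerical check of the two variances in this example (the values and probabilities are just the ones defined above).

```python
import numpy as np

p = (2 + np.sqrt(3)) / 4
vals = np.array([2.0, -2.0])          # possible values of Y
probs = np.array([p, 1 - p])          # P(Y = 2) = p, P(Y = -2) = 1 - p

def var(v):
    """Variance of a discrete RV taking values v with probabilities probs."""
    return np.sum(probs * v**2) - np.sum(probs * v)**2

print(var(vals))       # var(Y)   ~ 1.0
print(var(1 / vals))   # var(1/Y) ~ 0.0625 = 1/16, not 1
```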
Best Answer
There are basically two things to be said. The first is that if you look at the density of the multivariate normal distribution (with mean $0$ here), it is proportional to $$\exp\left(-\frac{1}{2}x^T P x\right)$$ where $P = \Sigma^{-1}$ is the inverse of the covariance matrix, also called the precision. This matrix is positive definite and defines, via $$(x,y) \mapsto x^T P y,$$ an inner product on $\mathbb{R}^p$. The resulting geometry, which gives specific meaning to the concept of orthogonality and defines a norm related to the normal distribution, is important; to understand, for instance, the geometric content of LDA, you need to view things in light of the geometry given by $P$.
The other thing to be said is that the partial correlations can be read off directly from $P$, see here. The same Wikipedia page notes that the partial correlations, and thus the entries of $P$, have a geometric interpretation as the cosine of an angle. What is, perhaps, more important in the context of partial correlations is that the partial correlation between $X_i$ and $X_j$ is $0$ if and only if entry $i,j$ of $P$ is zero. For the normal distribution the variables $X_i$ and $X_j$ are then conditionally independent given all the other variables. This is what Steffen's book, which I referred to in the comment above, is all about: conditional independence and graphical models. It has a fairly complete treatment of the normal distribution, but it may not be that easy to follow.
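To make the "read off from $P$" part concrete, here is a small sketch using the standard relation $\rho_{ij\cdot\text{rest}} = -P_{ij}/\sqrt{P_{ii}P_{jj}}$; the covariance matrix below is just an arbitrary illustrative example.

```python
import numpy as np

Sigma = np.array([[2.0, 0.8, 0.3],
                  [0.8, 1.5, 0.0],
                  [0.3, 0.0, 1.0]])   # an arbitrary SPD covariance matrix
P = np.linalg.inv(Sigma)              # precision matrix

d = np.sqrt(np.diag(P))
partial_corr = -P / np.outer(d, d)    # rho_ij = -P_ij / sqrt(P_ii * P_jj)
np.fill_diagonal(partial_corr, 1.0)

print(np.round(partial_corr, 3))
# Under a multivariate normal model, a zero entry (i, j) here means X_i and X_j
# are conditionally independent given the remaining variables.
```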