The first definition is a special case of the second. Rather than a Hilbert space $H$, let's look at a Banach space $X$ equipped with a probability measure $\mathbf P$, and distinguish it from its dual space $X^*$. Every continuous linear functional $\varphi \in X^*$ is then a random variable $\varphi : X \to \mathbb R$, so it makes sense to take expectations and covariances. We define the expectation of a functional by $\mathbb E[\varphi] = \int_X \varphi[x] \, \mathrm d \mathbf P(x)$, and the covariance of two functionals to be
$$\operatorname{cov}[\psi|\varphi] = \int_X \big( \psi[x] - \mathbb E[\psi]\big) \big( \varphi[x] - \mathbb E[\varphi]\big) \, \mathrm d \mathbf P(x).$$
Now, consider a probability measure $\mathbf P$ on a Hilbert space $X = H$. By the Riesz representation theorem, we know that the dual space $H^*$ is isomorphic to $H$, and all the functionals are of the form $\varphi_h[x] := \langle h, x \rangle$.
The mean of $\mathbf P$ can be represented by a single element $m \in H$, which is called the "Pettis integral" of $\mathbf P$. This element satisfies $\mathbb E[\varphi_h] = \varphi_h[m] = \langle h, m \rangle$ for all $h \in H$.
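As a sanity check, here is a minimal numerical sketch, assuming we take $H = \mathbb R^3$ so that the Pettis integral reduces to the ordinary mean vector; the distribution and the vector `h` are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from a measure P on H = R^3 (an arbitrary shifted
# Gaussian, purely for illustration).
true_mean = np.array([1.0, -2.0, 0.5])
samples = rng.normal(size=(100_000, 3)) + true_mean

# The mean element m: in finite dimensions the Pettis integral
# is just the componentwise mean.
m = samples.mean(axis=0)

# Check E[phi_h] = <h, m> for an arbitrary functional phi_h.
h = np.array([0.3, -1.2, 2.0])
lhs = (samples @ h).mean()  # E[phi_h], estimated by Monte Carlo
rhs = h @ m                 # phi_h[m] = <h, m>
print(lhs, rhs)             # agree up to sampling error
```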
Consequently,
$$\operatorname{cov}[\varphi_h|\varphi_{h'}] = \int_X \big\langle h, x - m \big\rangle \big\langle h', x-m \big\rangle \, \mathrm d \mathbf P(x).$$
This is the formula you were looking for. It's just a special case of the usual covariance formula, specialized to the setting where the random variables of interest are continuous linear functionals, and the probability space is a Hilbert space.
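Continuing the finite-dimensional sketch, if $\mathbf P$ has covariance matrix $\Sigma$ then this integral equals $h^\top \Sigma\, h'$; the particular $\Sigma$, $h$, $h'$ below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# A measure P on R^3 with a known covariance matrix Sigma.
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
samples = rng.multivariate_normal(np.zeros(3), Sigma, size=200_000)

h  = np.array([1.0, 0.0, -1.0])
h2 = np.array([0.5, 2.0,  1.0])

# Monte Carlo version of the integral E[<h, x-m><h', x-m>].
centered = samples - samples.mean(axis=0)
mc = ((centered @ h) * (centered @ h2)).mean()

print(mc, h @ Sigma @ h2)  # the two agree up to sampling error
```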
Yes, that is true.
If $X$ and $Y$ are independent, then their correlation, and therefore their covariance, is zero (but the converse is not necessarily true).
Granting independence, squaring the random variables separately does not introduce any dependence: measurable functions of independent random variables are themselves independent, so $X^2$ and $Y^2$ remain independent and hence uncorrelated.
In this way we conclude $\operatorname{Cov}(X,Y) = \operatorname{Cov}(X^2,Y^2) = 0$.
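A quick simulation illustrates both the claim and the failure of the converse; the particular distributions are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Independent draws: X ~ N(0, 1), Y ~ Uniform(-1, 1).
x = rng.normal(size=n)
y = rng.uniform(-1.0, 1.0, size=n)

print(np.cov(x, y)[0, 1])        # Cov(X, Y)     ~ 0
print(np.cov(x**2, y**2)[0, 1])  # Cov(X^2, Y^2) ~ 0

# Contrast: Y = S*X with S = +/-1 independent of X is
# uncorrelated with X but not independent of it, and the
# squares are maximally correlated since Y^2 = X^2.
s = rng.choice([-1.0, 1.0], size=n)
y_dep = s * x
print(np.cov(x, y_dep)[0, 1])        # ~ 0
print(np.cov(x**2, y_dep**2)[0, 1])  # = Var(X^2) > 0, not zero
```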
You can't, but you can bound it. First, you bound $\text{cor}(X, Z)$ as a function of the two other correlations. The reasoning behind the bound is that the correlation matrix of the vector $(X, Y, Z)$ must be positive semidefinite. Without repeating the algebra: write down the determinant of the $3\times 3$ correlation matrix and impose that it be nonnegative. This is a quadratic inequality in $\rho_{XZ}$, and it is satisfied exactly when
$$ |\rho_{XZ} - \rho_{XY}\rho_{YZ}| \le \sqrt{(1- \rho_{XY}^2)(1- \rho_{YZ}^2)}. $$
This is the form of the bound used most often.

Now, onto the original question. You mentioned that the marginal distributions of the three variables are known, hence their standard deviations are too (as are the correlations $\rho_{XY}, \rho_{YZ}$). You can therefore rewrite the bound as
$$ \left| \frac{\text{cov}(X,Z)}{\sigma_X\sigma_Z} - \frac{\text{cov}(X,Y)\text{cov}(Y,Z)}{\sigma_X\sigma_Y^2\sigma_Z} \right| \le \sqrt{(1- \rho_{XY}^2)(1- \rho_{YZ}^2)} $$
or
$$ \left|\text{cov}(X,Z) - \frac{\text{cov}(X,Y)\text{cov}(Y,Z)}{\sigma_Y^2} \right| \le \sigma_X\sigma_Z \sqrt{(1- \rho_{XY}^2)(1- \rho_{YZ}^2)}. $$
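Here is a short numerical sketch of the bound, with made-up values for the known quantities; it also checks that the endpoints are exactly where the $3\times 3$ correlation matrix becomes singular:

```python
import numpy as np

# Known inputs (arbitrary illustrative values).
rho_xy, rho_yz = 0.8, 0.5
sx, sz = 2.0, 3.0

# Bound on rho_xz from positive semidefiniteness.
center = rho_xy * rho_yz
half_width = np.sqrt((1 - rho_xy**2) * (1 - rho_yz**2))
lo, hi = center - half_width, center + half_width
print(f"rho_xz   in [{lo:.4f}, {hi:.4f}]")
print(f"cov(X,Z) in [{sx * sz * lo:.4f}, {sx * sz * hi:.4f}]")

# Sanity check: the determinant of the correlation matrix
# vanishes at the endpoints and is positive at the center.
for r in (lo, hi, center):
    R = np.array([[1.0,    rho_xy, r     ],
                  [rho_xy, 1.0,    rho_yz],
                  [r,      rho_yz, 1.0   ]])
    print(f"rho_xz = {r:+.4f}: det = {np.linalg.det(R):.6f}")
```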