Inferring constraints on $B$ given $Y = XB$ and the covariance matrix $\Sigma_{YX} = \Sigma_X B$

covariance, linear algebra, linear-transformations, statistical-inference, statistics

Say we have a simple linear model (ignoring any error term) $Y = XB$, where $Y$ is a vector of $m$ predicted values for a single outcome, given as the product of an $m \times n$ matrix $X$ (containing $m$ observations of the random variables $x_1, \dots, x_n$) and a vector of weights $B$.

For $n = 3$ (assuming each $x_i$ has unit variance, so the diagonal entries are 1), we get the following covariance matrix for $X$, which, being a covariance matrix, must be positive semidefinite.

$$ \Sigma_X = \left[\begin{array}{ccc} 1 & cov(x_1, x_2) & cov(x_1, x_3) \\ cov(x_1, x_2) & 1 & cov(x_2, x_3) \\ cov(x_1, x_3) & cov(x_2, x_3) & 1 \end{array}\right] $$

Further, given the definition of $Y$, the covariance matrix between $Y$ and $X$ is

$$ \Sigma_{YX} = \Sigma_X B = \left[\begin{array}{ccc} 1 & cov(x_1, x_2) & cov(x_1, x_3) \\ cov(x_1, x_2) & 1 & cov(x_2, x_3) \\ cov(x_1, x_3) & cov(x_2, x_3) & 1 \end{array}\right] \cdot \left[\begin{array}{c} b_1 \\ b_2 \\ b_3 \end{array}\right] = \left[\begin{array}{c} cov(y, x_1) \\ cov(y, x_2) \\ cov(y, x_3) \end{array}\right] $$
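As a quick sanity check, here is a minimal simulation sketch of this identity (assuming, as above, that the $x_i$ are standardized to unit variance; the particular $\Sigma_X$ and $B$ below are just hypothetical example values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example correlation matrix for x1, x2, x3 and weight vector B
Sigma_X = np.array([[1.0, 0.3, 0.2],
                    [0.3, 1.0, 0.5],
                    [0.2, 0.5, 1.0]])
B = np.array([0.4, -0.2, 0.1])

# Simulate many rows of X with covariance Sigma_X, then form Y = X B
m = 500_000
X = rng.multivariate_normal(mean=np.zeros(3), cov=Sigma_X, size=m)
Y = X @ B

# Empirical cov(y, x_i) should approach the theoretical Sigma_X @ B
empirical = np.array([np.cov(Y, X[:, i])[0, 1] for i in range(3)])
print(empirical)     # roughly [0.36, -0.03, 0.08]
print(Sigma_X @ B)   # exact:   [0.36, -0.03, 0.08]
```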

It follows that $B = \Sigma_X^{-1} \Sigma_{YX}$ (assuming $\Sigma_X$ is invertible). Scaling the covariances to correlations and forming the partitioned correlation matrix, we get

$$ P = \left[\begin{array}{cc} \Sigma_Y & \Sigma_{YX}' \\ \Sigma_{YX} & \Sigma_X \end{array}\right] =
\left[\begin{array}{cccc} 1 & cor(y, x_1) & cor(y, x_2) & cor(y, x_3) \\ cor(y, x_1) & 1 & cor(x_1, x_2) & cor(x_1, x_3) \\ cor(y, x_2) & cor(x_1, x_2) & 1 & cor(x_2, x_3) \\ cor(y, x_3) & cor(x_1, x_3) & cor(x_2, x_3) & 1 \end{array}\right] $$

At this point, my understanding is that we can't choose just any $B$, because some choices would violate the cross-correlation constraints in the partitioned matrix $P$, i.e. render it NOT positive semidefinite. Intuitively (and as this helpful post explains), $cor(y, x_1)$ and $cor(x_1, x_2)$, for instance, put bounds on $cor(y, x_2)$.

My question is: can I do better than a step-wise algorithm that sets the $cor(y, x_i)$ one at a time, checking that all the 'correlation triangles' remain consistent? Is there a way to calculate the constraints on $B$ that render $P$ valid (positive semidefinite), given a $\Sigma_X$ that already has consistent correlations, i.e. is itself positive semidefinite? We can assume everything is Gaussian if that simplifies things.

Another way of phrasing the problem is:

Which $B$ ensure that $P$ is positive semidefinite, given that $\Sigma_X$ is positive semidefinite, and where
$$ P = \left[\begin{array}{cc} 1 & (\Sigma_{X}B)' \\ \Sigma_{X}B & \Sigma_X \end{array}\right]?$$
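For any particular candidate $B$, the condition can of course be tested numerically by assembling $P$ and inspecting its smallest eigenvalue; this is only a brute-force sketch of the validity check (with hypothetical example values), not the closed-form constraint being asked for:

```python
import numpy as np

def P_is_psd(Sigma_X: np.ndarray, B: np.ndarray, tol: float = 1e-10) -> bool:
    """Assemble P = [[1, (Sigma_X B)'], [Sigma_X B, Sigma_X]] and test whether it is PSD."""
    s = Sigma_X @ B
    P = np.block([[np.ones((1, 1)), s[None, :]],
                  [s[:, None], Sigma_X]])
    return np.linalg.eigvalsh(P).min() >= -tol

# Hypothetical example values
Sigma_X = np.array([[1.0, 0.3, 0.2],
                    [0.3, 1.0, 0.5],
                    [0.2, 0.5, 1.0]])
print(P_is_psd(Sigma_X, np.array([0.4, -0.2, 0.1])))  # True: a modest B is fine
print(P_is_psd(Sigma_X, np.array([2.0, 2.0, 2.0])))   # False: implied correlations too large
```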

Best Answer

From the context, I take the prime to denote transposition, and I answer in terms of the final phrasing of the question.

Answer: $P$ is positive semi-definite (PSD) iff $B'\Sigma_X B \leq 1$. Note that $0 \leq B'\Sigma_X B$ always holds, since $\Sigma_X$ is PSD.

Since $\Sigma_X$ is PSD, there is a unique PSD matrix $M$ such that $M^2 = \Sigma_X$; this follows from the eigen-decomposition of $\Sigma_X$. By the eigen-decomposition of $M$, we also see that $\operatorname{ran}(M) = \operatorname{ran}(\Sigma_X)$. Let $b = MB$. To say that $P$ is PSD is to say that
$$\begin{align} & \forall \alpha \in \mathbb{R}\ \forall A \in \mathbb{R}^n: 0\leq \begin{bmatrix}\alpha & A'\end{bmatrix} \begin{bmatrix} 1 & (\Sigma_{X}B)' \\ \Sigma_{X}B & \Sigma_X \end{bmatrix} \begin{bmatrix}\alpha \\ A\end{bmatrix} \\ \iff& \forall \alpha \in \mathbb{R}\ \forall A \in \mathbb{R}^n: 0\leq \begin{bmatrix}\alpha & A'\end{bmatrix} \begin{bmatrix} 1 & B'M^2 \\ M^2B & MM \end{bmatrix} \begin{bmatrix}\alpha \\ A\end{bmatrix} \\ \iff& \forall \alpha \in \mathbb{R}\ \forall A \in \mathbb{R}^n: 0\leq \begin{bmatrix}\alpha & A'M\end{bmatrix} \begin{bmatrix} 1 & B'M \\ MB & I \end{bmatrix} \begin{bmatrix}\alpha \\ MA\end{bmatrix} \\ \iff& \forall \alpha \in \mathbb{R}\ \forall a \in \operatorname{ran}(M) \text{ (writing } a = MA\text{)}: 0\leq \begin{bmatrix}\alpha & a'\end{bmatrix} \begin{bmatrix} 1 & b' \\ b & I \end{bmatrix} \begin{bmatrix}\alpha \\ a\end{bmatrix} \\ \iff& \forall \alpha \in \mathbb{R}\ \forall a \in \operatorname{ran}(M): 0\leq \alpha^2 + \alpha a'b + \alpha b'a + a'a \\ \iff& \forall \alpha \in \mathbb{R}\ \forall a \in \operatorname{ran}(M): 0\leq (\alpha + a'b)^2 - (a'b)^2 + a'a \\ \iff& \forall a \in \operatorname{ran}(M): (a'b)^2 - a'a \leq \inf_{\alpha \in \mathbb{R}}(\alpha + a'b)^2 = 0 \\ \iff& \forall \beta \in \mathbb{R}\ \forall u \in \operatorname{ran}(M) \cap\{b\}^\perp \text{ (writing } a = \beta b + u\text{)}: (a'b)^2 \leq a'a \\ \iff& \forall \beta \in \mathbb{R}\ \forall u \in \operatorname{ran}(M) \cap\{b\}^\perp: \beta^2(b'b)^2 \leq \beta^2 b'b + u'u \\ \iff& \forall \beta \in \mathbb{R}: \beta^2(b'b)^2 - \beta^2 b'b \leq \inf_{u \in \operatorname{ran}(M) \cap\{b\}^\perp} u'u = 0 \\ \iff& \forall \beta \in \mathbb{R}: \beta^2\, b'b\, (b'b - 1) \leq 0 \\ \iff& 0 \leq b'b \leq 1 \\ \iff& 0 \leq B'\Sigma_X B \leq 1 \\ \iff& B'\Sigma_X B \leq 1 \end{align}$$
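As a numerical cross-check of this equivalence (a sketch only, using randomly generated correlation matrices $\Sigma_X$ and random $B$), the smallest eigenvalue of $P$ should be non-negative exactly when $B'\Sigma_X B \leq 1$:

```python
import numpy as np

rng = np.random.default_rng(1)

def min_eig_P(Sigma_X, B):
    """Smallest eigenvalue of P = [[1, (Sigma_X B)'], [Sigma_X B, Sigma_X]]."""
    s = Sigma_X @ B
    P = np.block([[np.ones((1, 1)), s[None, :]],
                  [s[:, None], Sigma_X]])
    return np.linalg.eigvalsh(P).min()

for _ in range(10_000):
    n = int(rng.integers(2, 6))
    A = rng.normal(size=(n, n))
    Sigma_X = A @ A.T                        # random PSD matrix
    d = np.sqrt(np.diag(Sigma_X))
    Sigma_X = Sigma_X / np.outer(d, d)       # rescale to a correlation matrix
    B = rng.normal(size=n)
    q = B @ Sigma_X @ B
    if abs(q - 1.0) < 1e-6:
        continue                             # skip draws too close to the boundary
    assert (min_eig_P(Sigma_X, B) >= -1e-9) == (q <= 1.0)
print("PSD(P) matched B' Sigma_X B <= 1 on all random draws")
```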

Geometric interpretation: By the eigen-decomposition of $\Sigma_X$, we can further observe that $\ker(\Sigma_X)^\perp = \operatorname{ran}(\Sigma_X) = \operatorname{span}\{v_1,\dots,v_k\}$, where $v_1,\dots,v_k$ are unit eigenvectors whose corresponding eigenvalues (with multiplicity) $\lambda_1,\dots,\lambda_k$ are non-zero. Then $B \in \mathbb{R}^n$ can be uniquely expressed as $u + \sum_{i=1}^k b_i v_i$ with $u\in\ker(\Sigma_X)$. Thus the constraint $1 \geq B'\Sigma_X B = \sum_{i=1}^k \lambda_i b_i^2$ describes the set of valid $B$ as the direct sum of the subspace $\ker(\Sigma_X)$ and the solid ellipsoid in $\operatorname{ran}(\Sigma_X)$ with semi-axes $1/\sqrt{\lambda_i}$ pointing in the $v_i$ directions.
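To illustrate the geometric picture numerically (a sketch with a hypothetical positive-definite $\Sigma_X$, so that $\ker(\Sigma_X)$ is trivial): scaling any direction $d$ so that $B'\Sigma_X B = 1$ places $B$ on the boundary of the ellipsoid, where $P$ is singular but still PSD; shrinking $B$ moves strictly inside (P positive definite), while inflating it moves outside (P no longer PSD):

```python
import numpy as np

Sigma_X = np.array([[1.0, 0.3, 0.2],
                    [0.3, 1.0, 0.5],
                    [0.2, 0.5, 1.0]])

def min_eig_P(Sigma_X, B):
    """Smallest eigenvalue of P = [[1, (Sigma_X B)'], [Sigma_X B, Sigma_X]]."""
    s = Sigma_X @ B
    P = np.block([[np.ones((1, 1)), s[None, :]],
                  [s[:, None], Sigma_X]])
    return np.linalg.eigvalsh(P).min()

# Scale an arbitrary direction d onto the ellipsoid boundary B' Sigma_X B = 1
d = np.array([1.0, -2.0, 0.5])
B_boundary = d / np.sqrt(d @ Sigma_X @ d)

print(min_eig_P(Sigma_X, 0.5 * B_boundary))   # > 0: strictly inside, P positive definite
print(min_eig_P(Sigma_X, B_boundary))         # ~ 0: on the boundary, P singular but PSD
print(min_eig_P(Sigma_X, 1.5 * B_boundary))   # < 0: outside, P not PSD
```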
