Partial Correlation – Understanding the Meaning and Applications

conditioning, correlation, partial-correlation

From Wikipedia

Formally, the partial correlation between $X$ and $Y$ given a set of $n$ controlling variables $Z = \{Z_1, Z_2, …, Z_n\}$, written $ρ_{XY·Z}$, is the correlation between the residuals $R_X$ and $R_Y$ resulting from the linear regression of $X$ with $Z$ and of $Y$ with $Z$, respectively.

  1. It says earlier that

    partial correlation measures the degree of association between two random variables, with the effect of a set of controlling random
    variables removed.

    I was wondering how the partial correlation $ρ_{XY·Z}$ is related to
    the correlation between $X$ and $Y$ conditional on $Z$?

  2. There is a special case for $n=1$.

    In fact, the first-order partial correlation (i.e. when $n=1$) is nothing else than a difference between a correlation and the product
    of the removable correlations divided by the product of the
    coefficients of alienation of the removable correlations. The
    coefficient of alienation, and its relation with joint variance
    through correlation are available in Guilford (1973, pp. 344–345).

    I was wondering how to write the above down mathematically?

Best Answer

Note that correlation conditional on $Z$ is a variable that depends on $Z$, whereas partial correlation is a single number.

Furthermore, partial correlation is defined through the residuals of linear regressions. Thus, if the actual relationship is nonlinear, the partial correlation may differ from the conditional correlation, even if the correlation conditional on $Z$ is a constant that does not depend on $Z$. On the other hand, if $X,Y,Z$ are jointly multivariate Gaussian, the partial correlation equals the conditional correlation.

For an example where constant conditional correlation $\neq$ partial correlation: $$Z\sim U(-1,1),~X=Z^2+e,~Y=Z^2-e,~e\sim N(0,1),~e\perp Z.$$ No matter which value $Z$ takes, the conditional correlation is $-1$. However, the linear regressions of $X$ on $Z$ and of $Y$ on $Z$ have zero slope, because $Z^2$ is uncorrelated with $Z$ when $Z$ is symmetric about 0. The fitted values are therefore constants, and the residuals are $X$ and $Y$ themselves up to a constant shift. Thus, the partial correlation equals the correlation between $X$ and $Y$, which does not equal $-1$, since the variables are clearly not perfectly correlated when $Z$ is not known.
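The example can be checked numerically. The following is only a minimal sketch in plain Python: the helper names (`residuals`, `correlation`, `partial_correlation`, `simulate`), the sample size, and the seed are my own choices for illustration, and the regression includes an intercept. Working out the moments by hand gives $\mathrm{Var}(Z^2)=4/45$, so the population correlation between $X$ and $Y$ should be $-41/49\approx-0.837$, and the simulated partial correlation should land close to that rather than at $-1$.

```python
import math
import random


def residuals(y, z):
    """Residuals of the least-squares line y ~ a + b*z (single regressor)."""
    n = len(y)
    mean_y = sum(y) / n
    mean_z = sum(z) / n
    s_zz = sum((zi - mean_z) ** 2 for zi in z)
    s_zy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    slope = s_zy / s_zz
    intercept = mean_y - slope * mean_z
    return [yi - (intercept + slope * zi) for yi, zi in zip(y, z)]


def correlation(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def partial_correlation(x, y, z):
    """Correlation between the residuals of x ~ z and y ~ z."""
    return correlation(residuals(x, z), residuals(y, z))


def simulate(n=100_000, seed=0):
    """Draw Z ~ U(-1, 1), e ~ N(0, 1), and set X = Z^2 + e, Y = Z^2 - e."""
    rng = random.Random(seed)
    z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    e = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [zi ** 2 + ei for zi, ei in zip(z, e)]
    y = [zi ** 2 - ei for zi, ei in zip(z, e)]
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate()
    print("partial correlation given Z:", partial_correlation(x, y, z))
    print("plain correlation of X and Y:", correlation(x, y))
    print("population value -41/49:     ", -41 / 49)
```

Conditional on any fixed value of $Z$, the pair $(X, Y)$ differs from a constant only through $(e, -e)$, which is where the conditional correlation of $-1$ comes from. The residual-based quantity cannot see this because a straight line in $Z$ removes none of the $Z^2$ term.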

Apparently, Baba and Sibuya (2005) show the equivalence of partial correlation and conditional correlation for some other distributions besides the multivariate Gaussian, but I have not read the paper.

The answer to your question 2 appears to be in the Wikipedia article: it is the second equation under "Using recursive formula".
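For convenience, the standard first-order formula that matches the verbal description in question 2 can be written as below. The notation is my assumption: $ρ_{XY}$, $ρ_{XZ}$ and $ρ_{ZY}$ denote the ordinary pairwise correlations, and $Z$ is a single controlling variable.

$$ρ_{XY·Z} = \frac{ρ_{XY} - ρ_{XZ}\,ρ_{ZY}}{\sqrt{1-ρ_{XZ}^2}\,\sqrt{1-ρ_{ZY}^2}}$$

Here $ρ_{XZ}\,ρ_{ZY}$ is the "product of the removable correlations", and each factor $\sqrt{1-ρ^2}$ in the denominator is a coefficient of alienation.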
