[Math] Correlation between 3 variables

data analysis, st.statistics

To measure the correlation between 2 variables, I use Pearson's formula.

What formula can I use to find the degree of correlation between 3 variables? My variables are not symmetric: the correlation in question is between the 1st variable and the pair formed by the other two. But I don't have a formula to combine the 2nd and 3rd into one variable. The variables take the values -1, 0, 1, if it matters.

Best Answer

Maybe you need the theory of cumulants, also called semi-invariants. For two random variables $X,Y$ the covariance (the second joint cumulant) is $v(X,Y)=E(XY)-E(X)E(Y)$, where $E$ denotes the expectation. Pearson's formula turns it into a dimensionless quantity $$r=\frac{v(X,Y)}{\sqrt{v(X,X)\,v(Y,Y)}},$$ i.e., $X$ and $Y$ might have units like centimeters, but $r$ is a pure number.

The third cumulant generalizes $v(X,Y)$ and measures a correlation of the three variables "altogether", i.e., one that does not result indirectly from their pairwise correlations. It is $$c(X,Y,Z)=E(XYZ)-E(X)E(YZ)-E(Y)E(XZ)-E(Z)E(XY)+2E(X)E(Y)E(Z).$$ However, I don't know what the natural or standard dimensionless analog of $r$ would be. A possibility is $$\frac{c(X,Y,Z)}{\sqrt{v(X,X)\,v(Y,Y)\,v(Z,Z)}}.$$

All this is about random variables, say discrete ones given by a finite sample $(x_i,y_i,z_i)$, $1\le i\le N$, with $E$ taken as the average over the sample. In statistical estimation, factors like $1/N$ may turn into $1/(N-1)$ in the correct formulas to use.
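As a rough illustration of the formulas in this answer, here is a minimal Python sketch that computes $v$, $c$ and the suggested normalized ratio from a finite sample. It assumes plain $1/N$ sample averages in place of $E$ (no $1/(N-1)$ correction), and the function names and the toy data are made up for this example; they are not part of any standard library.

```python
from math import sqrt


def mean(values):
    """Plain 1/N sample average, standing in for the expectation E."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """v(X,Y) = E(XY) - E(X)E(Y), estimated with 1/N averages."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)


def third_cumulant(xs, ys, zs):
    """c(X,Y,Z) as written in the answer, estimated with 1/N averages."""
    ex, ey, ez = mean(xs), mean(ys), mean(zs)
    exy = mean([x * y for x, y in zip(xs, ys)])
    exz = mean([x * z for x, z in zip(xs, zs)])
    eyz = mean([y * z for y, z in zip(ys, zs)])
    exyz = mean([x * y * z for x, y, z in zip(xs, ys, zs)])
    return exyz - ex * eyz - ey * exz - ez * exy + 2 * ex * ey * ez


def normalized_third_cumulant(xs, ys, zs):
    """c(X,Y,Z) / sqrt(v(X,X) v(Y,Y) v(Z,Z)), the ratio suggested above."""
    denom = sqrt(covariance(xs, xs) * covariance(ys, ys) * covariance(zs, zs))
    if denom == 0:
        raise ValueError("at least one variable is constant in the sample")
    return third_cumulant(xs, ys, zs) / denom


# Hypothetical toy sample with values in {-1, 1}: X is the product of Y and Z.
ys = [1, 1, -1, -1]
zs = [1, -1, 1, -1]
xs = [y * z for y, z in zip(ys, zs)]

pairwise = covariance(xs, ys)                   # covariance of X with Y alone
joint = normalized_third_cumulant(xs, ys, zs)   # three-way measure
```

In this toy sample every pairwise covariance is 0, so Pearson's $r$ sees nothing between $X$ and $Y$ or between $X$ and $Z$, while the normalized third cumulant equals 1. That is the kind of dependence of the 1st variable on the pair of the other two that the pairwise formula misses.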
