How do you measure the 'explained variance' of arbitrary linear embeddings?

Question Elaboration:
When I say 'linear embeddings' I mean a lower-dimensional representation of variables resulting from an arbitrary linear transformation. And when I say 'explained variance' I mean in the sense that 'explained variance' is measured for PCA's principal components.

Context of Confusion:
I realized I'm confused about exactly what this means: on the surface, the idea of 'explained variance' from lower-dimensional encodings makes sense, but I've come to wonder whether, rigorously speaking, it is possible at all for anything but a model to 'explain variance' of the original variables (i.e. an R^2 measure)…

Attempt 1 to draw connection to PCA:
For example, this website says that the total variance for PCA is the sum of the individual variances of the PCA variables, and that the explained variance for each PC is the portion of that total variance present in that PC. I don't see how this could generalize to arbitrary lower-dimensional representations, because you could just scale some of the lower-dimensional variables arbitrarily and change which ones represent more of the variance, no?

Attempt 2 to draw connection to PCA:
I've also seen on numerous websites (e.g. here) that the explained variance for a PCA variable is the corresponding eigenvalue, but I'm not sure what the analogue for arbitrary lower-dimensional linear representations would be…

Best Answer

I think you are right that the notion of "explained variance" does not make much sense without some kind of inverse mapping from the embedding back into the original space. I suggest fitting a linear regression model $y_i\sim \beta x_i$, where $y_i$ are the original points and $x_i$ are the corresponding lower-dimensional embedding points. Then you can compare the variance of the predictions $\beta x_i$ to the variance of $y_i$; the ratio of the two is the fraction of variance the embedding explains.
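As a rough illustration of the regression idea, here is a minimal NumPy sketch. The helper name `explained_variance_ratio`, the synthetic data, and the random map `A` are all made up for this example; points are stored as rows, so the fit is written as `X @ beta ≈ Y` rather than $\beta x_i$. It also rescales the embedding coordinates to show that, unlike summing the variances of the embedded variables, this measure should not change under such rescaling.

```python
import numpy as np

def explained_variance_ratio(Y, X):
    """Share of the total variance of Y recovered by a least-squares
    linear map from the embedding X (hypothetical helper, rows = points)."""
    Yc = Y - Y.mean(axis=0)          # center the original variables
    Xc = X - X.mean(axis=0)          # center the embedding
    # Solve Xc @ beta ~= Yc in the least-squares sense
    beta, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    Y_hat = Xc @ beta                # predictions (zero-mean by construction)
    return (Y_hat ** 2).sum() / (Yc ** 2).sum()

rng = np.random.default_rng(0)
Y = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))  # correlated 5-D data
A = rng.normal(size=(5, 2))                               # arbitrary linear map to 2-D
X = Y @ A                                                 # the linear embedding

ratio = explained_variance_ratio(Y, X)
# Rescale each embedding coordinate by a different factor
ratio_rescaled = explained_variance_ratio(Y, X * np.array([10.0, 0.1]))

print(ratio, ratio_rescaled)
```

The two printed values should agree up to floating-point error, because rescaling the columns of `X` does not change the subspace the regression can reach.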

Note that if the points $x_i$ are obtained through PCA, then $\beta$ ends up being the transpose of the projection map, and you recover the normal formulas for variance explained.
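The PCA case can be checked numerically. The sketch below is only an illustration on synthetic data, with variable names chosen for this example: it builds a PCA embedding from the eigendecomposition of the sample covariance, fits the regression back to the original space, and compares the result with the eigenvalue-based formula. Because points are stored as rows here, the fitted coefficient matrix is compared against `W.T`, where the columns of `W` are the top principal directions.

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.normal(size=(400, 4)) @ rng.normal(size=(4, 4))  # synthetic correlated data
Yc = Y - Y.mean(axis=0)                                   # center the data

# PCA via eigendecomposition of the sample covariance matrix
cov = Yc.T @ Yc / (len(Yc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]         # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2
W = eigvecs[:, :k]        # top-k principal directions as columns
X = Yc @ W                # PCA embedding of each point

# Regress the original (centered) points on the embedding: X @ beta ~= Yc
beta, *_ = np.linalg.lstsq(X, Yc, rcond=None)
Y_hat = X @ beta

ratio_regression = (Y_hat ** 2).sum() / (Yc ** 2).sum()
ratio_eigen = eigvals[:k].sum() / eigvals.sum()

print(np.allclose(beta, W.T), ratio_regression, ratio_eigen)
```

Under these assumptions the regression coefficients coincide with the principal directions, and the regression-based ratio matches the sum of the top `k` eigenvalues divided by the sum of all eigenvalues.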
