Solved – Why is the shared variance negative?

Tags: multiple regression, regression, regression coefficients, variance

I have two questions regarding standard multiple regression:

  1. Why is my shared variance a negative number?

  2. Should I only include the positive semipartial correlations when calculating uniquely explained variance?

I am trying to calculate the amount of shared variance explained in a regression model with four predictor variables, and this number is coming out negative (-.48).

According to Tabachnick & Fidell (2001), uniquely explained variance is computed by adding up the squared semipartial correlations. Shared variance is computed by subtracting the uniquely explained variance from the R square.
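In symbols, with $sr_i$ denoting the $i$-th semipartial (Part) correlation, that rule is:

$$\text{unique} = \sum_{i} sr_i^{2}, \qquad \text{shared} = R^{2} - \sum_{i} sr_i^{2}$$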

My R square value = .325 (R = .570, Adj R square = .295, fwiw)

F(4, 91) = 10.941, p = .0005.

The semipartial correlation values are (significant predictors indicated by *, from the ‘Part’ column in SPSS output):

.172*
-.174*
.465*
.164

I calculated that, collectively, the predictors uniquely explained 80% of the variance (.801, which is the sum of the POSITIVE semipartial correlation coefficients).

When calculating the shared variance, the figure comes out at -.48 (computed by subtracting the uniquely explained variance from the R square value: .32 - .80 = -.48). I’m not sure whether it is even possible to have a negative value for shared variance; where have I gone wrong (if I have)?
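For concreteness, here is a minimal Python sketch of both ways of combining the Part values (the unsquared positive-only sum I used, and the squared sum from the rule I quoted above); the variable names are just mine:

    # Part (semipartial) correlations from the SPSS output above
    sr = [0.172, -0.174, 0.465, 0.164]
    r_squared = 0.325

    # Summing only the positive values, unsquared (what I did)
    unique_unsquared = sum(v for v in sr if v > 0)
    print(r_squared - unique_unsquared)   # -0.476, i.e. the -.48 above

    # Summing the SQUARED values, as the quoted rule says (sign drops out)
    unique_squared = sum(v ** 2 for v in sr)
    print(r_squared - unique_squared)     # ~0.022, a small positive number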

Any advice would be gratefully received.

Many thanks.

Best Answer

This is why it's so much more intuitive to think in terms of regression coefficients, which have units, instead of abstract and confusing partial correlations. Luckily, inference as to whether these effects are 0 is equivalent for the two, and they always agree in sign: positive associations go with positive regression coefficients and positive partial correlations, and likewise for negative ones. A negative coefficient for a predictor in a multiple regression model is frequently observed even when that predictor has a positive marginal association with the outcome of interest.
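A minimal simulation sketch of that last point (the setup and numbers are invented purely for illustration, not taken from the question): a predictor can correlate positively with the outcome yet receive a negative coefficient once a correlated predictor enters the model.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # Two strongly correlated predictors (illustrative values only)
    x1 = rng.normal(size=n)
    x2 = 0.9 * x1 + np.sqrt(1 - 0.9 ** 2) * rng.normal(size=n)

    # x2's coefficient in the generating model is negative by construction...
    y = 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

    # ...yet its marginal association with y is positive
    print(np.corrcoef(x2, y)[0, 1])  # ~ +0.5 (positive)

    # Fitting y on both predictors recovers the negative coefficient
    X = np.column_stack([np.ones(n), x1, x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(beta)  # ~ [0, 2, -1]: negative coefficient on x2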

By "shared variance" I assume you mean "covariance". Two random variables, X and Y, have a negative covariance when the conditional mean function is negative, i.e. a unit difference in X is associated with a decrease in Y. This type of interpretation is much more cogent for statistical inference than the interpretations of "variance explained": even when modeling assumptions are met, and even when experimental conditions are controlled, such interpretations are not remotely correct.