I have 2 dependent variables (DVs) each of whose score may be influenced by the set of 7 independent variables (IVs). DVs are continuous, while the set of IVs consists of a mix of continuous and binary coded variables. (In code below continuous variables are written in upper case letters and binary variables in lower case letters.)
The aim of the study is to uncover how these DVs are influenced by the IVs. I proposed the following multivariate multiple regression (MMR) model:
my.model <- lm(cbind(A, B) ~ c + d + e + f + g + H + I)
To interpret the results I call two statements:
summary(manova(my.model))
Manova(my.model)
Outputs from both calls are pasted below and are significantly different. Can somebody please explain which statement among the two should be picked to properly summarize the results of MMR, and why? Any suggestion would be greatly appreciated.
Output using the summary(manova(my.model)) statement:
> summary(manova(my.model))
           Df   Pillai approx F num Df den Df    Pr(>F)
c           1 0.105295   5.8255      2     99  0.004057 **
d           1 0.085131   4.6061      2     99  0.012225 *
e           1 0.007886   0.3935      2     99  0.675773
f           1 0.036121   1.8550      2     99  0.161854
g           1 0.002103   0.1043      2     99  0.901049
H           1 0.228766  14.6828      2     99 2.605e-06 ***
I           1 0.011752   0.5887      2     99  0.556999
Residuals 100
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Output using the Manova(my.model) statement:
> library(car)
> Manova(my.model)
Type II MANOVA Tests: Pillai test statistic
  Df test stat approx F num Df den Df    Pr(>F)
c  1  0.030928   1.5798      2     99   0.21117
d  1  0.079422   4.2706      2     99   0.01663 *
e  1  0.003067   0.1523      2     99   0.85893
f  1  0.029812   1.5210      2     99   0.22355
g  1  0.004331   0.2153      2     99   0.80668
H  1  0.229303  14.7276      2     99 2.516e-06 ***
I  1  0.011752   0.5887      2     99   0.55700
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Best Answer
Briefly stated, this is because base R's manova(lm()) uses sequential model comparisons for so-called Type I sums of squares, whereas car's Manova() by default uses model comparisons for Type II sums of squares.
I assume you're familiar with the model-comparison approach to ANOVA or regression analysis. This approach defines these tests by comparing a restricted model (corresponding to a null hypothesis) to an unrestricted model (corresponding to the alternative hypothesis). If you're not familiar with this idea, I recommend Maxwell & Delaney's excellent "Designing experiments and analyzing data" (2004).
For type I SS, the restricted model in a regression analysis for your first predictor c is the null model which only uses the absolute term: lm(Y ~ 1), where Y in your case would be the multivariate DV defined by cbind(A, B). The unrestricted model then adds predictor c, i.e., lm(Y ~ c + 1).
For type II SS, the unrestricted model in a regression analysis for your first predictor c is the full model which includes all predictors except for their interactions, i.e., lm(Y ~ c + d + e + f + g + H + I). The restricted model removes predictor c from the unrestricted model, i.e., lm(Y ~ d + e + f + g + H + I).
Since the two functions rely on different model comparisons, they lead to different results. The question of which one is preferable is hard to answer; it really depends on your hypotheses.
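To make the order dependence of type I SS concrete, here is a minimal sketch. It uses simulated stand-in data with only two predictors (the data frame dat, the seed, and all coefficients are my assumptions, not your data). The same predictor c is tested once as the first term and once as the last term; in the second case the comparison coincides with the type II comparison for c.

```r
# Simulated stand-in data (assumption: NOT the data from the question).
set.seed(123)
n   <- 60
dat <- data.frame(c = rbinom(n, 1, 0.5))          # binary predictor
dat$H <- 0.8 * dat$c + rnorm(n)                   # continuous predictor, correlated with c
dat$A <-  0.5 * dat$c + 0.7 * dat$H + rnorm(n)    # first DV
dat$B <- -0.3 * dat$c + 0.4 * dat$H + rnorm(n)    # second DV

fit_cH <- manova(cbind(A, B) ~ c + H, data = dat) # c entered first
fit_Hc <- manova(cbind(A, B) ~ H + c, data = dat) # c entered last

# Type I with c first: restricted model is Y ~ 1, unrestricted is Y ~ c
pillai_c_first <- summary(fit_cH)$stats["c", "Pillai"]

# Type I with c last: restricted is Y ~ H, unrestricted is Y ~ H + c,
# which is exactly the type II comparison for c in this two-predictor model
pillai_c_last <- summary(fit_Hc)$stats["c", "Pillai"]

print(c(c_first = pillai_c_first, c_last = pillai_c_last))
```

This is also why the last predictor, I, has identical results in both of your outputs: for the last term the sequential comparison and the type II comparison are the same pair of models.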
What follows assumes you're familiar with how multivariate test statistics like the Pillai-Bartlett trace are calculated based on the null model, the full model, and the pair of restricted-unrestricted models. For brevity, I only consider predictors c and H, and only test for c.
For comparison, the result from car's Manova() function using SS type II.
Now manually verify both results. Build the design matrix $X$ first and compare to R's design matrix.
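A minimal sketch of this setup, assuming simulated stand-in data (the data frame dat, the seed, and the coefficients are hypothetical and not your data). It prints the type I result from base R, the type II result from car if that package happens to be installed, and then checks a hand-built design matrix against the one R constructs.

```r
# Simulated stand-in data (assumption: NOT the data from the question).
set.seed(123)
n   <- 60
dat <- data.frame(c = rbinom(n, 1, 0.5))          # binary predictor
dat$H <- 0.8 * dat$c + rnorm(n)                   # continuous predictor
dat$A <-  0.5 * dat$c + 0.7 * dat$H + rnorm(n)    # first DV
dat$B <- -0.3 * dat$c + 0.4 * dat$H + rnorm(n)    # second DV

fit <- lm(cbind(A, B) ~ c + H, data = dat)

# Type I (sequential) tests from base R
print(summary(manova(cbind(A, B) ~ c + H, data = dat), test = "Pillai"))

# Type II tests from car, only if the package is available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Manova(fit, type = "II"))
}

# Hand-built design matrix: intercept column, then c, then H
X  <- cbind(1, dat$c, dat$H)
XR <- model.matrix(fit)                           # R's design matrix

# TRUE if both agree (dimnames and the "assign" attribute are ignored)
print(all.equal(X, XR, check.attributes = FALSE))
```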
Now define the orthogonal projection for the full model ($P_{f} = X (X'X)^{-1} X'$, using all predictors). This gives us the matrix $W = Y' (I-P_{f}) Y$.
Restricted and unrestricted models for SS type I plus their projections $P_{rI}$ and $P_{uI}$, leading to matrix $B_{I} = Y' (P_{uI} - P_{rI}) Y$.
Restricted and unrestricted models for SS type II plus their projections $P_{rII}$ and $P_{uII}$, leading to matrix $B_{II} = Y' (P_{uII} - P_{rII}) Y$.
Pillai-Bartlett trace for both types of SS: trace of $(B + W)^{-1} B$.
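The steps above can be sketched as follows, again on simulated stand-in data (dat, the seed, and the helper names hat_matrix() and pillai() are my own assumptions for illustration). The code mirrors the formulas literally, then checks the two manual Pillai-Bartlett traces against what summary(manova()) reports when c is entered first (type I) and last (equivalent to type II for c).

```r
# Simulated stand-in data (assumption: NOT the data from the question).
set.seed(123)
n   <- 60
dat <- data.frame(c = rbinom(n, 1, 0.5))
dat$H <- 0.8 * dat$c + rnorm(n)
dat$A <-  0.5 * dat$c + 0.7 * dat$H + rnorm(n)
dat$B <- -0.3 * dat$c + 0.4 * dat$H + rnorm(n)

Y <- cbind(dat$A, dat$B)                  # multivariate DV
X <- cbind(1, dat$c, dat$H)               # columns: intercept, c, H

# Orthogonal projection onto the column space of M (textbook formula only)
hat_matrix <- function(M) M %*% solve(t(M) %*% M) %*% t(M)

# Full model: W = Y' (I - P_f) Y
Pf <- hat_matrix(X)
W  <- t(Y) %*% (diag(n) - Pf) %*% Y

# Type I test of c: restricted = intercept only, unrestricted = intercept + c
PrI <- hat_matrix(X[, 1, drop = FALSE])
PuI <- hat_matrix(X[, 1:2])
B_I <- t(Y) %*% (PuI - PrI) %*% Y

# Type II test of c: restricted = intercept + H, unrestricted = full model
PrII <- hat_matrix(X[, c(1, 3)])
PuII <- Pf
B_II <- t(Y) %*% (PuII - PrII) %*% Y

# Pillai-Bartlett trace: trace of (B + W)^{-1} B
pillai <- function(B, W) sum(diag(solve(B + W) %*% B))
pillai_I  <- pillai(B_I,  W)
pillai_II <- pillai(B_II, W)

# Reference values from base R
ref_I  <- summary(manova(cbind(A, B) ~ c + H, data = dat))$stats["c", "Pillai"]
ref_II <- summary(manova(cbind(A, B) ~ H + c, data = dat))$stats["c", "Pillai"]

print(all.equal(pillai_I,  ref_I))        # manual type I vs. manova()
print(all.equal(pillai_II, ref_II))       # manual type II vs. c entered last
```

Note that both tests share the same $W$ from the full model; only the matrix $B$ changes with the type of SS.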
Note that the calculations for the orthogonal projections mimic the mathematical formula, but are a bad idea numerically. One should really use QR-decompositions or SVD in combination with crossprod() instead.
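As a rough sketch of what that could look like, the following uses qr.resid() and crossprod() so that neither an explicit inverse nor an n x n projection matrix is ever formed. The data are simulated stand-ins and all object names are my own assumptions; only the type II test of c is shown. It relies on the identity $Y'(P_{u} - P_{r})Y = Y'(I - P_{r})Y - Y'(I - P_{u})Y$.

```r
# Simulated stand-in data (assumption: NOT the data from the question).
set.seed(123)
n   <- 60
dat <- data.frame(c = rbinom(n, 1, 0.5))
dat$H <- 0.8 * dat$c + rnorm(n)
dat$A <-  0.5 * dat$c + 0.7 * dat$H + rnorm(n)
dat$B <- -0.3 * dat$c + 0.4 * dat$H + rnorm(n)

Y <- cbind(dat$A, dat$B)
X <- cbind(1, dat$c, dat$H)               # columns: intercept, c, H

# Residual SSCP of the full model: crossprod of QR residuals = Y' (I - P_f) Y
W_qr <- crossprod(qr.resid(qr(X), Y))

# Residual SSCP of the restricted model (intercept + H)
E_r <- crossprod(qr.resid(qr(X[, c(1, 3)]), Y))

# Hypothesis SSCP for c (type II) as a difference of residual SSCPs
B_II_qr <- E_r - W_qr

# Pillai-Bartlett trace, solving the linear system instead of inverting
pillai_II_qr <- sum(diag(solve(B_II_qr + W_qr, B_II_qr)))

# Cross-check against the literal projection formula (for comparison only)
hat_matrix <- function(M) M %*% solve(t(M) %*% M) %*% t(M)
W_naive    <- t(Y) %*% (diag(n) - hat_matrix(X)) %*% Y
B_II_naive <- t(Y) %*% (hat_matrix(X) - hat_matrix(X[, c(1, 3)])) %*% Y
pillai_II_naive <- sum(diag(solve(B_II_naive + W_naive) %*% B_II_naive))

print(all.equal(pillai_II_qr, pillai_II_naive))
```

For a well-conditioned toy design like this one both routes agree; the QR route matters once predictors are nearly collinear or n is large.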