Solved – Different F-ratios for within subjects effects when using SPSS and R’s aov

anova, r, repeated-measures, spss

I've just compared the ANOVA tables generated by SPSS and Statistica with the aov table provided by summary(aov.model). They yield identical between-subjects effects (e.g., NativeLanguage: English vs. Other) but different F-ratios for within-subjects effects (e.g., Class: animate vs. inanimate), with the aov F-ratios being consistently smaller and more conservative. The within-between interaction again yields identical terms. I am stumped. How can this be? Any suggestions? Here are more details:

I was analyzing the RTs in the lexdec data set from the languageR package. I made some minor data adjustments: to get a sense of the RTs I reversed the log transform with exp(lexdec$RT), then removed error responses and RT outliers. Using ddply I obtained condition means for NativeLanguage and Class for each subject (the data frame is shown at the bottom of my post). Analyzing these data, I obtained different aov and STATISTICA summaries. Specifically, the within-subjects factor Class gave p~.18 with aov and p~.06 with STATISTICA and SPSS, with a larger Class SS shown by the two commercial packages (569) than by aov (301).

I've tried to make the two anova outputs (shown below) look transparent but it appears that the posting format does not match the format shown in the question window.

> C1.anova <- aov(RT ~ (Class * NativeLanguage)
+  + Error(Subject/Class) + (NativeLanguage), data=C1 )
> summary(C1.anova)

Error: Subject
               Df Sum Sq Mean Sq F value  Pr(>F)  
NativeLanguage  1  81413   81413  6.2973 0.02131 *
Residuals      19 245637   12928                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Error: Subject:Class
                     Df  Sum Sq Mean Sq F value   Pr(>F)   
Class                 1  301.51  301.51  1.9661 0.176994   
Class:NativeLanguage  1 2175.86 2175.86 14.1880 0.001305 **
Residuals            19 2913.83  153.36                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

STATISTICA

                            SS  DF        MS         F          p
Intercept             15387240   1  15387240  1190.202   0.000000
NativeLanguage           81413   1     81413     6.297   0.021311
Error                   245637  19     12928
Class                      569   1       569     3.709   0.069217
Class*NativeLanguage      2176   1      2176    14.188   0.001305
Error                     2914  19       153

Data:

   Subject  Class NativeLanguage       RT
1       A1 animal        English 557.6410
2       A1  plant        English 548.4687
3       A2 animal        English 533.4737
4       A2  plant        English 511.7941
5       A3 animal          Other 598.9545
6       A3  plant          Other 602.4118
7        C animal        English 562.8864
8        C  plant        English 560.0588
9        D animal          Other 630.1464
10       D  plant          Other 604.0286
11       I animal          Other 542.1219
12       I  plant          Other 533.1666
13       J animal          Other 565.4324
14       J  plant          Other 513.2333
15       K animal        English 492.4500
16       K  plant        English 517.7333
17      M1 animal        English 481.8372
18      M1  plant        English 497.9687
19      M2 animal          Other 671.6666
20      M2  plant          Other 655.8750
21       P animal          Other 640.7209
22       P  plant          Other 610.0286
23      R1 animal        English 552.9744
24      R1  plant        English 545.4242
25      R2 animal        English 636.8864
26      R2  plant        English 675.1714
27      R3 animal        English 607.8572
28      R3  plant        English 614.9428
29       S animal        English 599.9285
30       S  plant        English 586.6286
31      T1 animal        English 580.0500
32      T1  plant        English 583.2857
33      T2 animal          Other 892.5526
34      T2  plant          Other 862.1000
35       V animal          Other 736.2619
36       V  plant          Other 718.3529
37      W1 animal        English 517.0465
38      W1  plant        English 539.2727
39      W2 animal        English 639.1363
40      W2  plant        English 666.7143
41       Z animal          Other 725.3750
42       Z  plant          Other 706.2069

Best Answer

This may be because your between-groups variable, NativeLanguage, is unbalanced (12 English, 9 Other), in which case the type of Sums-of-Squares employed is going to affect the F values. By default, aov() uses Type 1 sums of squares, which isn't recommended with unbalanced designs. Instead, use the ezANOVA() function from the ez package:

library(ez)  # provides ezANOVA()

my_anova = ezANOVA(
    data = C1
    , dv = .(RT)
    , wid = .(Subject)
    , within = .(Class)
    , between = .(NativeLanguage)
    , type = 3 #SPSS uses type 3 Sums-of-Squares
    , observed = .(NativeLanguage) #ensures appropriate effect size is computed
)
#note warning about data imbalance
print(my_anova)

This yields the results table:

$ANOVA
                Effect DFn DFd         F           p p<.05          ges
2       NativeLanguage   1  19  6.297322 0.021311506     * 0.2451177279
3                Class   1  19  1.976662 0.175885891       0.0009118545
4 NativeLanguage:Class   1  19 14.187926 0.001305329     * 0.0065510116

Strangely, this still reports a different Class effect than what you're getting from SPSS/Statistica. Adding detailed=TRUE to the ezANOVA() call above gives a slightly more detailed results table (including the intercept and sums of squares):

$ANOVA
                Effect DFn DFd          SSn        SSd          F            p p<.05          ges
1          (Intercept)   1  19 7717585.5514 245636.967 596.954632 8.136647e-16     * 0.9587389575
2       NativeLanguage   1  19   81413.4195 245636.967   6.297322 2.131151e-02     * 0.2451177279
3                Class   1  19     303.1399   2913.831   1.976662 1.758859e-01       0.0009118545
4 NativeLanguage:Class   1  19    2175.8535   2913.831  14.187926 1.305329e-03     * 0.0065510116

This shows that the mismatch lies in the SSn for the class effect; ezANOVA (which uses car::Anova()) obtains an SSn of 303ish whereas SPSS/Statistica obtain an SSn of 569.
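To see where the three Class sums of squares might come from, here is a minimal base-R sketch that recomputes them by hand from the per-subject means listed in the question. It relies on the fact that, with a two-level within-subjects factor, each within-subjects term reduces to a test on the per-subject difference scores. The three hypotheses labelled (a) to (c) are my own reading of what each table could be testing, not something any of the packages' output states.

```r
# Per-subject condition means copied from the question, in subject order
# A1, A2, A3, C, D, I, J, K, M1, M2, P, R1, R2, R3, S, T1, T2, V, W1, W2, Z.
lang <- c("English", "English", "Other", "English", "Other", "Other", "Other",
          "English", "English", "Other", "Other", "English", "English",
          "English", "English", "English", "Other", "Other", "English",
          "English", "Other")
animal <- c(557.6410, 533.4737, 598.9545, 562.8864, 630.1464, 542.1219,
            565.4324, 492.4500, 481.8372, 671.6666, 640.7209, 552.9744,
            636.8864, 607.8572, 599.9285, 580.0500, 892.5526, 736.2619,
            517.0465, 639.1363, 725.3750)
plant <- c(548.4687, 511.7941, 602.4118, 560.0588, 604.0286, 533.1666,
           513.2333, 517.7333, 497.9687, 655.8750, 610.0286, 545.4242,
           675.1714, 614.9428, 586.6286, 583.2857, 862.1000, 718.3529,
           539.2727, 666.7143, 706.2069)

d <- animal - plant            # per-subject Class effect
n <- tapply(d, lang, length)   # group sizes (12 English, 9 Other)
m <- tapply(d, lang, mean)     # group means of the difference scores

# Each SS is a contrast SS on the differences, divided by 2 to return to the
# scale of the original (two-row-per-subject) data.

# (a) Weighted: is the mean difference over all 21 subjects zero?
ss_weighted <- length(d) * mean(d)^2 / 2

# (b) Unweighted: is the simple average of the two group means zero?
ss_unweighted <- mean(m)^2 / sum(0.25 / n) / 2

# (c) Reference group only: is the mean difference among English speakers zero?
ss_reference <- as.numeric(n["English"] * m["English"]^2 / 2)

# Class-by-NativeLanguage interaction: do the two group means differ?
ss_interaction <- as.numeric((m["English"] - m["Other"])^2 / sum(1 / n) / 2)

print(round(c(weighted = ss_weighted, unweighted = ss_unweighted,
              reference = ss_reference, interaction = ss_interaction), 2))
```

On these data the three Class figures come out at roughly 301.5, 568.8 and 303.1, which line up with the aov, SPSS/Statistica and ezANOVA tables respectively, while the interaction is about 2175.9 in every case. That pattern would be consistent with aov testing the weighted mean, SPSS/Statistica testing the unweighted marginal mean, and the ezANOVA call testing Class within the reference level of NativeLanguage (as a Type 3 test under treatment contrasts would), though the numbers alone cannot confirm how each package arrived at its result.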

Related Solutions

Solved – F-test differences Stata and R

Update: I've cross-posted the question at statalist.org and got an answer there: http://www.statalist.org/forums/forum/general-stata-discussion/general/1348073-f-test-differences-stata-and-r

Basically, the problem was that the contrast settings differed between R and Stata, and I did not compute the same SS type.

By default Stata computes Type 3 SS, but I had specified Type 2 SS in R. When computing Type 3 SS in R, however, you should NOT use the default contrasts (contr.treatment); use an orthogonal contrast (like contr.sum) instead. See this link:
http://www.mail-archive.com/r-help@s.../msg69781.html

Thus, when I ran the ANOVA with Type 3 SS and contr.sum, I got the same output as in Stata, where I hadn't specified anything.
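The dependence of Type 3 SS on the contrast coding can be illustrated with a small base-R sketch. The data frame `toy` and the helper `type3()` are hypothetical and invented for this illustration; they are not from the original analysis. The sketch uses `drop1()` with the scope `. ~ .`, which drops each term's columns from the model matrix in turn, so it yields Type 3 style tests without needing the car package.

```r
# Hypothetical unbalanced 2 x 2 between-subjects data (cell sizes 3, 2, 2, 3).
toy <- data.frame(
  A = factor(rep(c("a1", "a2"), each = 5)),
  B = factor(c("b1", "b1", "b1", "b2", "b2",
               "b1", "b1", "b2", "b2", "b2")),
  y = c(1, 2, 3, 4, 6,
        5, 7, 6, 8, 10)
)

# Fit the full factorial model under a given coding and request marginal
# ("drop each term, keep all others") tests for every term.
type3 <- function(contr) {
  fit <- lm(y ~ A * B, data = toy, contrasts = list(A = contr, B = contr))
  drop1(fit, scope = . ~ ., test = "F")
}

t3_treatment <- type3("contr.treatment")  # R's default coding for factors
t3_sum       <- type3("contr.sum")        # sum-to-zero coding

# Main-effect SS change with the coding; the highest-order interaction does not.
print(cbind(treatment = t3_treatment[c("A", "B", "A:B"), "Sum of Sq"],
            sum       = t3_sum[c("A", "B", "A:B"), "Sum of Sq"]))
```

Under treatment coding the "main effect" of A is really the effect of A at the reference level of B, whereas under sum-to-zero coding it is the effect of A averaged over the levels of B, which is what Type 3 tests are usually meant to report.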

Solved – Different results obtained with lmer() and aov() for three-way repeated-measures experiment

Using Jake Westfall's EMS function, I estimated the variance components from the mixed model. The random-effect variances that lmer estimated to be 0 are indeed negative when derived from the expected mean squares (EMS).

cbind(rev(ans[c(grep("e",names(ans)),grep("s",names(ans)))])/
    c(1,2,2,2,4,4,4,5,1))


s        69.838600
a:s       7.711213
b:s      -1.423696
c:s      -1.098723
a:b:s    -1.498620
a:c:s    -1.212590
b:c:s    -1.244919
a:b:c:s  -2.650683
e       922.447967

When running lmer.4 on the aggregated data set, lmer gives results identical to aov (except for the 3-way interaction, which is still very similar).

> lmer.4 <- lmer(rating ~ timepoint_n*att_cond_n*gng_cond_n +(1|subjectID) + (0 + timepoint_n | subjectID) +
                                  (0 + att_cond_n | subjectID) + (0+gng_cond_n |subjectID) +  (0+timepoint_n:att_cond_n|subjectID) + 
                                  (0+timepoint_n:gng_cond_n|subjectID) + (0 + att_cond_n:gng_cond_n|subjectID), 
                                 data = d_aggr, control=lmerControl(optCtrl=list(maxfun=1e9)))
> summary(lmer.4)
Linear mixed model fit by REML 
t-tests use  Satterthwaite approximations to degrees of freedom ['lmerMod']
Formula: rating ~ timepoint_n * att_cond_n * gng_cond_n + (1 | subjectID) +  
  (0 + timepoint_n | subjectID) + (0 + att_cond_n | subjectID) +  
  (0 + gng_cond_n | subjectID) + (0 + timepoint_n:att_cond_n |  
                                    subjectID) + (0 + timepoint_n:gng_cond_n | subjectID) + (0 +      att_cond_n:gng_cond_n | subjectID)
Data: d_aggr
Control: lmerControl(optCtrl = list(maxfun = 1e+09))

REML criterion at convergence: 2438.7

Scaled residuals: 
  Min      1Q  Median      3Q     Max 
-4.0726 -0.1788 -0.0138  0.2457  4.3407 

Random effects:
  Groups      Name                   Variance  Std.Dev. 
subjectID   (Intercept)            2.850e+02 1.688e+01
subjectID.1 timepoint_n            3.651e+01 6.042e+00
subjectID.2 att_cond_n             2.257e-11 4.750e-06
subjectID.3 gng_cond_n             1.269e+00 1.126e+00
subjectID.4 timepoint_n:att_cond_n 0.000e+00 0.000e+00
subjectID.5 timepoint_n:gng_cond_n 8.132e-01 9.018e-01
subjectID.6 att_cond_n:gng_cond_n  6.839e-01 8.270e-01
Residual                           4.694e+01 6.851e+00
Number of obs: 328, groups:  subjectID, 41

Fixed effects:
                                   Estimate   Std. Error      df  t value Pr(>|t|)    
(Intercept)                       140.88293    2.66360  40.00000  52.892  < 2e-16 ***
timepoint_n                        -6.92561    1.01664  40.00000  -6.812 3.43e-08 ***
att_cond_n                         -0.13354    0.37828 120.00000  -0.353  0.72470    
gng_cond_n                          1.35854    0.41718  40.00000   3.256  0.00230 ** 
timepoint_n:att_cond_n             -0.02744    0.37828 120.00000  -0.073  0.94230    
timepoint_n:gng_cond_n              1.36220    0.40365  40.00000   3.375  0.00165 ** 
att_cond_n:gng_cond_n               1.00549    0.39972  40.00000   2.515  0.01601 *  
timepoint_n:att_cond_n:gng_cond_n   0.98963    0.37828 120.00000   2.616  0.01004 *  
  ---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
(Intr) tmpnt_ att_c_ gng_c_ tmpnt_n:t__ tmpnt_n:g__ a__:__
timepoint_n 0.000                                                     
att_cond_n  0.000  0.000                                              
gng_cond_n  0.000  0.000  0.000                                       
tmpnt_n:t__ 0.000  0.000  0.000  0.000                                
tmpnt_n:g__ 0.000  0.000  0.000  0.000  0.000                         
att_cnd_:__ 0.000  0.000  0.000  0.000  0.000       0.000             
tmpn_:__:__ 0.000  0.000  0.000  0.000  0.000       0.000       0.000

Therefore, it seems the results differ because lmer cannot reproduce the aov fit: it is not able (or allowed) to estimate negative variance components. The difference disappears when the lmer representation of the model is run on the aggregated data, which eliminates the negatively estimated variance components.
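How a mean-squares approach can imply a negative variance component is easiest to see in a toy one-way case. The sketch below is only an illustration with made-up numbers (the `toy` data frame is hypothetical), not the three-way model above: when subjects' means vary less than the within-subject noise would predict, the method-of-moments estimate of the subject variance falls below zero, whereas a likelihood-based fit such as lmer's is constrained to stop at zero.

```r
# Hypothetical data: three subjects, two observations each. The subject means
# (3, 3, 3.5) are nearly identical, but scores vary a lot within subjects.
toy <- data.frame(
  subj = factor(rep(c("s1", "s2", "s3"), each = 2)),
  y    = c(1, 5, 2, 4, 3, 4)
)

# One-way ANOVA table: between-subject and within-subject mean squares.
tab <- anova(lm(y ~ subj, data = toy))
ms_between <- tab["subj", "Mean Sq"]
ms_within  <- tab["Residuals", "Mean Sq"]

# Expected mean squares for a balanced one-way random-effects design:
#   E(MS_between) = sigma2_e + k * sigma2_subj,   E(MS_within) = sigma2_e
# so the method-of-moments estimate of sigma2_subj is:
k <- 2  # observations per subject
var_subj <- (ms_between - ms_within) / k

print(c(ms_between = ms_between, ms_within = ms_within, var_subj = var_subj))
```

Here MS_between is smaller than MS_within, so the estimated subject variance is negative; an F-test built on these mean squares uses that information implicitly, while a model that bounds the variance at zero cannot.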
