Solved – F-test differences Stata and R

anova, f-statistic, r, stata

I have a question about how Stata and R differ in computing ANOVAs. I ran exactly the same ANOVA in both programs but, curiously, got a different F-statistic for one of the predictors. I'm not too familiar with Stata, but as far as I understand it, I am computing a Type II SS ANOVA in both.

To make the output easier to follow, here is my model:
The outcome is a continuous variable called vertrauen (= trust).
Predictor 1 is a 2-level factor called trustee in R and Goodguy in Stata.
Predictor 2 is also a 2-level factor, called Group in R and uw in Stata.

This is the R output:

 
> m2 <- lm(vertrauen~trustee*Group,data=RTG.UWD.short.50)
> Anova(m2,type="2")
Anova Table (Type II tests)

Response: vertrauen
              Sum Sq Df F value    Pr(>F)    
trustee       2.4928  1 24.5497 1.367e-05 ***
Group         0.0030  1  0.0292    0.8651    
trustee:Group 0.1137  1  1.1200    0.2963    
Residuals     4.0617 40                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

This is the Stata output:

. anova vertrauen uw Goodguy uw#Goodguy

                         Number of obs =         44    R-squared     =  0.3912
                         Root MSE      =    .318658    Adj R-squared =  0.3455

                  Source | Partial SS         df         MS        F    Prob>F
              -----------+----------------------------------------------------
                   Model |  2.6095358          3   .86984526      8.57  0.0002
                         |
                      uw |  .00296733          1   .00296733      0.03  0.8651
                 Goodguy |  1.2981586          1   1.2981586     12.78  0.0009
              uw#Goodguy |  .11373073          1   .11373073      1.12  0.2963
                         |
                Residual |  4.0617062         40   .10154266  
              -----------+----------------------------------------------------
                   Total |   6.671242         43   .15514516  

As you can see, the F-statistics for the Group (uw) main effect and for the Group (uw) x trustee (Goodguy) interaction are the same, but they differ for the trustee (Goodguy) main effect: in R it's almost twice as high as in Stata. I tried changing the order of the predictors and the reference levels, but that didn't change my R output.

Does anyone know what causes the difference in the F-statistic here? I'm really puzzled by it; I expected the values to be the same.

Here is the Stata output without the interaction:

. anova vertrauen uw Goodguy

                         Number of obs =         44    R-squared     =  0.3741
                         Root MSE      =    .319124    Adj R-squared =  0.3436

                  Source | Partial SS         df         MS        F    Prob>F
              -----------+----------------------------------------------------
                   Model |   2.495805          2   1.2479025     12.25  0.0001
                         |
                      uw |  .00296733          1   .00296733      0.03  0.8653
                 Goodguy |  2.4928377          1   2.4928377     24.48  0.0000
                         |
                Residual |   4.175437         41   .10183993  
              -----------+----------------------------------------------------
                   Total |   6.671242         43   .15514516  

And here is the R output without the interaction:

> m2.4 <- lm(vertrauen~trustee+Group,data=RTG.UWD.short.50)
> Anova(m2.4)
Anova Table (Type II tests)

Response: vertrauen
          Sum Sq Df F value    Pr(>F)    
trustee   2.4928  1 24.4780 1.328e-05 ***
Group     0.0030  1  0.0291    0.8653    
Residuals 4.1754 41                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 

The results are the same, so the difference must have something to do with how the two programs handle the interaction term.

I also tried to manually compute the interaction term and found something interesting:

Here is the R output:

> RTG.UWD.short.50$interaction <- as.numeric(RTG.UWD.short.50$trustee)*as.numeric(RTG.UWD.short.50$Group)
> m2.7 <- lm(vertrauen~trustee+Group+interaction,data=RTG.UWD.short.50)
> Anova(m2.7)
Anova Table (Type II tests)

Response: vertrauen
            Sum Sq Df F value    Pr(>F)    
trustee     1.2982  1 12.7844 0.0009316 ***
Group       0.0030  1  0.0292 0.8651282    
interaction 0.1137  1  1.1200 0.2962617    
Residuals   4.0617 40                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 

And here is the Stata output:

. gen interaction=uw*Goodguy

. anova vertrauen uw Goodguy interaction

                         Number of obs =         44    R-squared     =  0.3912
                         Root MSE      =    .318658    Adj R-squared =  0.3455

                  Source | Partial SS         df         MS        F    Prob>F
             ------------+----------------------------------------------------
                   Model |  2.6095358          3   .86984526      8.57  0.0002
                         |
                      uw |   .0399785          1    .0399785      0.39  0.5339
                 Goodguy |  2.3984067          1   2.3984067     23.62  0.0000
             interaction |  .11373073          1   .11373073      1.12  0.2963
                         |
                Residual |  4.0617062         40   .10154266  
             ------------+----------------------------------------------------
                   Total |   6.671242         43   .15514516

Thus it seems that R and Stata differ in how they compute the interaction. The R output with the manually computed interaction matches the Stata output with the automatically computed interaction.

And finally the descriptives from R:

> describe(RTG.UWD.short.50$vertrauen)
RTG.UWD.short.50$vertrauen 
      n missing  unique    Info    Mean   
     44       0      43       1  0.5046
> describe(RTG.UWD.short.50$Group)
RTG.UWD.short.50$Group 
      n missing  unique 
     44       0       2 

1 (34, 77%), 2 (10, 23%) 
> describe(RTG.UWD.short.50$trustee)
RTG.UWD.short.50$trustee 
      n missing  unique 
     44       0       2 

bad (22, 50%), good (22, 50%) 

and from Stata:

. sum vertrauen uw Goodguy

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
   vertrauen |         44    .5045969    .3938847    .000998          1
          uw |         44    .2272727    .4239151          0          1
     Goodguy |         44          .5    .5057805          0          1

Best Answer

Update: I've cross-posted the question at statalist.org and got an answer there: http://www.statalist.org/forums/forum/general-stata-discussion/general/1348073-f-test-differences-stata-and-r

Basically, the problem was that the contrast settings differed between R and Stata, and I did not compute the same SS type in both.
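
As a rough check of the SS-type point, the two F values for the trustee effect can be reproduced by hand from the sums of squares quoted in the question. This is only a sketch in R: the variable names are made up for illustration, and the numbers are copied from the outputs above. It assumes that both programs divide by the residual mean square of the interaction model, which is what the arithmetic suggests.

```r
# Values copied from the outputs quoted in the question.
ss_trustee_type2   <- 2.4928377        # trustee SS adjusted for Group only (R, Type II)
ss_trustee_partial <- 1.2981586        # Goodguy partial SS (Stata, interaction model)
ms_residual_full   <- 4.0617062 / 40   # residual MS of the interaction model

# F = effect SS (1 df) divided by the residual mean square.
f_type2   <- ss_trustee_type2   / ms_residual_full
f_partial <- ss_trustee_partial / ms_residual_full

print(round(c(type2 = f_type2, partial = f_partial), 2))
# type2 = 24.55, partial = 12.78
```

The first value agrees with the Type II table from R and the second with the partial SS table from Stata, so the two outputs share the same error term and differ only in the sum of squares used for the trustee effect.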

By default, Stata computes Type III SS, but I had specified Type II SS in R. When computing Type III SS in R, however, you should NOT use the default contrasts (contr.treatment); use an orthogonal contrast coding instead (such as contr.sum). See this link:
http://www.mail-archive.com/r-help@s.../msg69781.html

Thus, when I ran the ANOVA in R with Type III SS and contr.sum, I got the same output as in Stata, where I hadn't specified anything.
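
For readers who want to see the effect of the contrast setting without the original data, here is a minimal sketch in base R. The data frame toy and its values are hypothetical and do not come from the study. To avoid needing an extra package, drop1() with the scope . ~ . is used as a stand-in for the Anova() call; with sum-to-zero contrasts it should give Type III-style tests, while under treatment contrasts the main-effect rows test something else (the effect at the reference level of the other factor).

```r
# Hypothetical toy data: an unbalanced 2 x 2 design with a built-in interaction.
# None of these values come from the original study.
toy <- data.frame(
  trustee = factor(rep(c("bad", "good"), each = 7)),
  Group   = factor(rep(c("1", "2", "1", "2"), times = c(5, 2, 3, 4))),
  y       = c(0.20, 0.30, 0.10, 0.25, 0.15,   # bad,  Group 1
              0.40, 0.50,                     # bad,  Group 2
              0.80, 0.90, 0.70,               # good, Group 1
              0.60, 0.70, 0.65, 0.55)         # good, Group 2
)

# Same model, two different contrast codings for the factors.
fit_trt <- lm(y ~ trustee * Group, data = toy,
              contrasts = list(trustee = "contr.treatment",
                               Group   = "contr.treatment"))
fit_sum <- lm(y ~ trustee * Group, data = toy,
              contrasts = list(trustee = "contr.sum",
                               Group   = "contr.sum"))

# drop1() with scope . ~ . removes each term's model-matrix columns in turn,
# ignoring marginality, and compares the reduced fit with the full model.
tab_trt <- drop1(fit_trt, . ~ ., test = "F")
tab_sum <- drop1(fit_sum, . ~ ., test = "F")

print(tab_sum)

# The trustee main-effect F depends on the coding; the interaction F does not.
print(c(treatment = tab_trt["trustee", "F value"],
        sum       = tab_sum["trustee", "F value"]))
print(c(treatment = tab_trt["trustee:Group", "F value"],
        sum       = tab_sum["trustee:Group", "F value"]))
```

With the real data, the analogous steps would be to set contr.sum for both factors and then request Type III tests from Anova(), as described above.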