Solved – Why are these hierarchical linear regression results in R and SPSS different?

Tags: r, regression, spss

When I run the hierarchical linear regression (HLR) example from this post in the R Tutorial Series, using its example dataset, the results don't match what I get when I apply the same method in SPSS. Is it because SPSS uses a different type of sums of squares (Type III)?

The F values match for the final model, but not for the second one, and some of the sums of squares seem off.
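(An aside on the Type III question: for a regression model with only main effects, Type III sums of squares reduce to marginal tests of each predictor adjusted for all the others, which base R's drop1() can produce. A sketch, using the built-in mtcars data as a stand-in, since the post's dataset URL may no longer resolve:)

```r
# Marginal ("Type III"-style) F tests in base R: drop1() tests each
# predictor adjusted for all of the others. mtcars is used here only
# as an illustrative stand-in for the post's dataset.
m <- lm(mpg ~ wt + hp, data = mtcars)
drop1(m, test = "F")
```

For a main-effects-only model, each of these F values equals the square of the corresponding t statistic in summary(m), so the per-coefficient tests are the same in both programs either way.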

#R method
url <- "http://dl.dropbox.com/u/10246536/Web/RTutorialSeries/dataset_hlr.csv"
datavar <- read.csv(url, header = TRUE)

#create three linear models using lm(FORMULA, DATAVAR)
#one predictor model
onePredictorModel <- lm(ROLL ~ UNEM, datavar)
#two predictor model
twoPredictorModel <- lm(ROLL ~ UNEM + HGRAD, datavar)
#three predictor model
threePredictorModel <- lm(ROLL ~ UNEM + HGRAD + INC, datavar)

#get summary data for each model using summary(OBJECT)
summary(onePredictorModel)
summary(twoPredictorModel)
summary(threePredictorModel)

#compare successive models using anova(MODEL1, MODEL2, ...)
test <- anova(onePredictorModel, twoPredictorModel, threePredictorModel)
test

Below is the code for SPSS.

*SPSS method.
data list free /YEAR ROLL UNEM HGRAD INC.
begin data
1   5501    8.1 9552    1923
2   5945    7   9680    1961
3   6629    7.3 9731    1979
4   7556    7.5 11666   2030
5   8716    7   14675   2112
6   9369    6.4 15265   2192
7   9920    6.5 15484   2235
8   10167   6.4 15723   2351
9   11084   6.3 16501   2411
10  12504   7.7 16890   2475
11  13746   8.2 17203   2524
12  13656   7.5 17707   2674
13  13850   7.4 18108   2833
14  14145   8.2 18266   2863
15  14888   10.1    19308   2839
16  14991   9.2 18224   2898
17  14836   7.7 18997   3123
18  14478   5.7 19505   3195
19  14539   6.5 19800   3239
20  14395   7.5 19546   3129
21  14599   7.3 19117   3100
22  14969   9.2 18774   3008
23  15107   10.1    17813   2983
24  14831   7.5 17304   3069
25  15081   8.8 16756   3151
26  15127   9.1 16749   3127
27  15856   8.8 16925   3179
28  15938   7.8 17231   3207
29  16081   7   16816   3345
end data.


REGRESSION /MISSING LISTWISE 
/STATISTICS COEFF OUTS R ANOVA CHANGE 
/CRITERIA=PIN (.05) POUT(.10) 
/NOORIGIN /DEPENDENT ROLL 
/METHOD=ENTER UNEM 
/METHOD=ENTER HGRAD 
/METHOD=ENTER INC.

Or did I mess something up in the procedure for SPSS?

Best Answer

The SPSS output is correct.

If you run the model comparisons pairwise in R, the results match the SPSS output:

anova(onePredictorModel, twoPredictorModel)

anova(onePredictorModel)

The discrepancy arises because anova(model1, model2, model3) on three nested models uses the residual mean square from the largest model as the error term for every comparison, whereas the F change statistic in SPSS tests each step against that step's own residual. That is also why the final step already matched: its comparison uses the full model's residual in both programs. With the pairwise calls above, R's output matches SPSS's, and you can also compute the correct F values yourself.

You can calculate the F values with the following formula, and they should match in any statistics software:

F = [(R-squared change from Step 1 to Step 2) / (number of IVs added)] / [(1 - Step 2 R-squared) / (N - k - 1)]

where:

1. R-squared change from Step 1 to Step 2 = Step 2 R-squared - Step 1 R-squared

2. number of IVs added = number of predictors in Step 2 - number of predictors in Step 1

3. N = total number of cases

4. k = number of predictors in Step 2

This F is judged for statistical significance with (number of IVs added) numerator and (N - k - 1) denominator degrees of freedom.
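The formula above can be sketched as a small R function (a hypothetical helper, not part of the original post):

```r
# F test for the R-squared change between two nested regression steps.
# r2_1, r2_2: R-squared at Step 1 and Step 2; n: number of cases;
# k1, k2: number of predictors at each step.
f_change <- function(r2_1, r2_2, n, k1, k2) {
  num_df <- k2 - k1            # number of IVs added at Step 2
  den_df <- n - k2 - 1         # residual df of the Step 2 model
  f <- ((r2_2 - r2_1) / num_df) / ((1 - r2_2) / den_df)
  p <- pf(f, num_df, den_df, lower.tail = FALSE)
  list(F = f, num_df = num_df, den_df = den_df, p = p)
}

# Example with made-up values: R-squared rises from 0.50 to 0.60
# after adding one predictor, with n = 30 cases.
f_change(0.50, 0.60, n = 30, k1 = 2, k2 = 3)  # F = 6.5 on 1 and 26 df
```

Plugging in the R-squared values from the SPSS model summary (or from summary() in R) at each step should reproduce the F change column of the SPSS output.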