Solved – SPSS: Comparing Regression Coefficient for 2 Models

regression, spss

Hope you guys could help me with a question I've been stuck on for a while.

I'm currently writing my thesis on how MRT (the railway system in Singapore) accessibility affects prices of public housing (HDB flats). I've made good progress but have hit a wall. I have collected 11,000 HDB transactions, all of which are within walking distance of an MRT station (<750 m, or a <10 min walk, to the nearest MRT station). Apart from the usual structural variables:

  • Age
  • Floor
  • Size

I also managed to obtain locational variables such as:

  • Time taken to walk to nearest railway station (time_walk)
  • Time taken to commute, via train, from the nearest station to the CBD station (time_train)
  • Total Traveling Time, time_walk + time_train (TTT)

With these variables, I ran a multiple regression with Price as the DV and Age, Floor, Size, time_walk, and time_train as the IVs. This produced a regression coefficient for each IV. These coefficients allowed me to analyse the quantitative impact that, with everything else held constant, (i) each additional walking minute and (ii) each additional commuting minute on the MRT had on HDB prices.

However, what I would like to investigate is, do residents living at different distances from the CBD value time_walk differently?

I understand that I can't create 3 models (shown below), each containing only the relevant subset of the data (e.g. 0-9 mins train time, 10-19 mins train time, etc.), as the sample size n would differ between them and comparing the coefficient estimates wouldn't be fair.

Model A (0-9 mins time_train): How does time_walk affect house prices?

Model B (10-19 mins time_train): How does time_walk affect house prices?

Model C (20-29 mins time_train): How does time_walk affect house prices?

Any advice would be greatly appreciated.

Best Answer

Sounds like you need an interaction term between time_walk and time_train in a single model. Certainly not three separate models.
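To make that concrete, one way to write the single model is sketched below. The β symbols and the error term ε are my own notation, not something taken from your output.

```latex
\begin{aligned}
\text{Price} ={}& \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Floor} + \beta_3\,\text{Size} \\
& + \beta_4\,\text{time\_walk} + \beta_5\,\text{time\_train}
  + \beta_6\,(\text{time\_walk} \times \text{time\_train}) + \varepsilon \\[4pt]
\frac{\partial\,\text{Price}}{\partial\,\text{time\_walk}} ={}& \beta_4 + \beta_6\,\text{time\_train}
\end{aligned}
```

The second line is the point: the price effect of one more walking minute is no longer a single number but depends on time_train through β6. If β6 is indistinguishable from zero, the data give no evidence that residents at different train distances value time_walk differently.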

If I've understood your setup correctly, and if people tend to minimise or otherwise bound their total travel time, then I would expect these two variables to be negatively related, as people trade one off against the other.

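In SPSS I believe you have to construct interaction terms manually, that is, compute a new variable equal to the product time_walk × time_train and add it to the model as an extra IV. The web is full of people who'll show you how to do that. Then you have to interpret the interaction properly: once it is in the model, the coefficient on time_walk alone is the effect of a walking minute only when time_train is zero. That will require a little reading, but any regression textbook should cover it.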
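If it helps to see the mechanics outside SPSS, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data are simulated, the "true" coefficients are made up, Age, Floor and Size are left out for brevity, and `ols` and `walk_effect` are hypothetical helper names. It builds the product term by hand, fits one model, and then reads off the walking-minute effect at two different train times.

```python
import random

random.seed(0)  # reproducible synthetic data

# Made-up "true" coefficients, for illustration only
# (price in thousands of dollars, times in minutes).
B0, B_WALK, B_TRAIN, B_INT = 500.0, -4.0, -3.0, 0.1

rows, prices = [], []
for _ in range(2000):
    time_walk = random.uniform(1, 10)
    time_train = random.uniform(0, 29)
    walk_x_train = time_walk * time_train  # manually constructed interaction term
    price = (B0 + B_WALK * time_walk + B_TRAIN * time_train
             + B_INT * walk_x_train + random.gauss(0, 5))
    rows.append([1.0, time_walk, time_train, walk_x_train])  # 1.0 = intercept column
    prices.append(price)


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


b0, b_walk, b_train, b_int = ols(rows, prices)


def walk_effect(time_train):
    """Estimated price change per extra walking minute at a given train time."""
    return b_walk + b_int * time_train


effect_near = walk_effect(5)   # flats about 5 train minutes from the CBD
effect_far = walk_effect(25)   # flats about 25 train minutes from the CBD
print(f"interaction coefficient: {b_int:.3f}")
print(f"walk effect at  5 min train: {effect_near:.2f}")
print(f"walk effect at 25 min train: {effect_far:.2f}")
```

The design choice here is that all observations stay in one model, so the question "do people further out value walking time differently?" becomes a question about a single coefficient, `b_int`, rather than a comparison of estimates from three differently sized subsamples. In practice you would fit this in SPSS with your real variables and look at the significance test on the interaction term.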