Solved – How to perform RMSE analysis in SPSS

Tags: forecasting, regression, spss

My thesis coach wants me to perform a predictive analysis based on OLS. What I understand is the following:

  • divide the dataset into a training set and a holdout set, for instance 50-50
  • perform OLS on the training set
  • construct linear equation based on regression output
  • create a new variable (DV2) in the holdout set, and use the linear equation to
    calculate its values
  • now you have F (forecasted) and A (actual) DV values in the holdout set
  • calculate the performance of the predictive linear equation with RMSE:
    $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(F_i - A_i)^2}$, where n is the number of cases in the holdout set
  • a lower RMSE is better
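
The steps listed above can also be sketched outside SPSS. The following Python sketch is only an illustration under stated assumptions: it uses a single predictor and made-up data, and the helper names (fit_simple_ols, rmse) are hypothetical, not part of SPSS or any library.

```python
import math
import random


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(forecast, actual):
    """Root mean squared error between forecasted (F) and actual (A) values."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)


# Made-up data for illustration only: y depends linearly on x plus noise.
rng = random.Random(42)
x = [float(i) for i in range(40)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

# Step 1: split the cases 50-50 into a training set and a holdout set.
indices = list(range(len(x)))
rng.shuffle(indices)
half = len(indices) // 2
train_idx, holdout_idx = indices[:half], indices[half:]

# Steps 2-3: fit OLS on the training set only and keep the linear equation.
intercept, slope = fit_simple_ols(
    [x[i] for i in train_idx], [y[i] for i in train_idx]
)

# Steps 4-5: apply the equation to the holdout set to get F next to A.
forecast = [intercept + slope * x[i] for i in holdout_idx]
actual = [y[i] for i in holdout_idx]

# Step 6: summarise holdout performance with RMSE (lower is better).
holdout_rmse = rmse(forecast, actual)
print(f"holdout RMSE = {holdout_rmse:.3f}")
```

With several predictors the same logic applies; only the fitting step changes, and the forecast is the full linear equation evaluated for each holdout case.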

Now, some questions about this procedure:

  • Am I doing this right?
  • I have no clue how to have SPSS perform the RMSE calculation, so can't I just do it in Excel? If I paste the holdout set into Excel, performing this calculation seems easy enough. Is there something I'm missing?
  • If you know how to perform this calculation in SPSS, please let me know, because I expect that SPSS might be able to output some extra insightful statistics and/or graphs.

Best Answer

First compute an indicator variable that randomly assigns each case to the training part or the holdout part, e.g.,

compute part = rv.uniform(0,1) <= .5.

Run the regression. Include this subcommand

/SELECT part EQ 1

and this

/SAVE PRED RESID

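You can do this by specifying a selection variable in the Regression dialog box and by using the Save subdialog. The model is estimated on the selected cases only, but the saved predicted values (PRE_1) and residuals (RES_1) are computed for all cases, including the holdout cases.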

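Now select the other part of the data, e.g.,

compute holdout = 1 - part.
filter by holdout.

Run Descriptives on RES_1. The standard deviation it reports is close to the RMSE only when the mean holdout residual is near zero. For the exact RMSE, square the residuals (RES_1**2), take the mean of the squares over the holdout cases, and take the square root.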
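
To see how the Descriptives output relates to the RMSE, here is a small Python sketch. It assumes the default Descriptives statistics (mean, standard deviation with an n - 1 denominator, and N); the residual values are made up and the function names are hypothetical. It relies on the identity RMSE^2 = mean^2 + ((n - 1) / n) * SD^2.

```python
import math
import statistics


def rmse_from_residuals(residuals):
    """RMSE computed directly: square root of the mean squared residual."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def rmse_from_descriptives(mean, sd, n):
    """Rebuild the RMSE from the mean, the sample SD (n - 1 denominator), and N."""
    return math.sqrt(mean ** 2 + (n - 1) / n * sd ** 2)


# Made-up holdout residuals (actual minus forecast), for illustration only.
residuals = [1.5, -0.5, 2.0, -1.0, 0.5]
n = len(residuals)

mean_res = statistics.mean(residuals)   # the "Mean" column
sd_res = statistics.stdev(residuals)    # the "Std. Deviation" column (n - 1)

direct = rmse_from_residuals(residuals)
rebuilt = rmse_from_descriptives(mean_res, sd_res, n)

print(f"mean = {mean_res:.4f}, sd = {sd_res:.4f}")
print(f"RMSE (direct) = {direct:.4f}, RMSE (from descriptives) = {rebuilt:.4f}")
```

The two RMSE values agree, while the standard deviation alone differs from them whenever the mean residual is not zero or the sample is small.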