Solved – Holdout sample for multinomial logistic regression in SPSS

logistic, out-of-sample, regression, sample, spss

I have a categorical dependent variable with multiple categories and continuous independent variables. Using a multinomial logistic regression in SPSS, I want to test whether a model estimated on the training sample predicts my validation sample well. However, I do not understand how this can be tested in SPSS.
I split the data into two parts (70%/30%), but I do not understand how to apply the model estimated on the 70% sample to the 30% sample.
I also tried bootstrapping in SPSS, but I cannot work out exactly how it should be set up.

Best Answer

  1. Select a random 70% of your data using Data > Select Cases > Random Sample of Cases. This creates a variable named FILTER_$ that indicates which cases are selected (1 = selected, 0 = not selected).

  2. With the filter still active, run your multinomial logistic regression and, in the Save dialog, choose to export the model to an XML file.

  3. Compute FILTER_$ = 1 - FILTER_$. This inverts the flag so that it now marks the remaining 30% of cases.

  4. Using Utilities > Scoring Wizard, select the exported model file and set the remaining options, such as which output to save.

You will then have the predicted values or other output statistics from the model, as selected in step 4. You can then run whatever summary statistics you want on the holdout sample, or turn off the filter and use Data > Split File to generate the same statistics for both samples.
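To make the comparison of the two samples concrete, here is a minimal sketch in Python (outside SPSS) of one possible summary: the proportion of correctly classified cases in each sample, plus an observed-by-predicted table for the holdout cases. It assumes the scored data have been exported as rows of three values: a training flag coded 1 for the 70% sample and 0 for the 30% sample (that is, the flag as it was before step 3 inverted it), the observed category, and the predicted category. The function names and the toy data are hypothetical and only for illustration.

```python
from collections import Counter, defaultdict


def accuracy_by_sample(rows):
    """Return {sample_label: (n_cases, proportion_correct)}.

    rows: iterable of (in_training, observed, predicted) tuples, where
    in_training is assumed to be 1 for the 70% sample and 0 for the holdout.
    """
    totals = Counter()
    correct = Counter()
    for in_training, observed, predicted in rows:
        label = "training" if in_training else "holdout"
        totals[label] += 1
        if observed == predicted:
            correct[label] += 1
    return {label: (n, correct[label] / n) for label, n in totals.items()}


def confusion_table(rows, training):
    """Cross-tabulate observed (outer key) by predicted (inner key) for one sample."""
    table = defaultdict(Counter)
    for in_training, observed, predicted in rows:
        if bool(in_training) == training:
            table[observed][predicted] += 1
    return {obs: dict(preds) for obs, preds in table.items()}


# Hypothetical toy data: (training flag, observed category, predicted category)
cases = [
    (1, "A", "A"), (1, "B", "B"), (1, "C", "A"), (1, "A", "A"),
    (0, "A", "B"), (0, "B", "B"), (0, "C", "C"), (0, "C", "A"),
]

summary = accuracy_by_sample(cases)
for label, (n, acc) in summary.items():
    print(f"{label}: n={n}, accuracy={acc:.2f}")

holdout_table = confusion_table(cases, training=False)
print(holdout_table)
```

A holdout accuracy far below the training accuracy would suggest that the model is overfitting the training sample. Accuracy is only one choice of summary; any other statistic can be computed per sample in the same way.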

As for the bootstrap: if you have the Bootstrapping option installed, you can request bootstrapping from the logistic regression dialog.
