Solved – How to interpret logistic regression output for categorical variables when two categories are missing

Tags: binary data, categorical data, interpretation, logistic, spss

I am using binary logistic regression; the dependent variable is 1 or 0. The independent variables fall into two groups: the first includes continuous variables (LNTA: logarithm of total assets, ROA: return on assets, and Leverage); the second includes categorical variables (type of auditor: 1 or 0; industry sector: 1, 2, 3, and 4; country: 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10; and region: 1 or 2). The problem is that when I enter all these independent variables together, I get results for only 8 countries rather than 9. I know that with 10 countries only 9 will appear in the results table, because country 10 is the reference category; but this case is different because both countries 9 and 10 are missing from the table.

Best Answer

This was going to be a comment asking for clarification, but I wanted to give a screenshot.

A quick question (which you might already know the answer to) -- do you have missing data in any of your variables? I suspect the most likely culprit is that every observation from "Country 9" has at least one missing value, so all of the Country 9 observations are excluded from the analysis.

Running Logistic Regression in SPSS should start off with a "Case Processing Summary" table that will answer this for you. Here's an example from a dataset with no missing values (I just blanked out the raw data filename).

[Screenshot: SPSS "Case Processing Summary" table]
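If you want to check this outside SPSS, here is a minimal Python sketch of the same idea: count how many rows per country survive listwise deletion. The toy records, the column layout, and the function name `complete_cases_by_country` are all made up for illustration; they are not from your dataset or from SPSS.

```python
from collections import Counter

# Hypothetical toy records: (country, outcome, roa); None marks a missing value.
rows = [
    (1, 0, 0.12), (1, 1, 0.08),
    (2, 1, 0.05), (2, 0, None),
    (3, 1, None), (3, 0, None),  # every country-3 row has a missing predictor
]

def complete_cases_by_country(records):
    """Return {country: (complete_rows, total_rows)} under listwise deletion."""
    total = Counter()
    complete = Counter()
    for country, *values in records:
        total[country] += 1
        # A row is usable only if none of its model variables is missing.
        if all(v is not None for v in values):
            complete[country] += 1
    return {c: (complete[c], total[c]) for c in sorted(total)}

summary = complete_cases_by_country(rows)
for country, (kept, n) in summary.items():
    print(f"Country {country}: {kept} of {n} cases usable")
```

A country with zero usable cases (country 3 in this toy data) would contribute nothing to the model, which is the situation I suspect for your Country 9.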

EDIT: Example two, parameter specification. Just in case!

Another issue with SPSS is that the "Parameter Coding" doesn't necessarily correspond to your original values.

For example, in the "Variables in the Equation" table, Country(7) doesn't necessarily mean the country with the numerical value 7, but rather the seventh parameter associated with the Country factor. You should check the "Categorical Variables Codings" table to make sure that all nine countries are showing up in that list.

In the example figure below, I mocked up a dataset with five countries and two regions. All values of the outcome for country 3 were set to missing (but all values of Region were complete). Country 3 is skipped from the parameter coding -- but you'll see that Country(3) [the third column of coding for the Country factor] actually pertains to Country==4 in the dataset.

[Screenshot: second set of tables showing parameter specification]
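The renumbering in that mock-up can be sketched in a few lines of Python. This is only an illustration, assuming indicator coding with the highest category as the reference; the function name `parameter_labels` is hypothetical, and the labels just imitate how SPSS prints them.

```python
def parameter_labels(levels_in_model, factor="Country"):
    """Map labels such as 'Country(3)' to the original category values.

    Assumes indicator coding with the last (highest) category as reference,
    so that category gets no parameter of its own.
    """
    levels = sorted(set(levels_in_model))
    reference = levels[-1]
    mapping = {
        f"{factor}({i})": value
        for i, value in enumerate(levels[:-1], start=1)
    }
    return mapping, reference

# Country 3 has no usable cases, so only 1, 2, 4 and 5 enter the model.
mapping, reference = parameter_labels([1, 2, 4, 5])
for label, value in mapping.items():
    print(f"{label} -> Country == {value}")
print(f"Reference category: Country == {reference}")
```

Here `Country(3)` maps to Country == 4, because parameters are numbered by position among the categories that remain in the analysis, not by their original codes.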