I need some help understanding the coefficients produced by Python (statsmodels) for ordinal regression vs. SPSS. I ran the exact same data set through both SPSS and Python but received different coefficients. The coefficients for each ordinal level are shown below. I'd like to reproduce exactly what SPSS generated with Python, but I can't work out what Python is doing differently. FYI – there is only one continuous IV in the model; the DV is an ordinal variable with 6 levels.
SPSS:
0: .541
1: 2.644
2: 3.442
3: 4.117
4: 4.912
ind_var: .467
Python:
0: .541
1: .743291
2: -.2248
3: -.39435
4: -.22837
ind_var: .467
*Notice how the coefficient for the first level is the same for both Python and SPSS (.541), which leads me to believe that I should be able to "convert" these coefficients to be the same values as SPSS.
Here is the Python code, just in case anyone wants to try.
from statsmodels.miscmodels.ordinal_model import OrderedModel
import pandas as pd
from pandas.api.types import CategoricalDtype
import numpy as np
df = pd.read_csv("raw_data.csv")
cat_type = [0, 1, 2, 3, 4, 5]
df['dv_cat'] = pd.Categorical(df['dv'], categories=cat_type, ordered=True)
def or_regress(data):
    or_model = OrderedModel.from_formula("dv_cat ~ iv_variable", data, distr='logit').fit(method='bfgs', disp=0)
    print(or_model.summary())
    return or_model
model_result = or_regress(df)
model_result.params
Here is the equivalent SPSS syntax:
PLUM dv WITH iv_variable
/CRITERIA=CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
/LINK=LOGIT
/PRINT=FIT PARAMETER SUMMARY.
Best Answer
The Python coefficients (after the first) are the logarithms of the differences between consecutive SPSS thresholds, so

0.541 + exp(0.743291) = 2.644
0.541 + exp(0.743291) + exp(-0.2248) = 3.442

and so on. This parametrisation guarantees that the estimated thresholds are increasing.
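If you want to recover the SPSS-style thresholds programmatically, the conversion is just a cumulative sum of exponentials. A minimal sketch with NumPy, using the rounded coefficients from the question (so the recovered values only match SPSS to about three decimals):

```python
import numpy as np

# Raw threshold parameters reported by statsmodels' OrderedModel
# (the slope for ind_var is excluded; it is identical in both tools).
raw = np.array([0.541, 0.743291, -0.2248, -0.39435, -0.22837])

# The first parameter is the first threshold itself; each later one is
# the log of the gap to the previous threshold. Cumulatively summing
# the exponentiated gaps recovers the SPSS parametrisation.
spss_thresholds = np.concatenate(
    ([raw[0]], raw[0] + np.cumsum(np.exp(raw[1:])))
)
print(np.round(spss_thresholds, 3))
# [0.541 2.644 3.443 4.117 4.912]
# (3.443 vs. SPSS's 3.442 is just rounding error in the inputs)
```

If I recall the API correctly, recent statsmodels versions also expose this transformation on the model itself via `results.model.transform_threshold_params(results.params)`, which returns the increasing thresholds directly.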