Solved – Proportional odds assumption in ordinal logistic regression in R with the packages VGAM and rms

logisticordered-logitr

An assumption of the ordinal logistic regression is the proportional odds assumption. Using R and the 2 packages mentioned I have 2 ways to check that but I have questions in each one.

1) Using the rms package

Given the next commands

library(rms)
ddist <- datadist(Ki67,Cyclin_E)
options(datadist='ddist')
f <- lrm(grade ~Ki67+Cyclin_E);f
sf <- function(y)
c('Y>=1'=qlogis(mean(y >= 1)),'Y>=2'=qlogis(mean(y >= 2)),'Y>=3'=qlogis(mean(y >= 3)))
s <- summary(grade ~Ki67+Cyclin_E, fun=sf)
plot(s,which=1:3,pch=1:3,xlab='logit',main='',xlim=c(-2.5,2.5))

I have

lrm(formula = grade ~ Ki67 + Cyclin_E)

Frequencies of Missing Values Due to Each Variable
   grade     Ki67 Cyclin_E 
       0        0        3 


                     Model Likelihood     Discrimination    Rank Discrim.    
                        Ratio Test            Indexes          Indexes       

Obs            42    LR chi2     11.38    R2       0.268    C       0.728    
 1             11    d.f.            2    g        1.279    Dxy     0.456    
 2             15    Pr(> chi2) 0.0034    gr       3.592    gamma   0.458    
 3             16                         gp       0.192    tau-a   0.308    
max |deriv| 1e-07                         Brier    0.166                     


         Coef    S.E.   Wald Z Pr(>|Z|)
y>=2     -0.1895 0.8427 -0.22  0.8221  
y>=3     -2.0690 0.9109 -2.27  0.0231  
Ki67      0.0971 0.0330  2.94  0.0033  
Cyclin_E -0.0076 0.0227 -0.33  0.7387 

The s table gives: (unfortunately I don't know how to upload a graph made in R)

grade    N=45

+--------+-------+--+----+---------+----------+
|        |       |N |Y>=1|Y>=2     |Y>=3      |
+--------+-------+--+----+---------+----------+
|Ki67    |[ 2, 9)|12|Inf |0.6931472|-1.0986123|
|        |[ 9,16)|12|Inf |0.3364722|-2.3978953|
|        |[16,24)|10|Inf |2.1972246| 0.0000000|
|        |[24,44]|11|Inf |2.3025851| 1.5040774|
+--------+-------+--+----+---------+----------+
|Cyclin_E|[ 3,16)|15|Inf |1.0116009|-0.1335314|
|        |[16,22)| 7|Inf |1.7917595|-0.9162907|
|        |[22,33)|10|Inf |1.3862944|-0.8472979|
|        |[33,80]|10|Inf |0.4054651|-0.4054651|
|        |Missing| 3|Inf |      Inf| 0.6931472|
+--------+-------+--+----+---------+----------+
|Overall |       |45|Inf |1.1284653|-0.4054651|
+--------+-------+--+----+---------+----------+

Where for the Ki67 I see that 3 out of the 4 differences logit(P[Y> = 2])-logit(P[Y> = 3]) are close to 2. Only the last one is quite lower (around 0.8). But here Ki67 is continuous and not categorical so I don't know if the results of the table are correct and there isn't any p-value to decide. By the way I run the above in SPSS and I didn't reject the assumption.

2) Using the VGAM package

Here using the next commands I have the model under the assumption of proportional odds

library(VGAM)
fit1 <- vglm(grade ~Ki67+Cyclin_E,family=cumulative(parallel=T))
summary(fit1)

And the results

Coefficients:
                Estimate Std. Error  z value
(Intercept):1  0.1894723   0.820442  0.23094
(Intercept):2  2.0690395   0.886732  2.33333
Ki67          -0.0970972   0.032423 -2.99467
Cyclin_E       0.0075887   0.021521  0.35261

Number of linear predictors:  2 

Names of linear predictors: logit(P[Y< = 1]), logit(P[Y< = 2])

Dispersion Parameter for cumulative family:   1

Residual deviance: 79.86801 on 80 degrees of freedom

Log-likelihood: -39.93401 on 80 degrees of freedom

Number of iterations: 5 

While using the next commands I have the model without the assumption of proportional odds

fit2 <- vglm(grade ~Ki67+Cyclin_E,family=cumulative(parallel=F))

where unfortunately i receice the next message

Warning message: In vglm.fitter(x = x, y = y, w = w, offset = offset,
Xm2 = Xm2, : convergence not obtained in 30 iterations

However if I type summary(fit2) I get results but again I don't know if they are correct. My intention was to use the next commands and get the answer but know I doubt if this is correct (by the way if I do it I get p-value=0.6.

pchisq(deviance(fit1)-deviance(fit2),
df=df.residual(fit1)-df.residual(fit2),lower.tail=FALSE)

So, regarding the methods mentioned above does anyone knows whether the results I get are valid or in the case of the VGAM package is there any way to increase the number of itterations?Is there any other way to check it?

Best Answer

I recommend partial residual plots using the rms package's lrm and residuals.lrm functions. You can also fit a series of binary models using different cutoffs for $Y$ and plot the log odds ratios vs. cutoff.

Related Question