Solved – Hosmer-Lemeshow test in R

goodness of fithosmer-lemeshow-testr

The Hosmer-Lemeshow test is a statistical test for goodness of fit for logistic regression models. According to ?hoslem.test, it deals only with binary logistic regression. However, I wonder if this test can be used to a ordered logit model which has more than 2 levels for the dependent variable.

The ordered logit model is also known as the proportional odds model, or a cumulative logit model. And I use the "Ordinal" package. Thanks a lot.

Best Answer

Update 28 July: the below has now been pushed to CRAN in the generalhoslem package along with the Lipsitz and Pulkstenis-Robinson tests.


Fagerland and Hosmer discuss a generalisation of the Hosmer-Lemeshow test and two other approaches (the Lipsitz test and Pulkstenis-Robinson tests) in A goodness-of-fit test for the proportional odds regression model 2013 Stat Med and in Tests for goodness of fit in ordinal logistic regression models (2016) Journal of Statistical Computing and Simulation.

I haven't checked properly yet, but as far as I know, they haven't been implemented in R. If that turns out to be the case, I plan to add them to the generalhoslem package.

EDIT: it's probably also worth pointing out Fagerland and Hosmer's recommendation that

because the tests may detect different types of lack of fit, a thorough assessment of goodness of fit requires use of all three approaches.

ANOTHER EDIT: I didn't realise this straight away but the only difference between the multinomial version of the Hosmer-Lemeshow test and the ordinal version is in the degrees of freedom. Strictly speaking, the test for ordinal response models requires sorting the observations by a weighted ordinal 'response score' but they show in the 2012 article above that that is equivalent to binning the observations in the same way as in the multinomial case.

I haven't properly tested it but the the logitgof function on my github page should now do the trick: https://github.com/matthewjay15/generalhoslem-v1.1.0/blob/1.2.0.9000/logitgof.R Note that if you examine the observed and expected tables produced by this function, then the columns may not be in the right order (they will be alphabetical). This shouldn't make any difference to the test statistic, though, as they will correspond to each other.

An example using the ordinal and MASS packages:

library(reshape) # needed by logitgof
logitgof <- function (obs, exp, g = 10, ord = FALSE) {
  DNAME <- paste(deparse(substitute(obs)), deparse(substitute(exp)), sep = ", ")
  yhat <- exp
  if (is.null(ncol(yhat))) {
    mult <- FALSE
  } else {
    if (ncol(yhat) == 1) {
      mult <- FALSE
    } else mult <- TRUE
  }
  n <- ncol(yhat)
  if (mult) {
    if (!ord) {
      METHOD <- "Hosmer and Lemeshow test (multinomial model)"
    } else {
      METHOD <- "Hosmer and Lemeshow test (ordinal model)"
    }
    qq <- unique(quantile(1 - yhat[, 1], probs = seq(0, 1, 1/g)))
    cutyhats <- cut(1 - yhat[, 1], breaks = qq, include.lowest = TRUE)
    dfobs <- data.frame(obs, cutyhats)
    dfobsmelt <- melt(dfobs, id.vars = 2)
    observed <- cast(dfobsmelt, cutyhats ~ value, length)
    observed <- observed[order(c(1, names(observed[, 2:ncol(observed)])))]
    dfexp <- data.frame(yhat, cutyhats)
    dfexpmelt <- melt(dfexp, id.vars = ncol(dfexp))
    expected <- cast(dfexpmelt, cutyhats ~ variable, sum)
    expected <- expected[order(c(1, names(expected[, 2:ncol(expected)])))]
    stddiffs <- abs(observed[, 2:ncol(observed)] - expected[, 2:ncol(expected)]) / sqrt(expected[, 2:ncol(expected)])
    if (ncol(expected) != ncol(observed)) stop("Observed and expected tables have different number of columns. Check you entered the correct data.")
    chisq <- sum((observed[, 2:ncol(observed)] - expected[, 2:ncol(expected)])^2 / expected[, 2:ncol(expected)])
    if (!ord) {
      PARAMETER <- (nrow(expected) - 2) * (ncol(yhat) - 1) 
    } else {
      PARAMETER <- (nrow(expected) - 2) * (ncol(yhat) - 1) + ncol(yhat) - 2
    }
  } else {
    METHOD <- "Hosmer and Lemeshow test (binary model)"
    if (is.factor(obs)) {
      y <- as.numeric(obs) - 1
    } else {
      y <- obs
    }
    qq <- unique(quantile(yhat, probs = seq(0, 1, 1/g)))
    cutyhat <- cut(yhat, breaks = qq, include.lowest = TRUE)
    observed <- xtabs(cbind(y0 = 1 - y, y1 = y) ~ cutyhat)
    expected <- xtabs(cbind(yhat0 = 1 - yhat, yhat1 = yhat) ~ cutyhat)
    stddiffs <- abs(observed - expected) / sqrt(expected)
    chisq <- sum((observed - expected)^2/expected)
    PARAMETER <- nrow(expected) - 2
  }
  if (g != nrow(expected))
    warning(paste("Not possible to compute", g, "rows. There might be too few observations."))
  if (any(expected[, 2:ncol(expected)] < 1))
    warning("At least one cell in the expected frequencies table is < 1. Chi-square approximation may be incorrect.")
  PVAL <- 1 - pchisq(chisq, PARAMETER)
  names(chisq) <- "X-squared"
  names(PARAMETER) <- "df"
  structure(list(statistic = chisq, parameter = PARAMETER, 
                 p.value = PVAL, method = METHOD, data.name = DNAME, observed = observed, 
                 expected = expected, stddiffs = stddiffs), class = "htest")
}

library(foreign) # just to download the example dataset

# with the ordinal package
library(ordinal)
ml <- read.dta("http://www.ats.ucla.edu/stat/data/hsbdemo.dta")
mod1 <- clm(ses ~ female + write + read, data = ml)

# extract predicted probs for each level of outcome
predprob <- data.frame(id = ml$id, female = ml$female, read = ml$read, write = ml$write)
fv <- predict(mod1, newdata = predprob, type = "prob")$fit
logitgof(ml$ses, fv, ord = TRUE) # set ord to TRUE to run ordinal instead of multinomial test

# with MASS
library(MASS)
mod2 <- polr(ses ~ female + write + read, data = ml)
logitgof(ml$ses, fitted(mod2), ord = TRUE)
Related Question