I'm familiar with (some) approaches to evaluating the fit (or accuracy) of a binary (logistic) model (e.g. AUC). Are there methods/approaches that are particularly well-suited for a binomial (success vs. failure) model?
If the suggestion is to use a variant of a (pseudo-) R-square, what are recommended approaches? I can think of a few, using logit vs. response-scale predictions and with and without weighting by # of subjects, but am unsure of their appropriateness:
[s = # of successes, f = # of failures]

- `summary(lm(predict(model) ~ I(s/(s+f))))$r.squared`
- `summary(lm(predict(model, type='response') ~ I(s/(s+f))))$r.squared`
- `summary(lm(predict(model) ~ I(s/(s+f)), weights=s+f))$r.squared`
- `summary(lm(predict(model, type='response') ~ I(s/(s+f)), weights=s+f))$r.squared`
- `1 - var(residuals(model))/(var(s/(s+f)))`
Best Answer
In regression, a binomial response is basically a compact way of representing multiple (independent) binary observations that have the same values of the predictors. From that, you can decompose a single observation with the proportion $S/(S + F)$ into $S + F$ observations: $S$ successes and $F$ failures. Note that you do need to know both the numerator and denominator of the proportion; you can't get by with just the proportion itself.
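The decomposition can be checked directly in R. The following is a minimal sketch on made-up data (the data frame `d`, its columns, and the object names `fit_grouped`, `d_long`, and `fit_binary` are all hypothetical, not from the question): it expands each grouped row into $S$ ones and $F$ zeros and confirms that the binary fit recovers the same coefficients as the grouped binomial fit.

```r
# Hypothetical grouped data: one row per covariate pattern
d <- data.frame(
  x = c(0, 1, 2, 3, 4),
  s = c(1, 2, 4, 6, 9),  # number of successes
  f = c(9, 8, 6, 4, 1)   # number of failures
)

# Grouped binomial fit: response given as cbind(successes, failures)
fit_grouped <- glm(cbind(s, f) ~ x, family = binomial, data = d)

# Expand each row into s ones followed by f zeros
d_long <- data.frame(
  x = rep(d$x, times = d$s + d$f),
  y = unlist(mapply(function(s, f) c(rep(1, s), rep(0, f)),
                    d$s, d$f, SIMPLIFY = FALSE))
)

# Binary (Bernoulli) fit on the expanded data
fit_binary <- glm(y ~ x, family = binomial, data = d_long)

# The two fits should agree on the coefficient estimates
same_coef <- isTRUE(all.equal(coef(fit_grouped), coef(fit_binary),
                              tolerance = 1e-4))
print(same_coef)
```

Once the data are in the expanded form, any tool meant for binary outcomes can be applied to `d_long$y` and the corresponding predicted probabilities. Note that the expansion uses `s` and `f` separately, which is why the proportion alone would not be enough.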
To take your example of $S = 1$ and $F = 3$, and a predicted probability of $0.3$: you would treat this as 1 case with a binary response value of $1$, and 3 cases with a binary response of $0$. So yes, you are comparing the two vectors $Y_\text{obs} = \{1,0,0,0\}$ and $\hat{Y} = \{0.3, 0.3, 0.3, 0.3\}$.
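As a small illustration of comparing those two vectors, here is a sketch in R. The Brier score and the Bernoulli log-likelihood are used purely as example measures (they are my choice for illustration, not something prescribed above), and the names `y_obs`, `y_hat`, `brier`, and `loglik` are hypothetical.

```r
# One binomial observation with S = 1, F = 3, expanded to binary cases
y_obs <- c(1, 0, 0, 0)

# The same predicted probability applies to every expanded case
y_hat <- rep(0.3, length(y_obs))

# Brier score: mean squared difference between outcome and prediction
brier <- mean((y_obs - y_hat)^2)
print(brier)  # 0.19

# Bernoulli log-likelihood of the observed outcomes under the predictions
loglik <- sum(dbinom(y_obs, size = 1, prob = y_hat, log = TRUE))
print(loglik)
```

With several binomial observations, the expanded vectors are simply concatenated, so observations with larger $S + F$ contribute more cases and are weighted accordingly.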