Solved – Interpreting score function and information matrix in logistic regression

fisher information, likelihood, logistic, machine learning

When performing classification using logistic regression, can the score function (the gradient of the log-likelihood with respect to the weights) and/or the Fisher information matrix (the negative expected Hessian of the log-likelihood with respect to the weights) be used to compare the quality of fit of two sets of weights, without calculating the log-likelihood itself? I know that if the score is zero at the first set of weights, then (since the log-likelihood is concave) those weights are optimal and their log-likelihood is greater than or equal to that of the second set. Can a more general statement be made for suboptimal weights?
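For concreteness, here is a minimal sketch (my own, in Python/NumPy; the function name `score_and_information` is just illustrative) of the two quantities mentioned above, evaluated at an arbitrary weight vector $w$. For logistic regression with labels $y_i \in \{0,1\}$ and probabilities $p_i = \sigma(x_i^\top w)$, the score is $X^\top(y - p)$ and the (observed) information is $X^\top \operatorname{diag}(p_i(1-p_i)) X$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score_and_information(X, y, w):
    """Score vector and information matrix for logistic regression.

    X : (n, d) design matrix
    y : (n,) labels in {0, 1}
    w : (d,) weight vector at which both quantities are evaluated
    """
    p = sigmoid(X @ w)                     # predicted probabilities
    score = X.T @ (y - p)                  # gradient of the log-likelihood
    weights = p * (1.0 - p)                # variance terms p_i (1 - p_i)
    information = X.T @ (X * weights[:, None])  # X^T diag(p(1-p)) X
    return score, information
```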

Best Answer

Yes: the score (Rao) test is asymptotically equivalent to the likelihood ratio test, so like the LRT it compares nested models. Its statistic is evaluated using only the score and the information matrix computed under the null hypothesis, so the larger model never has to be fitted. That is an important practical advantage over the LRT and Wald tests, as sketched below.
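To make that concrete, here is a rough, self-contained sketch (mine, not part of the original answer) of a Rao score test for logistic regression. It assumes the null design matrix consists of the leading columns of the full design matrix, fits only the null model by Newton-Raphson, and then evaluates the full model's score and information at that restricted estimate.

```python
import numpy as np
from scipy.stats import chi2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score_and_information(X, y, w):
    # Score X^T (y - p) and information X^T diag(p(1-p)) X
    p = sigmoid(X @ w)
    return X.T @ (y - p), X.T @ (X * (p * (1.0 - p))[:, None])

def score_test(X_null, X_full, y, n_iter=50):
    """Rao score test of the full model against the nested null model.

    Assumes the columns of X_null are the leading columns of X_full,
    so the null estimate can be embedded by padding with zeros.
    """
    # Fit the null (restricted) model by Newton-Raphson
    w0 = np.zeros(X_null.shape[1])
    for _ in range(n_iter):
        u, i = score_and_information(X_null, y, w0)
        w0 = w0 + np.linalg.solve(i, u)
    # Embed the null estimate into the full parameter space
    w_full = np.zeros(X_full.shape[1])
    w_full[: X_null.shape[1]] = w0
    # Evaluate the full-model score and information at the null fit
    u, i = score_and_information(X_full, y, w_full)
    stat = u @ np.linalg.solve(i, u)       # U^T I^{-1} U
    df = X_full.shape[1] - X_null.shape[1]  # number of constraints tested
    return stat, chi2.sf(stat, df)
```

Under the null, the statistic is asymptotically chi-squared with degrees of freedom equal to the number of constrained parameters; note that nothing about the full model's fit is needed beyond evaluating its score and information at the restricted estimate.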