Solved – Can a -2 Log likelihood be calculated with only one model

deviance, generalized linear model, likelihood-ratio, MATLAB, p-value

I am using the glmfit function in MATLAB. The function returns the deviance but not the log likelihood. I understand that the deviance is essentially twice the difference between the log likelihoods of two models, but I am only using glmfit to fit one model (a call like the one sketched below this list), and somehow I am still getting a deviance.

  • Doesn't calculating the -2 log likelihood require two models?
  • How can the deviance be interpreted when there is only one model?
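
For concreteness, here is a sketch of the kind of call I mean (X and y stand in for my actual design matrix and response):

```matlab
% Sketch of the kind of call I mean; X and y are placeholders for my
% actual design matrix and response. glmfit returns coefficient
% estimates and a deviance, but no log likelihood.
[b, dev] = glmfit(X, y, 'binomial', 'link', 'logit');
```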

A second question: say I did have two models and compared them with a likelihood-ratio test, where the null hypothesis corresponds to the first (restricted) model and the alternative to the second. After computing the likelihood-ratio test statistic, would I check it against the chi-squared CDF to determine the p-value? And am I right that if the p-value is less than the alpha level I would reject the null, and if it is greater I would fail to reject the null?

Best Answer

The statistical term deviance is thrown around a bit too much. Most of the time, programs return the deviance $$ D(y) = -2 \log{\{p(y | \hat{\theta})\}},$$ where $\hat{\theta}$ is your estimated parameter(s) from model fitting and $y$ is some potentially observed/observable occurrence of the random quantity in question.

The more common deviance that you refer to treats the quantity above as a function of two variables, both the data and the fitted parameters: $$ D(y,\hat{\theta}) = -2\log{\{p(y|\hat{\theta})\}}$$ and so if you had one $y$ value but two competing, fitted parameter values, $\hat{\theta}_{1}$ and $\hat{\theta}_{2}$, then you'd get the deviance you mentioned from $$-2(\log{\{p(y|\hat{\theta}_{1})\}} - \log{\{p(y|\hat{\theta}_{2})\}}). $$ You can read about the MATLAB function that you mentioned, glmfit(), linked here. A more fruitful, though shorter, discussion of the deviance is linked here.
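
To make that two-parameter comparison concrete, here is a toy MATLAB sketch evaluating the difference for a Poisson likelihood at two made-up candidate rate values (data and rates are purely illustrative):

```matlab
% Toy illustration (made-up data and candidate rates): the deviance
% difference between two candidate parameter values for a Poisson model.
y = [2 4 3 5 1];                        % observed counts
loglik = @(lambda) sum(log(poisspdf(y, lambda)));
theta1 = 2.5;                           % first candidate rate
theta2 = 3.0;                           % second candidate rate
devDiff = -2 * (loglik(theta1) - loglik(theta2))
```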

The deviance statistic implicitly assumes two models: the first is your fitted model, returned by glmfit(); call its parameter vector $\hat{\theta}_{1}$. The second is the "full model" (also called the "saturated model"), which has a free parameter for every data point; call its parameter vector $\hat{\theta}_{s}$. Having so many free parameters is obviously a stupid thing to do, but it does allow you to fit the data exactly.

So the deviance statistic is computed as the difference between the log likelihoods of the fitted model and the saturated model. Let $Y=\{y_{1}, y_{2}, \ldots, y_{N}\}$ be the collection of the $N$ data points. Then:

$$DEV(\hat{\theta}_{1},Y) = -2\biggl[\log{p(Y|\hat{\theta}_{1})} - \log{p(Y|\hat{\theta}_{s})} \biggr]. $$ By the independence assumption, the terms above expand into summations over the individual data points $y_{i}$. If you want to use this computation to recover the log likelihood of the fitted model, you first need the log likelihood of the saturated model. Here is a link that explains some ideas for computing this... but the catch is that in any case you will need to write down a function that computes the log likelihood for your type of data, and at that point it is probably easier to compute the model's log likelihood directly than to back it out of a deviance calculation.
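
That said, as a sketch of what backing it out would look like for, say, a Poisson GLM (assuming a vector of counts y and the deviance dev returned by glmfit; variable names are placeholders):

```matlab
% Sketch: recover the fitted log likelihood from glmfit's deviance for
% a Poisson GLM. Assumes counts y and the deviance dev returned by,
% e.g., [b, dev] = glmfit(X, y, 'poisson').
% Saturated log likelihood: sum_i [ y_i*log(y_i) - y_i - log(y_i!) ].
yl = y .* log(y);
yl(y == 0) = 0;                           % convention: 0*log(0) = 0
loglikSat = sum(yl - y - gammaln(y + 1)); % log(y!) = gammaln(y+1)
loglikFit = loglikSat - dev/2;            % since dev = -2*(llFit - llSat)
```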

See Chapter 6 of Bayesian Data Analysis for some good discussion of deviance.

As for your second point about the likelihood-ratio test statistic, yes, it sounds like you basically know the right thing to do: compare the statistic against a chi-squared distribution whose degrees of freedom equal the difference in the number of estimated parameters, and reject the null when the resulting p-value falls below your alpha level. But in many cases, the null hypothesis is something that expert, external knowledge lets you specify ahead of time (like some coefficient being equal to zero). It's not necessarily something that comes as the result of doing model fitting.
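
For completeness, a MATLAB sketch of that test (devNull, devAlt, and df are placeholder names for the two deviances and the difference in parameter counts):

```matlab
% Likelihood-ratio test from two nested glmfit fits. devNull and devAlt
% are the deviances of the restricted and fuller model (placeholders),
% and df is the difference in the number of estimated parameters.
lrStat = devNull - devAlt;           % equals -2*(llNull - llAlt)
pValue = 1 - chi2cdf(lrStat, df);    % upper-tail chi-squared probability
if pValue < 0.05                     % example alpha level
    disp('Reject the null (restricted) model.');
else
    disp('Fail to reject the null model.');
end
```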