Analogous measure of AIC which uses the posterior distribution for model selection

aic, bayesian, markov-chain-montecarlo, model-selection

Suppose the following problem: I have $n$ models $M_k$, each with parameters $\theta_k$, for a data set $D$. There were previous observations of a subset of the parameters that is common to every model $M_k$ (i.e., I have well-defined priors for a subset of each $\theta_k$), so I ran an MCMC algorithm to obtain the posterior distribution of each model using that prior information. That is, I have $p(\theta_k|D,M_k)$ and have to decide which of those models is the 'correct' one.

I was thinking about how to define what I mean by 'the correct' one, and came up with the idea that I have to decide which of the posterior distributions is closest to the 'real' posterior distribution that generated the data (which may or may not be in my set of posterior distributions). I considered using Bayes factors, but I keep thinking that I need something like the AIC that, instead of using the likelihood and the corresponding maximum likelihood estimates, uses the posterior distributions and the corresponding maximum a posteriori estimates. My idea is to obtain an unbiased (or nearly unbiased) estimator of the KL divergence between the real posterior and my posteriors (understanding that the AIC is an estimator of the KL divergence between the 'real' likelihood and the likelihood of my models).

Is there something like this in the statistical literature? Or am I just crazy to think about the problem this way?

Best Answer

None of the information criteria relevant here (AIC, DIC, WAIC, LOOIC) is unbiased, but under some conditions they are consistent estimators of the out-of-sample deviance. All of them use the likelihood in some fashion, but the WAIC and the LOOIC differ from the AIC and the DIC in that the former two average the likelihood for each observation over (draws from) the posterior distribution, whereas the latter two plug in point estimates. In this sense, the WAIC and LOOIC are preferable because they do not assume that the posterior distribution is multivariate normal. The LOOIC is somewhat preferable to the WAIC because it can be made more robust to outliers and has a diagnostic that can be evaluated to check whether its assumptions are met.
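To make the contrast between plugging in a point estimate and averaging over posterior draws concrete, here is a minimal sketch of the WAIC computed from MCMC output next to the AIC. Everything in it is an assumption for illustration: the function name `waic`, the helper `normal_logpdf`, and the toy model (normal data with known unit variance and a flat prior on the mean, so the posterior is available in closed form and stands in for real MCMC draws). It is not the implementation of any particular package, and it does not cover the LOOIC, which needs importance sampling on top of the same log-likelihood matrix.

```python
import numpy as np


def waic(log_lik):
    """WAIC from an (S, N) matrix of pointwise log-likelihoods.

    log_lik[s, i] = log p(y_i | theta_s) for posterior draw s and observation i.
    """
    log_lik = np.asarray(log_lik, dtype=float)
    # Log pointwise predictive density: for each observation, the log of the
    # likelihood averaged over posterior draws (max-shifted for stability).
    shift = log_lik.max(axis=0)
    lppd_i = shift + np.log(np.mean(np.exp(log_lik - shift), axis=0))
    # Effective number of parameters: posterior variance of the
    # log-likelihood, computed per observation and then summed.
    p_waic_i = np.var(log_lik, axis=0, ddof=1)
    elpd = np.sum(lppd_i - p_waic_i)
    return {
        "lppd": lppd_i.sum(),
        "p_waic": p_waic_i.sum(),
        "elpd_waic": elpd,
        "waic": -2.0 * elpd,  # deviance scale, comparable to AIC
    }


def normal_logpdf(y, mu):
    """Log density of Normal(mu, 1) evaluated at y (hypothetical helper)."""
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (y - mu) ** 2


# Toy data (assumed): y_i ~ Normal(mu, 1) with a flat prior on mu, so the
# posterior is Normal(ybar, 1/n). These draws stand in for MCMC output.
rng = np.random.default_rng(0)
y = rng.normal(0.5, 1.0, size=50)
mu_draws = rng.normal(y.mean(), 1.0 / np.sqrt(y.size), size=4000)

# Rows are posterior draws, columns are observations: shape (4000, 50).
log_lik = normal_logpdf(y[None, :], mu_draws[:, None])
result = waic(log_lik)

# AIC plugs in the point estimate (here the MLE, ybar) and adds 2k with k = 1.
aic = -2.0 * normal_logpdf(y, y.mean()).sum() + 2 * 1

print("WAIC:", result["waic"], "p_waic:", result["p_waic"], "AIC:", aic)
```

In this toy case the posterior really is normal, so the two criteria should land close to each other and `p_waic` should be near 1, the actual number of parameters. The averaging approach matters more when the posterior is skewed or multimodal, where a single plugged-in point summarizes it poorly.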

Overview article

More detail about the practicalities

R package
