Bayesian p-Values – Understanding and Application


I'm looking for an answer that would satisfy a reader who understands frequentist p-values but only understands the rudiments of Bayesian approaches to statistics.

At present, Google searches do not turn up a definition, either on a Wikipedia page or in any other commonly accepted resource.

This question seems related, but it isn't really, since it transpired that the user was not actually calculating Bayesian p-values. However, the accepted answer links to this Gelman paper as an explanation of what Bayesian p-values are.

Best Answer

If I understand it correctly, a Bayesian p-value compares some metric calculated from your observed data with the same metric calculated from simulated data (generated with parameters drawn from the posterior distribution).

In Gelman's words: "From a Bayesian context, a posterior p-value is the probability, given the data, that a future observation is more extreme (as measured by some test variable) than the data."
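In symbols, this definition is commonly written as below. The notation is an assumption added here for illustration, not taken from the question: $T$ is the test quantity, $y$ the observed data, $y^{\mathrm{rep}}$ a replicated dataset, and $\theta$ the model parameters.

$$
p_B = \Pr\big(T(y^{\mathrm{rep}}, \theta) \ge T(y, \theta) \mid y\big)
    = \iint \mathbb{1}\big[T(y^{\mathrm{rep}}, \theta) \ge T(y, \theta)\big]\, p(y^{\mathrm{rep}} \mid \theta)\, p(\theta \mid y)\, dy^{\mathrm{rep}}\, d\theta
$$

In practice the integral is approximated by simulation: draw $\theta$ from the posterior, draw $y^{\mathrm{rep}}$ given that $\theta$, and record how often the inequality holds.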

For example, the number of zeros generated from a Poisson-based model could be such a metric or test statistic, and you could calculate what proportion of your simulated datasets have a larger fraction of zeros than you actually observe in your real data. The closer this value is to 0.5, the more evenly the values calculated from your simulated data are distributed around the real observation. Values near 0 or 1 indicate that the model rarely reproduces this feature of the data.
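The zero-fraction check described above can be sketched roughly as follows. This is a minimal illustration, not the method from the Gelman paper: it assumes a plain Poisson model with a conjugate Gamma prior (so the posterior can be sampled directly without MCMC), and the data, the prior settings, and the function names `sample_poisson`, `zero_fraction`, and `bayesian_p_value` are all hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def zero_fraction(data):
    """Test statistic: the fraction of observations equal to zero."""
    return sum(1 for y in data if y == 0) / len(data)


def bayesian_p_value(observed, prior_shape=1.0, prior_rate=1.0, n_sims=2000, seed=0):
    """Share of replicated datasets whose zero fraction is at least the observed one."""
    rng = random.Random(seed)
    n = len(observed)
    # Assumed conjugate update: Gamma(shape, rate) prior + Poisson likelihood
    # gives a Gamma(shape + sum(y), rate + n) posterior for the Poisson mean.
    post_shape = prior_shape + sum(observed)
    post_rate = prior_rate + n
    t_obs = zero_fraction(observed)
    exceed = 0
    for _ in range(n_sims):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        lam = rng.gammavariate(post_shape, 1.0 / post_rate)
        replicated = [sample_poisson(lam, rng) for _ in range(n)]
        # ">=" follows the usual formal definition; use ">" for "strictly larger".
        if zero_fraction(replicated) >= t_obs:
            exceed += 1
    return exceed / n_sims


# Hypothetical counts with far more zeros than a single Poisson rate would produce.
observed = [0, 0, 0, 0, 0, 0, 1, 0, 5, 7, 0, 6, 0, 0, 4, 0, 0, 8, 0, 3]
p = bayesian_p_value(observed)
print(f"observed zero fraction: {zero_fraction(observed):.2f}, Bayesian p-value: {p:.4f}")
```

With data like these, which contain many more zeros than a Poisson model with the fitted mean would generate, the p-value should come out very close to 0, flagging that the model fails to reproduce the excess zeros. For a model that fits, the value would typically sit away from both extremes.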

Related Solutions

Bayesian – What Are the Cons of Bayesian Analysis?

I'm going to give you an answer. Four drawbacks actually. Note that none of these are actually objections that should drive one all the way to frequentist analysis, but there are cons to going with a Bayesian framework:

1. Choice of prior. This is the usual carping for a reason, though in my case it's not the usual "priors are subjective!" but that coming up with a prior that's well reasoned and actually represents your best attempt at summarizing a prior is a great deal of work in many cases. An entire aim of my dissertation, for example, can be summed up as "estimate priors".
2. It's computationally intensive. Especially for models involving many variables. For a large dataset with many variables being estimated, it may very well be prohibitively computationally intensive, especially in certain circumstances where the data cannot readily be thrown onto a cluster or the like. Some of the ways to resolve this, like augmented data rather than MCMC, are somewhat theoretically challenging, at least to me.
3. Posterior distributions are somewhat more difficult to incorporate into a meta-analysis, unless a frequentist, parametric description of the distribution has been provided.
4. Depending on what journal the analysis is intended for, either the use of Bayes generally, or your choice of priors, gives your paper slightly more points where a reviewer can dig into it. Some of these are reasonable reviewer objections, but some just stem from the nature of Bayes and how familiar people in some fields are with it.

None of these things should stop you. Indeed, none of these things have stopped me, and hopefully doing Bayesian analysis will help address at least number 4.

Bayesian GLM – Understanding $p$-Values in Bayesian GLM

Great question! Although there are Bayesian p-values, and one of the authors of the arm package is an advocate, what you are seeing in your output is not a Bayesian p-value. Check the class of model:

class(model)
"bayesglm" "glm"      "lm"

and you can see that class bayesglm inherits from glm. Furthermore, examination of the arm package shows no specific summary method for a bayesglm object. So when you do

summary(model)

you are actually doing

summary.glm(model)

and getting a frequentist interpretation of the results. If you want a more Bayesian perspective, the function in arm is display().
