Bayesian – What Is pm.Potential in PyMC3?

bayesian pymc

I'm going through the Price Is Right example in chapter 5 of Probabilistic Programming & Bayesian Methods for Hackers.

It reads:

Example: Optimizing for the Showcase on The Price is Right

Bless you if you are ever chosen as a contestant on the Price is
Right, for here we will show you how to optimize your final price on
the Showcase. For those who forget the rules:

Two contestants compete in The Showcase.

Each contestant is shown a unique suite of prizes.

After the viewing, the contestants are asked to bid on the price for their unique suite of prizes.

If a bid price is over the actual price, the bid's owner is disqualified from winning.

If a bid price is under the true price by less than \$250, the winner is awarded both prizes.

The difficulty in the game is balancing your uncertainty in the
prices, keeping your bid low enough so as to not bid over, and trying
to bid close to the price.

Suppose we have recorded the Showcases from previous The Price is
Right episodes and have prior beliefs about what distribution the true
price follows. For simplicity, suppose it follows a Normal:

$$\text{True Price} \sim \text{Normal}(\mu_p, \sigma_p )$$

In a later chapter, we will actually use real Price is Right Showcase
data to form the historical prior, but this requires some advanced
PyMC3 use so we will not use it here. For now, we will assume
$\mu_p = 35 000$ and $\sigma_p = 7500$.

We need a model of how we should be playing the Showcase. For each
prize in the prize suite, we have an idea of what it might cost, but
this guess could differ significantly from the true price. (Couple
this with increased pressure being onstage and you can see why some
bids are so wildly off). Let's suppose your beliefs about the prices
of prizes also follow Normal distributions:

$$ \text{Prize}_i \sim \text{Normal}(\mu_i, \sigma_i ),\;\; i=1,2$$

This is really why Bayesian analysis is great: we can specify what we
think a fair price is through the $\mu_i$ parameter, and express
uncertainty of our guess in the $\sigma_i$ parameter.

We'll assume two prizes per suite for brevity, but this can be
extended to any number. The true price of the prize suite is then
given by $\text{Prize}_1 + \text{Prize}_2 + \epsilon$, where
$\epsilon$ is some error term.

We are interested in the updated $\text{True Price}$ given we have
observed both prizes and have belief distributions about them. We can
perform this using PyMC3.

Let's make some values concrete. Suppose there are two prizes in the
observed prize suite:

A trip to wonderful Toronto, Canada!

A lovely new snowblower!

We have some guesses about the true prices of these objects, but we
are also pretty uncertain about them. I can express this uncertainty
through the parameters of the Normals:

$$\begin{align}
\text{snowblower} &\sim \text{Normal}(3 000, 500 )\\
\text{Toronto} &\sim \text{Normal}(12 000, 3000 )
\end{align}$$

For example, I believe that the true price of the trip to Toronto is
12 000 dollars, and that there is a 68.2% chance the price falls within
1 standard deviation of this, i.e. my confidence is that there is
a 68.2% chance the trip is in [9 000, 15 000].
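As a quick sanity check on that one-standard-deviation interval, here is a minimal sketch using only the Python standard library. The helper name `prob_within` is made up for illustration; the numbers are the Toronto beliefs quoted above.

```python
import math

def prob_within(k_sigma):
    """P(|X - mu| < k_sigma * sigma) for any Normal(mu, sigma)."""
    return math.erf(k_sigma / math.sqrt(2))

# Belief about the Toronto trip, taken from the text above.
mu, sigma = 12_000, 3_000
lo, hi = mu - sigma, mu + sigma

p = prob_within(1)
print(f"P({lo} < price < {hi}) = {p:.4f}")
```

The probability depends only on the number of standard deviations, not on the particular mean or standard deviation, which is why the same figure applies to the snowblower.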

The code that was provided is the following:

import pymc3 as pm

# Beliefs about the two prizes: means and standard deviations
data_mu = [3e3, 12e3]
data_std = [5e2, 3e3]

# Historical prior on the true price of a showcase
mu_prior = 35e3
std_prior = 75e2

with pm.Model() as model:
    true_price = pm.Normal("true_price", mu=mu_prior, sd=std_prior)

    prize_1 = pm.Normal("first_prize", mu=data_mu[0], sd=data_std[0])
    prize_2 = pm.Normal("second_prize", mu=data_mu[1], sd=data_std[1])
    price_estimate = prize_1 + prize_2

    # Add log Normal(true_price | price_estimate, 3e3) to the model's log-probability
    logp = pm.Normal.dist(mu=price_estimate, sd=(3e3)).logp(true_price)
    error = pm.Potential("error", logp)

    trace = pm.sample(50000, step=pm.Metropolis())
    burned_trace = trace[10000:]  # drop the first 10000 draws as burn-in

price_trace = burned_trace["true_price"]

I don't understand:

How does the true_price fit in with price_estimate?
Where did sd=(3e3) come from?
What is a pm.Potential object?

Any help would greatly be appreciated. Thanks!

Best Answer

We use pm.Potential here primarily to get around the definition of a likelihood. We ordinarily use it to constrain our likelihood in the manner described in the PyMC docs, but in this example we never end up defining a true likelihood (which would require the inclusion of observations). As such, all the samples that we draw are based on how we defined the potential.

Our price_estimate and true_price are related to each other in the potential by essentially making our true_price the observed values. When we say:

logp = pm.Normal.dist(mu=price_estimate, sd=(3e3)).logp(true_price)

We are evaluating a normal distribution with mean price_estimate and standard deviation 3e3 at the values provided by true_price (our mock observations). This simulates a likelihood that we can then sample from to get our posteriors. As for the validity of 3e3 as the standard deviation, I think it is reasonable, given that it is the larger of the standard deviations that we used to define the components of our price_estimate here:

data_std = [5e2, 3e3]

I kept "error" as the name of the variable because that's how Cam named the function when he used the pm.potential decorator in the PyMC version of this chapter.
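Because every piece of this model is Normal, the effect of the potential can be checked by hand without running a sampler. The sketch below is my own cross-check, not something from the book: it integrates out the two prizes (variances of independent Normals add), then combines the result with the prior on true_price by precision weighting. The helper `normal_logp` is a hypothetical stand-in for what `pm.Normal.dist(...).logp(...)` evaluates at a single point.

```python
import math

mu_prior, std_prior = 35e3, 75e2   # prior on true_price
data_mu = [3e3, 12e3]              # prize belief means
data_std = [5e2, 3e3]              # prize belief standard deviations
potential_sd = 3e3                 # sd used inside pm.Potential

def normal_logp(x, mu, sd):
    """Log-density of Normal(mu, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd**2) - (x - mu) ** 2 / (2 * sd**2)

# The potential's contribution for one candidate true_price,
# with both prizes fixed at their mean guesses.
print(normal_logp(20e3, sum(data_mu), potential_sd))

# Integrating out both prizes, true_price "sees" Normal(est_mu, est_sd).
est_mu = sum(data_mu)
est_sd = math.sqrt(sum(s**2 for s in data_std) + potential_sd**2)

# Product of two Normal densities in true_price: precision-weighted average.
prior_prec = 1 / std_prior**2
est_prec = 1 / est_sd**2
post_mu = (mu_prior * prior_prec + est_mu * est_prec) / (prior_prec + est_prec)
post_sd = math.sqrt(1 / (prior_prec + est_prec))
print(f"posterior true_price: mean ~ {post_mu:.0f}, sd ~ {post_sd:.0f}")
```

If the Metropolis trace has converged, the mean and spread of price_trace should land close to these closed-form values, which makes this a cheap way to confirm that the potential really is acting as a likelihood term tying true_price to price_estimate.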

Please let me know if this is unclear!

Related Solutions

Solved – Bayesian model selection in PyMC3

You can indeed compute the log-probability of a model using model.logp(). As input, it requires a point. For example, for the BEST model from the examples directory I can do:

np.exp(model.logp({'group1_mean': 0.1, 
                   'group2_mean': 0.2, 
                   'group1_std_interval': 1., 
                   'group2_std_interval': 1.2, 
                   'nu_minus_one_log': 1}))

Note that this model is using transformed variables, so I have to supply these. You could then take the exp() of that and use it inside a numerical integrator, for example as provided by scipy.integrate. The problem is that even with only 5 parameters, this will be very slow.
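To make the "exp() inside a numerical integrator" idea concrete, here is a minimal sketch on a toy one-parameter model rather than the BEST model. Everything here is an assumption for illustration: `joint_logp` plays the role of model.logp, the prior is Normal(0, 1), there is a single observation y = 0.5 with unit noise, and a hand-rolled trapezoid rule stands in for scipy.integrate.

```python
import math

def normal_logpdf(x, mu, sd):
    """Log-density of Normal(mu, sd) at x."""
    return -0.5 * math.log(2 * math.pi * sd**2) - (x - mu) ** 2 / (2 * sd**2)

def joint_logp(theta, y=0.5):
    """Toy stand-in for model.logp: log prior plus log likelihood."""
    return normal_logpdf(theta, 0.0, 1.0) + normal_logpdf(y, theta, 1.0)

def marginal_likelihood(logp, lo=-10.0, hi=10.0, n=2001):
    """Trapezoid rule applied to exp(logp(theta)) on a uniform grid."""
    h = (hi - lo) / (n - 1)
    vals = [math.exp(logp(lo + i * h)) for i in range(n)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

evidence = marginal_likelihood(joint_logp)

# For this conjugate toy model the answer is known: y ~ Normal(0, sqrt(2)).
exact = math.exp(normal_logpdf(0.5, 0.0, math.sqrt(2.0)))
print(evidence, exact)
```

One parameter needs 2001 evaluations here; the same grid over five parameters would need 2001**5, which is why this approach becomes impractical so quickly.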

Bayes Factors are generally very difficult to compute because you have to integrate over the complete parameter space. There are some ideas for using MCMC samples for that. See this post, and especially the comment section, for more information: https://radfordneal.wordpress.com/2008/08/17/the-harmonic-mean-of-the-likelihood-worst-monte-carlo-method-ever/ The case for BIC is unfortunately similar.

If you really want to compute the Bayes Factor, you can also look at the Savage Dickey Ratio test (see e.g. http://drsmorey.org/bibtex/upload/Wagenmakers:etal:2010.pdf), but its application is limited.

I suppose that you're trying to do model comparison, which is a field with many opinions and solutions (some hard to implement, like BFs). One measure that is very easy to compute is the Deviance Information Criterion. It has its downsides, although some of them can be remedied (see http://onlinelibrary.wiley.com/doi/10.1111/rssb.12062/abstract). Unfortunately we haven't ported the code to pymc3 yet, but it'd be pretty easy (see here for the pymc2 implementation: https://github.com/pymc-devs/pymc/blob/895c24f62b9f5d786bce7ac4fe88edb4ad220364/pymc/MCMC.py#L410).
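For reference, the usual DIC definition is short enough to write out by hand. This is a sketch of the standard formula, not the pymc2 code linked above: the function name `dic` and its two inputs (the log-likelihood at each posterior draw, and the log-likelihood at the posterior mean of the parameters) are assumptions about what you would already have extracted from a trace.

```python
def dic(loglik_samples, loglik_at_posterior_mean):
    """Deviance Information Criterion from per-draw log-likelihoods.

    Deviance is -2 * log-likelihood. Returns (DIC, p_D), where p_D is the
    effective number of parameters.
    """
    d_bar = -2.0 * sum(loglik_samples) / len(loglik_samples)  # mean deviance
    d_hat = -2.0 * loglik_at_posterior_mean                   # deviance at posterior mean
    p_d = d_bar - d_hat
    return d_bar + p_d, p_d

# Made-up log-likelihood values, purely to show the arithmetic.
value, p_d = dic([-10.0, -12.0], -9.0)
print(value, p_d)
```

Lower DIC is preferred when comparing models fitted to the same data.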

Kruschke favors the approach to just build the full model and let it tell you which parameters matter. You could also build variable selection into the model itself (see e.g. http://arxiv.org/pdf/math/0505633.pdf).

Finally, for a much more complete treatment, see this recent blog post: http://jakevdp.github.io/blog/2015/08/07/frequentism-and-bayesianism-5-model-selection/

Solved – Regression Mixture in PYMC3

An alternative is to use the marginalized mixture model (see also this SO answer). This uses NUTS initialized with ADVI and converges within 6000 samples.

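import numpy as np
import pymc3 as pm
import theano.tensor as tt

# X1 (predictor) and Y (response) are the data arrays from the question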
ncls = 2
with pm.Model() as basic_model:
    w = pm.Dirichlet('w', np.ones(ncls))
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=100, shape=ncls)
    sigma  = pm.Uniform('sigma', 0, 20)

    mu = tt.stack([alpha + beta[0]*X1,
                   alpha + beta[1]*X1], axis=1)

    y_obs = pm.NormalMixture('y_obs', w, mu, tau=sigma, observed=Y)

with basic_model:
    trace = pm.sample(5000, n_init=10000, tune=1000)[1000:]
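"Marginalized" here means the discrete class label is summed out of the likelihood, so the sampler only deals with continuous parameters. The sketch below shows that per-observation computation in plain Python. It is an illustration of the idea, not PyMC3's implementation: the function names are hypothetical, and it parameterizes the components by a standard deviation `sd` for simplicity, whereas the model above passes `tau`.

```python
import math

def normal_logpdf(x, mu, sd):
    """Log-density of Normal(mu, sd) at x."""
    return -0.5 * math.log(2 * math.pi * sd**2) - (x - mu) ** 2 / (2 * sd**2)

def mixture_logp(y, weights, mus, sd):
    """log sum_k w_k * Normal(y | mu_k, sd), computed with log-sum-exp."""
    terms = [math.log(w) + normal_logpdf(y, m, sd) for w, m in zip(weights, mus)]
    top = max(terms)  # subtract the max for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

# One observation under two regression lines sharing an intercept (made-up values).
alpha, beta, x = 1.0, [0.5, 2.0], 3.0
mus = [alpha + b * x for b in beta]
print(mixture_logp(6.5, [0.3, 0.7], mus, sd=1.0))
```

Summing these values over all observations gives the log-likelihood the sampler works with, and no discrete assignment variable ever has to be sampled.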
