I coded two models in PyMC3 which I thought were quite simple.
Logistic Regression
The first is a logistic regression in an experiment that models correct and wrong answers for specific tasks in a 2×2 factorial design. Here's what it looks like:
import pymc3 as pm

with pm.Model() as m1:
    # fixed effects and interaction for the 2x2 design
    b_spe = pm.Normal('b_spe', 0, 5)
    b_noi = pm.Normal('b_noi', 0, 5)
    b_sn = pm.Normal('b_sn', 0, 5)
    # varying intercepts per subject and per run
    sigma_sub = pm.HalfCauchy('sigma_sub', 1)
    sigma_run = pm.HalfCauchy('sigma_run', 1)
    a = pm.Normal('a', 0, 5)
    a_sub = pm.Normal('a_sub', 0, sigma_sub, shape=len(d['sub'].unique()))
    a_run = pm.Normal('a_run', 0, sigma_run, shape=len(d['run'].unique()))
    p = pm.invlogit(a + a_sub[d['sub'].values] + a_run[d['run'].values]
                    + b_spe * d.speech + b_noi * d.noise + b_sn * d.speech * d.noise)
    hits = pm.Binomial('hits', p=p, n=d.total, observed=d.hit)
    trace_m1 = pm.sample(5000, tune=5000, njobs=4, target_accept=.99)
After sampling I get the following warning:
Sampling 4 chains: 100%|██████████| 40000/40000 [03:38<00:00, 37.86draws/s]
The acceptance probability does not match the target. It is 0.8942185942514685, but should be close to 0.8. Try to increase the number of tuning steps.
The number of effective samples is smaller than 25% for some parameters.
Rhat(a) = 1.0002822268704348
The trace plots look fine, at least to me, and R-hat is very close to 1. Is there a problem with the model? What can I do to fix it?
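For context on what R-hat measures: it compares between-chain and within-chain variance, and values near 1 mean the chains agree. Below is a simplified, illustrative NumPy sketch of the (non-split) Gelman-Rubin statistic — not the exact split-R-hat that PyMC3 computes, but it shows why a value like 1.0003 indicates good mixing:

```python
import numpy as np

def gelman_rubin(chains):
    """Simplified (non-split) Gelman-Rubin R-hat for an array of
    shape (n_chains, n_draws). Values near 1 mean the chains agree."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
# four well-mixed chains drawn from the same distribution
good = rng.normal(0.0, 1.0, size=(4, 5000))
print(gelman_rubin(good))  # very close to 1

# chains stuck around different values inflate R-hat well above 1
bad = good + np.arange(4)[:, None]
print(gelman_rubin(bad))
```

The important point is that R-hat only detects chains that disagree with each other; it says nothing about divergences or effective sample size, which is why the sampler can still warn despite R-hat ≈ 1.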
Linear Regression
This is related to the experiment above, but here I model fMRI data with Student's t-distributions to accommodate the heavier tails. It is still a 2×2 factorial design, and I also model the covariance between each subject's intercepts and slopes using an LKJ prior on the Cholesky factor of the covariance matrix. Furthermore, the model uses a non-centered ("off-center") parameterization, inspired by the models of chapter 13.23 from Statistical Rethinking.
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as m2:
    # LKJ prior on the Cholesky factor of the subject covariance matrix
    sd_dist = pm.HalfCauchy.dist(beta=2)
    pchol_sub = pm.LKJCholeskyCov('pchol_sub', eta=4, n=n_dim, sd_dist=sd_dist)
    chol = pm.expand_packed_triangular(n_dim, pchol_sub, lower=True)
    a = pm.Normal('a', 0, 1)
    sigma_sub = pm.HalfCauchy('sigma_sub', 1)
    a_sub = pm.Normal('a_sub', 0, sigma_sub, shape=n_subs)
    # population-level betas for spe, noi, sn
    b_spe = pm.StudentT('b_spe', nu=3, mu=0, lam=1)
    b_noi = pm.StudentT('b_noi', nu=3, mu=0, lam=1)
    b_sn = pm.StudentT('b_sn', nu=3, mu=0, lam=1)
    # per-subject betas
    b_s_sub = tt.dot(chol, pm.StudentT('b_s_sub', nu=3, mu=0, lam=1, shape=(1, n_subs)))
    b_n_sub = tt.dot(chol, pm.StudentT('b_n_sub', nu=3, mu=0, lam=1, shape=(1, n_subs)))
    b_sn_sub = tt.dot(chol, pm.StudentT('b_sn_sub', nu=3, mu=0, lam=1, shape=(1, n_subs)))
    A = a + a_sub[d['sub'].values]
    B_s = b_spe + b_s_sub[0, d['sub'].values]
    B_n = b_noi + b_n_sub[0, d['sub'].values]
    B_sn = b_sn + b_sn_sub[0, d['sub'].values]
    sigma = pm.HalfCauchy('sigma', 2)
    lam = sigma ** -2
    nu = pm.Exponential('ν_minus_one', 1/29.) + 1
    mu = pm.Deterministic('mu', A + B_s * d.speech + B_n * d.noise + B_sn * d.speech * d.noise)
    psc = pm.StudentT('psc', nu=nu, mu=mu, lam=lam, observed=d.psc_c)
    trace_m2 = pm.sample(5000, tune=5000, njobs=4, target_accept=0.99)
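As an aside on the non-centered ("off-center") parameterization used above: the trick is to replace a hierarchical prior like `a_sub ~ Normal(0, sigma_sub)` with standard-normal offsets that are scaled afterwards, so the sampler only ever sees unit-scale variables. A minimal NumPy sketch of the idea (the names here are illustrative, not taken from the model above):

```python
import numpy as np

rng = np.random.default_rng(42)
sigma_sub = 0.7
n = 100_000  # many draws, just to compare the implied distributions

# centered: sample the group-level effects directly;
# the geometry of these draws depends on sigma_sub (the "funnel")
a_centered = rng.normal(0.0, sigma_sub, size=n)

# non-centered: sample standard-normal offsets, then scale.
# The sampler sees unit-scale variables, decoupling the offsets
# from sigma_sub and making the posterior geometry much easier.
z = rng.normal(0.0, 1.0, size=n)
a_noncentered = sigma_sub * z

# both parameterizations imply the same distribution
print(a_centered.std(), a_noncentered.std())
```

Both standard deviations come out near 0.7, confirming the two forms are statistically equivalent; they differ only in how hard the resulting posterior is for HMC/NUTS to explore.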
Here are the warnings/errors I received:
Sampling 4 chains: 100%|██████████| 40000/40000 [04:55<00:00, 135.45draws/s]
There were 2 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc3:There were 2 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.7183184493954338, but should be close to 0.8. Try to increase the number of tuning steps.
There are other repeated warnings that say essentially the same thing; I get them for each chain.
Below are the trace plots.
If I zoom in on the region between iterations 2000 and 3000, I get funky-looking traces:
Here's an example pair plot between sigma_sub and a_sub_7. The red dots are the divergent samples; I was inspired by Thomas Wiecki's post.
Rhat(a) = 1.0014072110125143
Again, the traces look fine to me (except for the zoomed-in part) and R-hat is very close to 1. What am I doing wrong?
The data for this model is quite noisy, and the sample size of 17 subjects is not huge.
Would really appreciate some help here 🙂
Thanks!
Edit:
Using pm.sample(..., nuts_kwargs=dict(target_accept=0.95)) got rid of the divergences.
Best Answer
You might have better luck on our discourse: https://discourse.pymc.io/
A couple of notes:

You need to use
pm.sample(..., nuts_kwargs=dict(target_accept=0.95))
instead for the target_accept to actually be applied (this behavior is changed in the next release, in line with what you tried). That alone might already solve your problem.

If not, these diagnostics do not look too terrible, so you're probably fine. If you still want to investigate further, I would look at where exactly the divergences appear. ArviZ has some better plots for this; see https://arviz-devs.github.io/arviz/examples/plot_parallel.html or https://arviz-devs.github.io/arviz/examples/plot_pair.html for examples. Basically, you want to find the dimensions where the divergences cluster. That tells you where the problem is.