PyMC3 – Resolving Pymc3 SamplingError: Initial Evaluation of Model at Starting Point Failed

pymcpymc3

I am working on a Bayesian Cox Proportional Hazard model. I've started by implementing and running the Bayesian CPH example at https://docs.pymc.io/en/stable/pymc-examples/examples/survival_analysis/survival_analysis.html

The example works fine, however, I've tried to extend that example to my own simulated dataset and I'm running into an issue. The primary difference between the example and my code is that in my code I have 20,000 simulated patients and in the example they have 44 patients. The error I'm getting is:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'lambda0_log__': array([4.60517019, 4.60517019, 4.60517019, 4.60517019, 4.60517019,
       4.60517019, 4.60517019, 4.60517019, 4.60517019, 4.60517019,
       4.60517019, 4.60517019, 4.60517019, 4.60517019, 4.60517019,
       4.60517019, 4.60517019, 4.60517019, 4.60517019, 4.60517019,
       4.60517019, 4.60517019, 4.60517019, 4.60517019, 4.60517019]), 'beta': array(1.)}

Initial evaluation results:
lambda0_log__   -25.00
beta             -0.92
obs               -inf
Name: Log-probability of test_point, dtype: float64

The model is shown below in code and graphically:

with pm.Model(coords=coords) as model:

   lambda0 = pm.Gamma("lambda0", 1, 0.01, dims="intervals")
   beta = pm.Normal("beta", 1, sigma=1)

   lambda_ = pm.Deterministic("lambda_", T.outer(T.exp(beta * df.sev_class), lambda0))
   mu = pm.Deterministic("mu", exposure * lambda_)

   obs = pm.Poisson("obs", mu, observed=death)

Bayesian Model

The error occurs when taking the sample with the code below:

n_samples = 1000
n_tune = 1000
RANDOM_SEED = 8927
with model:
    idata = pm.sample(
        n_samples,
      step=pm.NUTS(),
        tune=n_tune,
        target_accept=0.99,
        return_inferencedata=True,
        random_seed=RANDOM_SEED,
    )

If the error isn't due to some obscure MCMC sampling implementation detail, I'm going to wonder if it doesn't have something to do with the following line:

with pm.Model(coords=coords) as model:

    ... removed code ...
    obs = pm.Poisson("obs", mu, observed=death)

The observed=death line is I believe the primary way the model is experiencing the 20,000 patients. It's not exactly clear to me how that line works. The death variable is a 9×20000 sized matrix. The dimension of 9 represents the 9 intervals being modeled. The 20,000 is the 20,000 patients. The matrix is mostly filled with zeros meaning no death for the given patient in the given interval, with the occasional one indicating an observed death. I feel if I understood how that matrix of observations was being used by the Poisson's distribution I might get closer to understanding what's going on. …on the other hand that may have nothing to do with it.

Any thoughts would be greatly appreciated!

Best Answer

It appears the issue had nothing to do with any of the model related topics discussed above. In fact it was a simple bug. Specifically I had overwritten the T from the line:

from theano import tensor as T

The moral of the story (at least for me), is that if you get a "SamplingError: Initial evaluation of model at starting point failed" message, don't think about sampling or models or starting points, look for straight up bugs.

Related Question