I'm doing data analysis using Hamiltonian Monte Carlo to sample from the posterior distribution over the weights of a neural network. I'm using the Gelman-Rubin diagnostic, the potential scale reduction factor (PSRF, often written R-hat), to check the convergence of my Markov chains. My neural network has 317 weights and I check the convergence of each of the 317 parameters separately.
If I have understood everything correctly, a parameter should have converged once its PSRF value is < 1.1.
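For reference, the per-parameter check can be sketched as follows. This is a minimal NumPy implementation of the classic Gelman-Rubin statistic (without the later split-chain or rank-normalized refinements); the function name and the example chain shapes are my own choices, not from any particular library:

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin PSRF for a single parameter.

    chains: array of shape (m, n) -- m independent chains, n draws each.
    """
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    # Between-chain variance (scaled by chain length)
    B = n * chain_means.var(ddof=1)
    # Mean within-chain variance
    W = chains.var(axis=1, ddof=1).mean()
    # Pooled estimate of the marginal posterior variance
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)

# Example: 4 chains of 1000 draws for one parameter.
rng = np.random.default_rng(0)
chains = rng.normal(size=(4, 1000))
print(gelman_rubin(chains))  # close to 1 for well-mixed chains
```

With 317 parameters you would run this once per weight (or vectorize over a `(m, n, 317)` array) and flag any parameter whose PSRF exceeds 1.1.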
This is indeed the case for most of the parameters, but some weights do not seem to converge in a reasonable amount of time. Some take 100,000 samples or more before they converge, which takes too long for my analysis.
My question is: "What is the appropriate way to proceed if the Markov chains do not converge in a reasonable amount of time? Do I just need to bite the bullet and wait for three months or so?"
Best Answer
To answer your original question