Solved – Choice of a model for Bayesian Change Point Detection

Tags: bayesian, change-point, hierarchical-bayesian, pymc

I am getting my hands dirty with probabilistic programming, using a Bayesian approach to change-point detection. I have read a number of the tutorials provided with PyMC and am reading Cameron Davidson-Pilon's book "Bayesian Methods for Hackers". My main question concerns the choice of model for change-point detection; my secondary question relates to PyMC usage.

The data has been generated synthetically: 1000 data points with $\mu_1=3.0, \sigma_1=2.0$ followed by another 1000 data points with $\mu_2=5.0, \sigma_2=1.0$. Together they form a single time series of 2000 data points with a single change-point in the middle.

I am attempting to answer the following two questions:

  1. Where is the change-point?
  2. What are the posteriors for the model parameters $\mu_1, \mu_2, \sigma_1, \sigma_2$?

The change-point has been modeled using a discrete uniform distribution:
$\tau \sim \text{DiscreteUniform}(0, 2000)$

The four parameters I am interested in are modeled using the Exponential distribution:

$\mu_1 \sim \text{Exp}(\alpha)$

$\mu_2 \sim \text{Exp}(\alpha)$

$\sigma_1 \sim \text{Exp}(\beta)$

$\sigma_2 \sim \text{Exp}(\beta)$

where the hyperparameters $\alpha$ and $\beta$ were set to $\frac{1}{\hat\mu}$ and $\frac{1}{\hat\sigma}$. The $\hat\mu$ and $\hat\sigma$ are computed from all 2000 data points (an empirical-Bayes choice).

Finally, the observed data are modeled using the Normal distribution: $X \sim N(\mu,\sigma)$
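Spelled out as a piecewise likelihood, each observation $X_t$ is drawn from the regime in effect at time $t$:

$X_t \sim \begin{cases} N(\mu_1, \sigma_1) & t \le \tau \\ N(\mu_2, \sigma_2) & t > \tau \end{cases}$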

Question: Is this model reasonable for this data and the task?


The Python code using PyMC:

Question: How do I use the switch method correctly when there is more than one variable to select from?
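To see what the vectorized condition does, here is a minimal NumPy analogue (not PyMC code; `np.where` plays the role of `switch`, and the values 3.0 and 5.0 stand in for `mu_1` and `mu_2`). The comparison `switchpoint >= t` broadcasts the scalar against the whole index array, producing one boolean per time step:

```python
import numpy as np

t = np.arange(6)     # time indices 0..5
switchpoint = 3      # hypothetical change-point
mu_1, mu_2 = 3.0, 5.0

# Elementwise selection: mu_1 wherever t <= switchpoint, mu_2 afterwards
mu = np.where(switchpoint >= t, mu_1, mu_2)
print(mu)  # [3. 3. 3. 3. 5. 5.]
```

The same broadcasting applies inside the model below, which is why one `switch` call per parameter suffices.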

import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm

from pymc3 import DiscreteUniform, Normal, traceplot
from pymc3.math import switch

# True parameters used to generate the synthetic data
mu1_true = 3.0
mu2_true = 5.0
sigma1_true = 2.0
sigma2_true = 1.0

n1 = 1000
n2 = 1000
n = n1 + n2

# Two regimes concatenated into a single series with one change-point
data1 = np.random.normal(mu1_true, sigma1_true, n1)
data2 = np.random.normal(mu2_true, sigma2_true, n2)
data = np.concatenate((data1, data2), axis=0)
t = np.arange(0, n)

# plt.plot(t, data)
# plt.show()

# Hyperparameters for the Exponential priors, set from the pooled data
alpha = 1.0 / np.mean(data)
beta = 1.0 / np.std(data)

model = pm.Model()

with model:
    switchpoint = DiscreteUniform("switchpoint", lower=0, upper=n)
    mu_1 = pm.Exponential("mu_1", alpha)
    mu_2 = pm.Exponential("mu_2", alpha)
    sd_1 = pm.Exponential("sd_1", beta)
    sd_2 = pm.Exponential("sd_2", beta)
    # Elementwise selection: (mu_1, sd_1) up to the switchpoint, (mu_2, sd_2) after
    mu = switch(switchpoint >= t, mu_1, mu_2)
    sd = switch(switchpoint >= t, sd_1, sd_2)
    X = Normal("x", mu=mu, sd=sd, observed=data)

with model:
    # NUTS for the continuous parameters, Metropolis for the discrete switchpoint
    step1 = pm.NUTS([mu_1, mu_2, sd_1, sd_2])
    step2 = pm.Metropolis([switchpoint])
    trace = pm.sample(10000, step=[step1, step2])

traceplot(trace, varnames=["switchpoint", "mu_1", "mu_2", "sd_1", "sd_2"])
plt.show()

The posteriors for the parameters of interest are centered close to the true values and do not overlap. The change-point position is also inferred quite accurately.

[Trace plots of the posteriors for switchpoint, mu_1, mu_2, sd_1 and sd_2]

Best Answer

This all looks very reasonable to me. There is a bit of autocorrelation in the switchpoint posterior, but with more samples it shouldn't be a concern. You might also want to discard the first 100 or so samples as burn-in. Other than that, you're off to a great start!
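Those two checks can be sketched in plain NumPy (the chain here is a synthetic stand-in for `trace['switchpoint']`, not real sampler output):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for an MCMC chain with some autocorrelation
chain = np.cumsum(rng.normal(size=10_000)) * 0.01 + rng.normal(size=10_000)

# Discard the first 100 samples as burn-in
burn_in = 100
kept = chain[burn_in:]

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation of a 1-D chain."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

print(len(kept), lag1_autocorr(kept))
```

With a real trace, slicing like `trace['switchpoint'][100:]` discards the burn-in, and PyMC3's built-in `pm.autocorrplot` gives the full autocorrelation picture per variable.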
