Change-Point – How to Detect Change Points in Data Analysis Using Advanced Techniques

I have a specific question about the formulation of offline multiple change point detection given in Burg and Williams.

They formulate it as minimizing a sum of per-segment losses plus a penalty, where the change points are denoted $\{\tau_i\}$, the slice of a time series from $a$ to $b$ is $\textbf{y}_{a:b}$, $\ell$ is a loss function, and $P$ is a penalty on the number of change points.
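
For concreteness, an objective consistent with these definitions can be written as below. This is a sketch rather than a verbatim quotation: the number of change points $m$, the series length $T$, the penalty weight $\lambda$, and the boundary conventions $\tau_0 = 1$ and $\tau_{m+1} = T + 1$ are assumptions introduced here.

$$\min_{m,\,\tau_{1:m}} \; \sum_{i=1}^{m+1} \ell\left(\textbf{y}_{\tau_{i-1}:\tau_{i}-1}\right) + \lambda P(m), \qquad \tau_0 = 1, \quad \tau_{m+1} = T + 1.$$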

My question is: why is it written as $\ell(\textbf{y}_{\tau_{i-1}:\tau_{i}-1})$ and not $\ell(\textbf{y}_{\tau_{i-1}:\tau_{i}})$? In other words, why is there a second $-1$ term in the loss function?

Best Answer

$\textbf{y}_{\tau_{i-1}:\tau_{i}}$ denotes the elements of the time series from changepoint number $i-1$ up to and including changepoint number $i$.

By contrast, $\textbf{y}_{\tau_{i-1}:\tau_{i}-1}$ starts at changepoint number $i-1$ and stops one element before changepoint number $i$, at index $\tau_i-1$. Note that the additional $-1$ is not part of the subscript of $\tau$; it is subtracted from $\tau_i$ itself inside the subscript of $\textbf{y}$.

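This is because you want to calculate the loss function from one changepoint to the next with no overlap. The observation at $\tau_i$ is the first element of the next segment, $\textbf{y}_{\tau_{i}:\tau_{i+1}-1}$, so including it in the current segment as well would count it twice.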
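
A small sketch may make the no-overlap point concrete. It assumes 1-based indices as in the formula, the boundary conventions $\tau_0 = 1$ and $\tau_{m+1} = T + 1$, and a sum-of-squared-errors loss; the function names `segments`, `total_loss`, and `sse` are hypothetical and chosen only for illustration.

```python
def segments(y, change_points):
    """Split y into consecutive, non-overlapping segments.

    Segment i covers the 1-based indices tau_{i-1}, ..., tau_i - 1.
    Assumption of this sketch: tau_0 = 1 and tau_{m+1} = T + 1.
    """
    T = len(y)
    taus = [1] + list(change_points) + [T + 1]
    # Python lists are 0-based, so the 1-based index t is y[t - 1];
    # the slice y[a - 1 : b - 1] holds elements a, ..., b - 1.
    return [y[a - 1:b - 1] for a, b in zip(taus[:-1], taus[1:])]


def sse(segment):
    """Sum of squared deviations from the segment mean (an assumed loss)."""
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment)


def total_loss(y, change_points, loss):
    """Sum the loss over all segments; each observation is used once."""
    return sum(loss(seg) for seg in segments(y, change_points))


y = [0, 0, 0, 5, 5, 5, 5, 1, 1]
change_points = [4, 8]  # 1-based positions where a new segment starts

parts = segments(y, change_points)

# For comparison: inclusive right endpoints (no trailing -1) make
# neighbouring segments share the observation at each change point.
taus = [1] + change_points + [len(y) + 1]
overlapping = [y[a - 1:b] for a, b in zip(taus[:-1], taus[1:])]

print(parts)
print(sum(len(p) for p in parts), sum(len(p) for p in overlapping))
```

With the trailing $-1$, the segment lengths add up to the length of the series. Without it, each changepoint observation appears in two segments, so the summed lengths exceed the series length by the number of changepoints.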