Solved – Change Point detection with R and Python leads to different results

change pointpythonrtime series

I am performing change point detection on a dataset and I am interested in changes of the mean.

I perform the analysis with the Python library ruptures using the PELT method, the 'l2' cost function and a penalty value.

penalty = 2*np.log(len(data))*np.std(data)**2
algo = rpt.Pelt(cmodel='l2').fit(data)
bkps = algo.predict(pen=penalty)

Then I perform the analysis on the same dataset with the same method PELT and the same penalty value, this time using cpt.mean from the R library changepoints.

pelt = cpt.mean(data, method = 'PELT', penalty='Manual', pen.value='2*log(n) * sd(data)**2')

The two methods lead to different change points. Could someone help me understand why this happens?

EDIT

Data and code in Python:
https://ctruong.perso.math.cnrs.fr/ruptures-docs/build/html/index.html

import numpy as np
import ruptures as rpt

data = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 2, 1, 0, 0, 1, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 1, 0, 0, 0, 0, 0, 1, 2, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 1, 0, 2, 3, 0, 0, 0, 3, 1, 0, 0, 2, 0, 3, 1, 2, 3, 3, 3, 0, 1, 1, 2, 1, 1, 3, 1, 0, 2, 3, 5, 0, 1, 1, 1, 3, 3, 1, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 11, 6, 6, 8, 6, 12, 11, 7, 9, 12, 21, 11, 28, 20, 20, 15, 26, 20, 22, 22, 15, 15, 13, 23, 15, 16, 11, 20, 17, 21, 10, 8, 9, 11, 7, 6, 10, 4, 4, 7, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 16, 13, 13, 4, 4, 14, 14, 11, 7, 9, 11, 16, 15, 5, 8, 17, 10, 12, 7, 13, 19, 21, 7, 14, 13, 11, 9, 18, 9, 15, 8, 4, 3, 2, 1, 2, 1, 2, 4, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 3, 1, 1, 0, 0, 1, 1, 0, 0, 2, 0, 0, 0, 0, 1, 0, 1, 1, 1, 3, 1, 2, 1, 1, 2, 9, 2, 10, 5, 1, 7, 7, 5, 9, 12, 8, 14, 10, 10, 12, 10, 1, 1, 2, 2]
data = np.asarray(data)

penalty = 2*np.log(len(data))*np.std(data)**2

algo = rpt.Pelt(model = 'l2').fit(data)
bkps = algo.predict(pen = penalty)
print(bkps[:-1])

# BREAKING POINTS
# [110, 120, 140, 160, 195, 255]

Same data and code in R:

https://cran.r-project.org/web/packages/changepoint/changepoint.pdf

https://www.rdocumentation.org/packages/changepoint/versions/2.2.2

library(changepoints)

data = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 2, 1, 0, 0, 1, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 1, 0, 0, 0, 0, 0, 1, 2, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 1, 0, 2, 3, 0, 0, 0, 3, 1, 0, 0, 2, 0, 3, 1, 2, 3, 3, 3, 0, 1, 1, 2, 1, 1, 3, 1, 0, 2, 3, 5, 0, 1, 1, 1, 3, 3, 1, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 11, 6, 6, 8, 6, 12, 11, 7, 9, 12, 21, 11, 28, 20, 20, 15, 26, 20, 22, 22, 15, 15, 13, 23, 15, 16, 11, 20, 17, 21, 10, 8, 9, 11, 7, 6, 10, 4, 4, 7, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 16, 13, 13, 4, 4, 14, 14, 11, 7, 9, 11, 16, 15, 5, 8, 17, 10, 12, 7, 13, 19, 21, 7, 14, 13, 11, 9, 18, 9, 15, 8, 4, 3, 2, 1, 2, 1, 2, 4, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 3, 1, 1, 0, 0, 1, 1, 0, 0, 2, 0, 0, 0, 0, 1, 0, 1, 1, 1, 3, 1, 2, 1, 1, 2, 9, 2, 10, 5, 1, 7, 7, 5, 9, 12, 8, 14, 10, 10, 12, 10, 1, 1, 2, 2)

pelt = cpt.mean(data, method = 'PELT', penalty='Manual', pen.value='2*log(n)*sd(d0)**2') 
change = cpts(pelt)
print(change)

# BREAKING POINTS
# 108 120 140 161 192 253

Best Answer

Surprisingly no answers were given yet. Here I tried to offer some biased opinions from my experiences with changepoint detection.

First of all, translating code from one lang to another is often tricky and error-prone. One example highlighting the difficulty is the reimplementation of a change detection algorithm called LandTrend, ported from IDL (an interactive lang similar to R and Python) to Java (GEE); the translated code gave almost the same results as before, but NOT IDENTIICAL (https://www.mdpi.com/2072-4292/10/5/691). Regardless, such inconsistencies are unlikely to be the true reason for what you observed for the PELT method, because the code base for the PELT method is relatively small.

I suspect two reasons for your case, one concerning the ill-posedness of your problem/data and another concerning the differing numerical libraries used behind R and Python. Below are more details.

(1) Your R and Python results are very close, which indicates your data/problem has multiple near-optimal solutions close to each other. Any minuscule numerical errors or data errors (e.g., slightly disturbing a datapoint with a very small noise) may shift the detected 'optimal' solution from one to another. Here is a made-up example to further explain. Suppose that the PELT algorithm tried to maximize a criterion; the result [110, 120, 140, 160, 195, 255] has a theoretical value of 0.4312 (I just made up this number), and the result [108 120 140 161 192 253] has a theoretical value of 0.4311. The two are very close. if you have a perfect computer with no numerical error, you can pick up the true best one (the one with 0.4312). But with all kinds of numerical errors such as round-off, truncation, and limited machine precisions, the algorithm may pick up either of them because, NUMERICALLY, the theoretically best one might have a worse optimized value than the other near-optimal ones. In some literature, this is known as model equifinality. In reality, there can be numerous solutions (more than two as explained here) that are almost equally good. I touched this problem briefly in a publication of mine (Figure 1 at https://go.osu.edu/beast2019).

(2) On top of the problem explained in (1), more often than not, Python and R use different math libraries (I mean, the blas and lapack libs for basic matrix and vector math operations and linear algebra). For example, by default, R uses the legacy fotran implementation, although other alternatives (e.g., Intel's MKL, and openBlas) can be customarily linked. The different libraries (plus when compiled for different CPUs or with different compiler flags) do not give identical results, despite that the results are sufficiently close in terms of machine precision. In the changepoint detection algorithm I developed (called Rbeast and available at https://github.com/zhaokg/Rbeast), I implemented my own version of blas for vector and matrix operation; the numerical results differ even on the same machine/CPU if I used different cpu instruction sets (e.g., SSE, AVX, and AVX512). Again by 'different', the results are almost the same but not identical (e.g., 0.3434313 vs 0.3434315). If accumulated throughout, these small errors can add up to be large enough to confuse the algorithm not to find the true best solution for the ill-posed problems explained in (1).

Now switching to the statistical point of view, your two solutions are probably not statistically different. If you are familiar with some model selection criteria such as AIC, a difference of AIC smaller than ~2.0 means that no statistical evidence suggests one model is better than another one. So, I assume that your Python solution and R solution should be equally good (again statistically speaking). Given this (i.e., model equifinality), Bayesian methods have been used to circumvent the problem a little bit. In R, bcp is a popular package, and my package Rbeast is also aimed to address similar problems. Here are some quick runs on your data using bcp and Rbeast.

data = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 2, 1, 0, 0, 1, 0, 1, 0, 2, 0, 0, 1, 0, 2, 0, 1, 0, 0, 0, 0, 0, 1, 2, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 1, 0, 2, 3, 0, 0, 0, 3, 1, 0, 0, 2, 0, 3, 1, 2, 3, 3, 3, 0, 1, 1, 2, 1, 1, 3, 1, 0, 2, 3, 5, 0, 1, 1, 1, 3, 3, 1, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 11, 6, 6, 8, 6, 12, 11, 7, 9, 12, 21, 11, 28, 20, 20, 15, 26, 20, 22, 22, 15, 15, 13, 23, 15, 16, 11, 20, 17, 21, 10, 8, 9, 11, 7, 6, 10, 4, 4, 7, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 16, 13, 13, 4, 4, 14, 14, 11, 7, 9, 11, 16, 15, 5, 8, 17, 10, 12, 7, 13, 19, 21, 7, 14, 13, 11, 9, 18, 9, 15, 8, 4, 3, 2, 1, 2, 1, 2, 4, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 3, 1, 1, 0, 0, 1, 1, 0, 0, 2, 0, 0, 0, 0, 1, 0, 1, 1, 1, 3, 1, 2, 1, 1, 2, 9, 2, 10, 5, 1, 7, 7, 5, 9, 12, 8, 14, 10, 10, 12, 10, 1, 1, 2, 2)

library(bcp)
out = bcp(data)
plot(out)


library(Rbeast)

# Rbeast do time series decomposition and changepoint detection altogether. 
# season='none' is set below to indicate data has no periodic/seasonal variation.

out = beast(data,season='none') 
print(out)
plot(out)

enter image description here enter image description here

The first figure is from bcp and the second from Rbeast. The posterior probability curves (e.g., Pr(tcp)) shows the probability of changepoint occurrence.