The changepoint methods such as the strucchange
package assume that the series occur in an ordered manner. Thus even though you have several observations at a single time point, the order in which they occur in your data table matters. Having said this the strucchange
package works well on the series above identifying the correct (in some sense) observations. Strucchange doesn't work on data tables so converting to a matrix we get
set.seed(10)
dat <- matrix(c(rep(0,30), rep(5,30), rep(10,30), rep(15,30), rep(20,30), rnorm(30, 5, .5), rnorm(30, 4, .5), rnorm(30, 3, .5), rnorm(30, 2, .5), rnorm(30, 1, .5)),ncol=2)
library(strucchange)
breakpoints(dat[,2]~dat[,1])
No changepoint is identified. Now for dat2:
dat2 <- dat
change <- which(dat2[,1] %in% 10)
dat2[change,2] <- dat2[change,2] + 1
change <- which(dat2[,1] %in% 15)
dat2[change,2] <- dat2[change,2] - 1
breakpoints(dat2[,2]~dat2[,1])
Optimal is 2 changepoints at 30 and 90 (0-30 is the slightly higher mean of 5 in t=0, 31-90 have mean 4 and 91-150 have mean 1).
Now for dat3:
dat3 <- dat
change <- which(dat3[,1] %in% 0)
dat3[change,2] <- dat3[change,2] - 3
breakpoints(dat3[,2]~dat3[,1])
Here we have 1 changepoint at 59 which suggests that the first two form an upward trend followed by the remaining which form a downward trend.
This behaviour of the incorrect changepoint is due to the fact that only have 1 time observation before the change. Modifying this to 2 we estimate correctly:
dat4=rbind(matrix(c(rep(-5,30),rnorm(30,2,0.5)),ncol=2),dat3)
breakpoints(dat4[,2]~dat4[,1])
1 changepoint at 60.
You are trying to fit a change in the regression relationship. In some sense, repeated observations are not an issue, this gives you are handle on the variability. What is most important is that you have enough information to estimate the relationship prior to the change and also after the change as dat3 and dat4 demonstrate (albeit simply).
Best Answer
You can try out the changefinder library on PyPI. The description says that it's an online Change Detection Library based on the ChangeFinder algorithm
There are also some Python implementations of Michele Basseville's Statistical Change Point Detection techniques available in tutorial format on this Github repo.