It is hard to say what exactly you did with the strucchange
package which lead to inaccurate results. If I use a regression model with a simple linear time trend (as was used to generate the data), breakpoints()
recovers the underlying structure. This is admittedly not very difficult here ... and breakpoints()
is not terribly fast because it is written only in R. However, you could use the foreach
plugin to speed it up somewhat on multiple cores (using doMC
) or a cluster (using doSNOW
).
In any case, to analyze your simple artificial data, I create a simple linear time trend auxiliary variable:
tr <- as.numeric(x - x[1], units = "days")
Then I try to find the breakpoints, yielding separate segments with different intercepts/slopes:
library("strucchange")
bp <- breakpoints(y ~ tr, h = 50, breaks = 8)
plot(bp)
This shows that both the residual sum of squares and the BIC drop sharply up to 3 breakpoints. The BIC is still somewhat lower for 4 breakpoints so that this would be selected by default. However, 3 breaks would probably be enough here.
The fitted models for 3 (red) and 4 (blue) breaks are visualized by:
plot(x, y)
lines(x, fitted(bp, breaks = 3), col = 2, lwd = 2)
lines(x, fitted(bp, breaks = 4), col = 4, lwd = 2)
The difference between 3 and 4 breaks is very small (albeit an improvement in terms of BIC). Hence, I would probably focus on the 3 breaks here, yielding intercepts/slopes close to the true values:
coef(bp, breaks = 3)
## (Intercept) tr
## 0.001 - 0.3 66.96409 100.28218
## 0.301 - 0.401 213.32564 -20.05220
## 0.402 - 0.702 41.83456 10.76780
## 0.703 - 1 641.02803 -48.51261
Unfortunately, breakpoints()
and its methods does not deal with POSIXct
time indexes, otherwise the annotation could be even nicer. You can extract the breakpoints, though, and match these to the underlying POSIXct scale:
breakpoints(bp, breaks = 3)
## Optimal 4-segment partition:
##
## Call:
## breakpoints.breakpointsfull(obj = bp, breaks = 3)
##
## Breakpoints at observation number:
## 300 401 702
##
## Corresponding to breakdates:
## 0.3 0.401 0.702
x[breakpoints(bp, breaks = 3)$breakpoints]
## [1] "2015-01-03 21:58:26 CET" "2015-01-05 00:23:56 CET"
## [3] "2015-01-08 00:49:45 CET"
The same could be done for the associated confidence intervals for the breakpoints:
confint(bp, breaks = 3)
## Confidence intervals for breakpoints
## of optimal 4-segment partition:
##
## Call:
## confint.breakpointsfull(object = bp, breaks = 3)
##
## Breakpoints at observation number:
## 2.5 % breakpoints 97.5 %
## 1 299 300 301
## 2 399 401 402
## 3 701 702 703
##
## Corresponding to breakdates:
## 2.5 % breakpoints 97.5 %
## 1 0.299 0.300 0.301
## 2 0.399 0.401 0.402
## 3 0.701 0.702 0.703
Best Answer
constant
means there is an intercept included in each equation in the model.trend
means there is a linear time trend included.both
means both of them are included.none
means neither of them is included.Type
?VAR
inR
to get an explanation. There you will find the algebraic form of the model provided. Pay attention to the CD_t term. You may also see the vignette for thevars
package, perhaps it could help.