Making the knots free parameters turns the model into a difficult nonlinear estimation problem that standard estimation software is not designed for, and computing standard errors becomes very complex. Linear splines are very sensitive to where the knots are placed, and they model "elbows" that are unlikely to be real unless $X$ is calendar time. Cubic splines have the advantages of (1) not having elbows, because the function and its first two derivatives are continuous at the knots, and (2) giving similar fits even if you move the knots around. Thus you can usually place knots at quantiles of $X$ and keep knot estimation out of the optimization problem. Restricting the cubic regression spline to be linear in the tails (beyond the outer knots), a construction called a natural spline or restricted cubic spline, reduces the number of parameters to estimate and gives more realistic fits.
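As a sketch of how little machinery this needs in R: `splines::ns()` constructs exactly this restricted (natural) cubic spline basis. The simulated data and the choice of knot quantiles below are illustrative, not from the original:

```r
library(splines)

# Illustrative data (assumed for this sketch)
set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.3)

# Interior knots at quantiles of x; ns() is linear beyond the outer knots
kn <- quantile(x, c(0.25, 0.50, 0.75))
B  <- ns(x, knots = kn)   # design matrix: 200 x 4, one column per basis function
fit <- lm(y ~ B)          # ordinary least squares on the spline basis
```

Because the knots are fixed in advance, `fit` is an ordinary linear model and all the usual inference applies.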
This approach lets you use standard estimation and hypothesis-testing tools and requires no special regression-fitting functions once you create the design matrix. Much more information is available under Handouts at http://biostat.mc.vanderbilt.edu/CourseBios330. Once you fit the restricted cubic spline you can plot it along with confidence bands (also obtained by standard methods) and see where the slope changes. If you have special knowledge of a region of volatility, you can place two knots closer together in that pre-specified region of $X$.
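A minimal sketch of that workflow: fit with `lm()`, then get pointwise confidence bands from `predict(..., se.fit = TRUE)`. The simulated data, the `df = 4` basis size, and the 1.96 multiplier for an approximate 95% band are assumptions for illustration:

```r
library(splines)

# Illustrative data (assumed for this sketch)
set.seed(2)
x <- runif(150, 0, 10)
y <- 2 + 0.5 * x - 0.04 * x^2 + rnorm(150, sd = 0.4)

# Keeping ns() inside the formula lets predict() rebuild the basis for new data
fit <- lm(y ~ ns(x, df = 4))
xs  <- seq(min(x), max(x), length.out = 100)
pr  <- predict(fit, newdata = data.frame(x = xs), se.fit = TRUE)

plot(x, y, pch = 16, col = "grey")
lines(xs, pr$fit)
lines(xs, pr$fit + 1.96 * pr$se.fit, lty = 2)  # approximate 95% confidence band
lines(xs, pr$fit - 1.96 * pr$se.fit, lty = 2)
```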
If the goal is simply to fit a function, you could treat this as an optimization problem:
y <- c(4.5,4.3,2.57,4.40,4.52,1.39,4.15,3.55,2.49,4.27,4.42,4.10,2.21,2.90,1.42,1.50,1.45,1.7,4.6,3.8,1.9)
x <- c(320,419,650,340,400,800,300,570,720,480,425,460,675,600,850,920,975,1022,450,520,780)
plot(x, y, col="black",pch=16)
# we need four parameters: the two breakpoints and the starting and ending intercepts
fun <- function(par, x) {
  # set all y values to the starting intercept
  y1 <- x^0 * par["i1"]
  # set values after the second breakpoint to the ending intercept
  y1[x >= par["x2"]] <- par["i2"]
  # which values lie between the breakpoints?
  r <- x > par["x1"] & x < par["x2"]
  # interpolate linearly between the breakpoints
  y1[r] <- par["i1"] + (par["i2"] - par["i1"]) / (par["x2"] - par["x1"]) * (x[r] - par["x1"])
  y1
}
# sum of squared residuals
SSR <- function(par) {
  sum((y - fun(par, x))^2)
}
library(optimx)
optimx(par = c(x1 = 500, x2 = 820, i1 = 5, i2 = 1),
fn = SSR,
method = "Nelder-Mead")
# x1 x2 i1 i2 value fevals gevals niter convcode kkt1 kkt2 xtimes
#Nelder-Mead 449.8546 800.0002 4.381454 1.512305 0.6404728 373 NA NA 0 TRUE TRUE 0.06
lines(300:1100,
fun(c(x1 = 449.8546, x2 = 800.0002, i1 = 4.381454, i2 = 1.512305), 300:1100))
Best Answer
It appears that what is happening is that one of the estimates for a breakpoint is moving outside the range of offer during the fitting procedure. (You can deduce this by typing seg.lm.fit at the prompt and fighting your way through the code.) You can work around this by specifying psi, the starting values for the breakpoints. I also set seg.control = list(stop.if.error = FALSE) to try to work around the problem, but that didn't help.

I reran your model with psi = c(15) and it worked, but with psi = c(15, 25) I got an iteration-count-exceeded error message, which I could not overcome even by setting the maximum number of iterations to 1,000. I would take this to mean that one breakpoint is all you're going to be able to estimate with this function and data.

As an extra note, if you plot(sqrt(demand) ~ offer), the relationship looks a lot closer to linear than demand ~ offer does, so you might want to try a transform of the data rather than a piecewise linear model.
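For reference, a minimal sketch of the segmented workflow described above, run on the question's x/y data rather than demand/offer (which is not shown here); the starting value psi = 600 is an assumption, chosen by eyeballing the scatterplot:

```r
library(segmented)

# Data from the question above
x <- c(320, 419, 650, 340, 400, 800, 300, 570, 720, 480, 425, 460, 675,
       600, 850, 920, 975, 1022, 450, 520, 780)
y <- c(4.5, 4.3, 2.57, 4.40, 4.52, 1.39, 4.15, 3.55, 2.49, 4.27, 4.42,
       4.10, 2.21, 2.90, 1.42, 1.50, 1.45, 1.7, 4.6, 3.8, 1.9)

base <- lm(y ~ x)
# psi gives the starting value(s) for the breakpoint(s); 600 is a guess here
seg <- segmented(base, seg.Z = ~ x, psi = 600)
seg$psi  # estimated breakpoint with its standard error
```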