Making the knots free parameters turns the model into a difficult nonlinear estimation problem that standard estimation software is not designed for, and computing standard errors becomes very complex. Linear splines are very sensitive to where the knots are placed, and they model "elbows" that are unlikely to be real unless $X$ is calendar time. Cubic splines have the advantages of (1) not having elbows, because the function and its first two derivatives are continuous at the knots, and (2) giving similar fits even if you move the knots around. Thus you can usually place knots at quantiles of $X$ and keep knot estimation out of the optimization problem. Restricting the cubic regression spline to be linear in the tails (beyond the outer knots), a construction called a natural spline or restricted cubic spline, reduces the number of parameters to estimate and gives more realistic fits.
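As a sketch of how little machinery this needs in R: `splines::ns()` constructs exactly this restricted (natural) cubic spline basis. The simulated data and the choice of knot quantiles below are illustrative, not from the original:

```r
library(splines)

# Illustrative data (assumed for this sketch)
set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.3)

# Interior knots at quantiles of x; ns() is linear beyond the outer knots
kn <- quantile(x, c(0.25, 0.50, 0.75))
B  <- ns(x, knots = kn)   # design matrix: 200 x 4, one column per basis function
fit <- lm(y ~ B)          # ordinary least squares on the spline basis
```

Because the knots are fixed in advance, `fit` is an ordinary linear model and all the usual inference applies.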
This approach lets you use standard estimation and hypothesis-testing tools and requires no special regression-fitting functions once you create the design matrix. Much more information is available under Handouts at http://biostat.mc.vanderbilt.edu/CourseBios330. Once you fit the restricted cubic spline you can plot it along with confidence bands (also obtained by standard methods) and see where the slope changes. If you have special knowledge of a region of volatility, you can place two knots closer together in that pre-specified region of $X$.
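A minimal sketch of that workflow: fit with `lm()`, then get pointwise confidence bands from `predict(..., se.fit = TRUE)`. The simulated data, the `df = 4` basis size, and the 1.96 multiplier for an approximate 95% band are assumptions for illustration:

```r
library(splines)

# Illustrative data (assumed for this sketch)
set.seed(2)
x <- runif(150, 0, 10)
y <- 2 + 0.5 * x - 0.04 * x^2 + rnorm(150, sd = 0.4)

# Keeping ns() inside the formula lets predict() rebuild the basis for new data
fit <- lm(y ~ ns(x, df = 4))
xs  <- seq(min(x), max(x), length.out = 100)
pr  <- predict(fit, newdata = data.frame(x = xs), se.fit = TRUE)

plot(x, y, pch = 16, col = "grey")
lines(xs, pr$fit)
lines(xs, pr$fit + 1.96 * pr$se.fit, lty = 2)  # approximate 95% confidence band
lines(xs, pr$fit - 1.96 * pr$se.fit, lty = 2)
```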
If the goal is simply to fit a function, you could treat this as an optimization problem:
y <- c(4.5,4.3,2.57,4.40,4.52,1.39,4.15,3.55,2.49,4.27,4.42,4.10,2.21,2.90,1.42,1.50,1.45,1.7,4.6,3.8,1.9)
x <- c(320,419,650,340,400,800,300,570,720,480,425,460,675,600,850,920,975,1022,450,520,780)
plot(x, y, col="black",pch=16)
# we need four parameters: the two breakpoints and the starting and ending intercepts
fun <- function(par, x) {
  # set all y values to the starting intercept
  y1 <- x^0 * par["i1"]
  # set values after the second breakpoint to the ending intercept
  y1[x >= par["x2"]] <- par["i2"]
  # which values lie between the breakpoints?
  r <- x > par["x1"] & x < par["x2"]
  # interpolate linearly between the breakpoints
  y1[r] <- par["i1"] + (par["i2"] - par["i1"]) / (par["x2"] - par["x1"]) * (x[r] - par["x1"])
  y1
}
# sum of squared residuals
SSR <- function(par) {
  sum((y - fun(par, x))^2)
}
library(optimx)
optimx(par = c(x1 = 500, x2 = 820, i1 = 5, i2 = 1),
fn = SSR,
method = "Nelder-Mead")
# x1 x2 i1 i2 value fevals gevals niter convcode kkt1 kkt2 xtimes
#Nelder-Mead 449.8546 800.0002 4.381454 1.512305 0.6404728 373 NA NA 0 TRUE TRUE 0.06
lines(300:1100,
fun(c(x1 = 449.8546, x2 = 800.0002, i1 = 4.381454, i2 = 1.512305), 300:1100))
Best Answer
It appears that what is happening is that one of the estimates for a breakpoint is moving outside the range of offer during the fitting procedure. (You can deduce this by typing seg.lm.fit at the prompt and fighting your way through the code.) You can work around this by specifying psi, the starting values for the breakpoints. I also set seg.control = list(stop.if.error = FALSE) to try to work around the problem, but that didn't help.

I reran your model with psi = c(15) and it worked, but with psi = c(15, 25) I got an iteration-count-exceeded error message, which I could not overcome even by setting the maximum number of iterations to 1,000. I would take this to mean that one breakpoint is all you're going to be able to estimate with this function and data.

As an extra note, if you plot(sqrt(demand) ~ offer), the relationship looks a lot closer to linear than demand ~ offer does, so you might want to try a transform of the data rather than a piecewise linear model.
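For reference, a minimal sketch of the segmented workflow described above, run on the question's x/y data rather than demand/offer (which is not shown here); the starting value psi = 600 is an assumption, chosen by eyeballing the scatterplot:

```r
library(segmented)

# Data from the question above
x <- c(320, 419, 650, 340, 400, 800, 300, 570, 720, 480, 425, 460, 675,
       600, 850, 920, 975, 1022, 450, 520, 780)
y <- c(4.5, 4.3, 2.57, 4.40, 4.52, 1.39, 4.15, 3.55, 2.49, 4.27, 4.42,
       4.10, 2.21, 2.90, 1.42, 1.50, 1.45, 1.7, 4.6, 3.8, 1.9)

base <- lm(y ~ x)
# psi gives the starting value(s) for the breakpoint(s); 600 is a guess here
seg <- segmented(base, seg.Z = ~ x, psi = 600)
seg$psi  # estimated breakpoint with its standard error
```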