[Math] How to fit data to a piecewise function

calculusleast squareslinear algebrapiecewise-continuitystatistics

My question today regards a set of data that I wish to fit a piecewise-defined continuous function. This data set covers a domain of x-values from $0$ to $\mu$ on the x-axis.

What I need is to determine a value $x_0 \in {\rm I\!R}+$ and two functions of given forms $f_1: [0, x_0] \mapsto {\rm I\!R}$ and $f_2: [x_0, \mu] \mapsto {\rm I\!R}$ such that $f_1$ and $f_2$ are both continuous and $f_1(x_0) = f_2(x_0)$ and $f_1 \cup f_2$ is the piecewise function of this type which best fits the data.

Can I please have some information/instruction on the background theory regarding how to do this? Would linear least squares and elementary multivariable calculus be enough given that $f_1$ and $f_2$ must both be smooth on $(0, x_0)$ and $(x_0, \mu)$ respectively?

In this case, I need to find $f_1 = ax^\frac{3}{2}$ for some $a \in {\rm I\!R}$ and $f_2 = bx + c$ for some $b, c \in {\rm I\!R}$ such that $b \in [-d,d]$ for some given $d \in {\rm I\!R}$; so the goal is essentially to find a piecewise function consisting of a power law curve and a line that is nearly horizontal which best fits the data I am looking at.

I would very much appreciate help on this problem if you would be so kind.

Best Answer

I don't know what software package you might want to use but here is one of many ways to estimate the parameters using R.

First note that for the resulting function to be continuous you need

$$a x_0^{3/2} = b x_0 + c$$

So we could take $c = a x_0^{3/2} - b x_0$.

# Generate data from a segmented model and plot it
  a = 2
  b = 0.125
  x = c(0:50)/10
  x0 = 2
  c = a*x0^(3/2) - b*x0
  set.seed(12345)
  y = (x<x0)*a*x^(3/2) + (x>=x0)*(b*x+c) + rnorm(51,0,0.5)
  plot(x,y, las=1)

# Function to calculate sum of squared distance from observed value
  ss = function(parameters, y) {
    a = parameters[1]
    b = parameters[2]
    x0 = parameters[3]
    c = a*x0^(3/2) - b*x0
    sum((y - ((x<x0)*a*x^(3/2) + (x>=x0)*(b*x + a*x0^(3/2) - b*x0)))^2)
  }

Now with some starting values (and good starting values are very valuable) find the estimates that minimize the sum of squares:

sol = optim(c(1.5,0.1,1.5), ss, y=y)
sol$par
#[1] 2.05963514 0.04598089 2.01290367

Plot the data and the fit:

Data and fit

Related Question