Solved – Fit a non-linear exponential model using R

lm, nonlinear regression, regression

I have interest rate data $r_k$ at given maturity times $k$, and the problem I want to solve is:

$\min_a\sum(r(a,k)-r_k)^2$,

where $a=(a_0,a_1,a_2,a_3)$ is a vector of parameters and $r(a,k)=a_0+a_1 e^{-k/a_3}+a_2 ke^{-k/a_3}$.

The minimization problem is indeed a least-squares problem.
My question is: how can I fit my data $r_k$ to the model $r(a,k)$ using R?
I know that for a polynomial fit of degree 3, for example, you use lm(y~poly(x,3)). Is there an easy way to do the same for $r(a,k)$? It seems quite messy to minimize it by hand and find the coefficients that way.

Best Answer

The procedure you will want to use is nonlinear least squares. This is somewhat less straightforward than linear regression but a lot of the basic intuition more-or-less carries over, with certain caveats.

The minimization problem must be solved iteratively; you supply starting values for the parameters (or a function that will do so), and in some cases derivatives (though with some software and on some problems it may not be necessary as numerical approximations may suffice).

The usual algorithms "head downhill"*, finding directions that should reduce the sum of squares of residuals rapidly. There may be multiple algorithms available. Sometimes the behavior of the problem is such that you have to help it along: sometimes reparameterization is useful; sometimes it helps to separate the model into parts that are conditionally linear and parts that are not, and to work with each separately (some algorithms are even designed to exploit this more directly); and sometimes you may just need to identify a better starting place.

* in some fashion, at least. I'm not going to go into detail about how nonlinear least squares algorithms pick which direction to head; for sufficiently nice functions they tend to converge quite quickly even from starting points far from the optimum, but you don't always have nice functions.
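As a rough illustration of the "conditionally linear" idea: in a model of the form $a_0+(a_1+a_2k)e^{-k/a_3}$, only $a_3$ enters nonlinearly, so for any fixed $a_3$ the other three coefficients can be found with lm. The sketch below profiles the residual sum of squares over a grid of $a_3$ values and keeps the best one, which also gives a data-driven set of starting values. The data are simulated, and the "true" parameter values, noise level, grid, and the helper name fit_given_a3 are all assumptions made only so the code runs. (nls also offers algorithm = "plinear", which is built around the same idea.)

```r
# Simulated data for illustration only; the parameter values are made up.
set.seed(1)
k  <- seq(0.25, 30, by = 0.25)
rk <- 0.04 + (-0.02 + 0.01 * k) * exp(-k / 2) + rnorm(length(k), sd = 1e-4)

# For fixed a3 the model is linear in a0, a1, a2, so lm() handles that part.
fit_given_a3 <- function(a3) {
  x1 <- exp(-k / a3)
  x2 <- k * exp(-k / a3)
  lm(rk ~ x1 + x2)
}

# Profile the residual sum of squares over a grid of candidate a3 values.
a3_grid <- seq(0.5, 10, by = 0.1)
rss <- sapply(a3_grid, function(a3) sum(resid(fit_given_a3(a3))^2))

# Keep the grid value with the smallest RSS and its linear coefficients.
a3_best <- a3_grid[which.min(rss)]
lin <- coef(fit_given_a3(a3_best))
start_vals <- list(a0 = lin[["(Intercept)"]], a1 = lin[["x1"]],
                   a2 = lin[["x2"]], a3 = a3_best)
print(unlist(start_vals))
```

The resulting list has the shape expected by the start argument of nls, so it could be passed on as a starting point rather than treated as a final answer.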

[There are many posts on CrossValidated on the nonlinear least squares topic, though there may be none with your exact problem.]

In R the command for nonlinear least squares is nls. The model is specified in a somewhat similar fashion to lm, with one important difference. In lm or glm, unless you have an offset, each predictor variable you name comes with a parameter to be estimated, so it's possible to refer to the parameters by their associated variable. In nls the parameters are all explicitly named in the function you want to fit, and a name in the formula that's not available as a variable is assumed to be a parameter to be estimated.
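To make the naming rule concrete, the small sketch below lists the names in a formula that are not columns of the data. It only mirrors the rule described above using base R's all.vars; it does not call nls or reproduce its internals. The column names rk and k, and the placeholder values in the data frame, are assumptions for illustration.

```r
# The question's model, written with assumed variable names rk and k.
model <- rk ~ a0 + (a1 + a2 * k) * exp(-k / a3)

# Hypothetical data frame; only its column names matter here.
dat <- data.frame(k = c(1, 2, 5), rk = c(0.030, 0.032, 0.035))

# Names in the formula that are not columns of the data:
# these are the ones that would need starting values.
params <- setdiff(all.vars(model), names(dat))
print(params)
```

A misspelled column name would show up in this list as an extra "parameter", which is a common source of confusing errors when fitting.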

The model formula would be written as rk ~ a0+(a1+a2*k)*exp(-k/a3). You would supply a data frame with rk and k in it (the two variables could instead already be on the search path, but that would be an unusual way to do it), and you'd have to supply start values for a0, a1, a2 and a3. I'm guessing that for your particular application "typical" values could probably be supplied and may well suffice, at least for some parameters, though in some cases you may need to refer to the data to generate reasonable guesses.
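Put together, a call along those lines might look like the sketch below. Since the original data aren't available, it uses simulated data; the "true" parameter values, the noise level and the starting values are all made-up assumptions, chosen only so the example runs, and real interest rate data may need different starting guesses.

```r
# Simulated data for illustration only; the parameter values are made up.
set.seed(1)
dat <- data.frame(k = seq(0.25, 30, by = 0.25))
dat$rk <- 0.04 + (-0.02 + 0.01 * dat$k) * exp(-dat$k / 2) +
  rnorm(nrow(dat), sd = 1e-4)

# Starting values are rough guesses, not estimates taken from real data.
fit <- nls(rk ~ a0 + (a1 + a2 * k) * exp(-k / a3),
           data  = dat,
           start = list(a0 = 0.03, a1 = -0.01, a2 = 0.008, a3 = 1.5))

print(coef(fit))            # estimated a0, a1, a2, a3
rss <- sum(resid(fit)^2)    # the minimized sum of squared residuals
```

From here, summary(fit) reports standard errors for the estimates, and predict(fit, newdata = ...) evaluates the fitted curve at other maturities. If nls stops with a convergence error, the starting values are usually the first thing to revisit.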
