I got bitten by this recently. My intention was the same: build an artificial model and test it. The main reason is the one given by @whuber and @marco: such a model is not identified. To see this, remember that NLS minimizes the function:
$$\sum_{i=1}^n(y_i-a-br^{x_i-m}-cx_i)^2$$
Say it is minimized by the set of parameters $(a,b,m,r,c)$. It is not hard to see that the set of parameters $(a,br^{-m},0,r,c)$ gives the same value of the function being minimized. Hence the model is not identified, i.e. there is no unique solution.
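A quick numerical check makes the non-identification concrete (the parameter values below are assumed, purely for illustration): both parameter sets produce identical fitted values.

```python
import numpy as np

# Illustrative (assumed) parameter values for the model a + b*r^(x-m) + c*x.
a, b, m, r, c = 1.0, 2.0, 3.0, 1.5, 0.5
x = np.linspace(0.0, 10.0, 50)

f1 = a + b * r ** (x - m) + c * x        # parameters (a, b, m, r, c)
f2 = a + (b * r ** -m) * r ** x + c * x  # parameters (a, b*r^-m, 0, r, c)

print(np.allclose(f1, f2))  # True: the two fits are indistinguishable
```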
It is also not hard to see why the gradient is singular. Denote
$$f(a,b,r,m,c,x)=a+br^{x-m}+cx$$
Then
$$\frac{\partial f}{\partial b}=r^{x-m}$$
$$\frac{\partial f}{\partial m}=-b\,r^{x-m}\ln r$$
and we get that for all $x$
$$b\ln r\frac{\partial f}{\partial b}+\frac{\partial f}{\partial m}=0.$$
Hence the matrix
\begin{align}
\begin{pmatrix}
\nabla f(x_1)\\
\vdots\\
\nabla f(x_n)
\end{pmatrix}
\end{align}
will not be of full rank, and this is why `nls` gives the "singular gradient" message.
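The rank deficiency can be checked numerically. A sketch (with assumed parameter and design values) that builds the Jacobian column by column:

```python
import numpy as np

# Assumed parameter and design values; f(a,b,r,m,c,x) = a + b*r^(x-m) + c*x.
a, b, m, r, c = 1.0, 2.0, 3.0, 1.5, 0.5
x = np.linspace(0.0, 10.0, 50)
g = r ** (x - m)

# Jacobian with one row per observation and one column per parameter.
J = np.column_stack([
    np.ones_like(x),                 # df/da
    g,                               # df/db
    b * (x - m) * r ** (x - m - 1),  # df/dr
    -b * np.log(r) * g,              # df/dm
    x,                               # df/dc
])

# The df/dm column is -b*ln(r) times the df/db column, so rank is 4, not 5.
print(np.linalg.matrix_rank(J))
```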
I spent over a week looking for bugs elsewhere in my code before I noticed that the main bug was in the model :)
Non-linear least squares solves $\min_\beta \sum (y_i-f(x_i;\beta))^2$. This objective is quadratic in $\beta$ if $f$ is linear in $\beta$. Your $f$ is not linear in $\beta$, so the NLS objective function is not quadratic in $\beta$. Of course, you don't need the objective to be quadratic to guarantee convergence to a unique minimum; rather, you need $\sum (y_i-f(x_i;\beta))^2$ to be convex in $\beta$. Presumably, with your $f$, the NLS objective function is not convex; it doesn't look to me like the kind of $f$ which generates a convex objective function. That's pretty much the explanation: you can have many local minima rather than a single one.
If I were fitting the function that you are, I would use an entirely different approach rather than blindly applying NLS. If you look carefully at your function, $f(x_i;\beta)=a x_i^2 e^{-b x_i}+c$, it is almost linear in the parameters. If you fixed $b$ at some value, say 0.1, then you could fit $a$ and $c$ by OLS:
\begin{align}
y_i &= a x_i^2 e^{-0.1 x_i}+c \\
&= a z_i+c
\end{align}
The variable $z_i$ is defined as $z_i=x_i^2 e^{-0.1 x_i}$. This means that, once you have picked $b$, the optimal value of $a$ is $\widehat{\operatorname{Cov}}(y,z)/\widehat{\operatorname{Var}}(z)$ and the optimal value of $c$ is $\bar{y}-a\bar{z}$.
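As a sketch on assumed synthetic data, the closed-form OLS recovery of $a$ and $c$ for fixed $b=0.1$:

```python
import numpy as np

# Assumed synthetic data with b fixed at its "known" value 0.1.
rng = np.random.default_rng(0)
a_true, c_true = 2.0, 1.0
x = np.linspace(0.5, 20.0, 100)
z = x ** 2 * np.exp(-0.1 * x)
y = a_true * z + c_true + rng.normal(0.0, 0.01, x.size)

# OLS formulas: a = Cov(y, z) / Var(z), c = mean(y) - a * mean(z).
a_hat = np.cov(y, z)[0, 1] / np.var(z, ddof=1)
c_hat = y.mean() - a_hat * z.mean()
print(a_hat, c_hat)  # close to the true values 2.0 and 1.0
```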
So what, right? At the very least, this is how you should pick starting values for $a$ and $c$. But, really, this reduces the search for the optimal parameters to a one-dimensional search over $b$. With a modern computer, one-dimensional searches are fast and easy. If you have some idea of what reasonable values for $b$ are, you can define an interval $[b_{low},b_{high}]$ and grid search for the $b$ which gives the lowest sum of squared errors. Then use that $b$ and its associated optimal $a$ and $c$ as starting values for NLS.
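A minimal sketch of that grid search in Python (the data, interval, and grid resolution are my assumptions):

```python
import numpy as np

# Assumed synthetic data from y = a*x^2*exp(-b*x) + c plus noise.
rng = np.random.default_rng(1)
a_true, b_true, c_true = 2.0, 0.3, 1.0
x = np.linspace(0.5, 20.0, 200)
y = a_true * x ** 2 * np.exp(-b_true * x) + c_true \
    + rng.normal(0.0, 0.05, x.size)

def profiled_sse(b):
    """Residual sum of squares after fitting a and c by OLS for this fixed b."""
    z = x ** 2 * np.exp(-b * x)
    a = np.cov(y, z)[0, 1] / np.var(z, ddof=1)
    c = y.mean() - a * z.mean()
    return np.sum((y - a * z - c) ** 2)

grid = np.linspace(0.05, 1.0, 400)  # assumed interval [b_low, b_high]
b_best = grid[np.argmin([profiled_sse(b) for b in grid])]
print(b_best)  # close to the true b = 0.3
```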
Or, you could do something more sophisticated. Suppose you are searching over $b$, using the optimal $a(b)$ and $c(b)$ from OLS. Then the NLS objective function is $\sum \left(y_i - f(x_i;a(b),b,c(b))\right)^2$. The envelope theorem makes its derivative easy to calculate, since the indirect effects through $a(b)$ and $c(b)$ vanish at the OLS optimum:
\begin{align}
\frac{d}{d b} \sum \left(y_i - f(x_i;\beta)\right)^2 &= -\sum 2\left(y_i - f(x_i;\beta)\right)\frac{\partial}{\partial b}f(x_i;\beta)\\
&= \sum 2\left(y_i - f(x_i;\beta)\right)a x_i^3 e^{-b x_i},
\end{align}
since $\partial f/\partial b = -a x_i^3 e^{-b x_i}$.
So, you can easily write a function to calculate the NLS objective function for any given $b$, and you can easily write a function to calculate its derivative for any $b$. These two ingredients are enough to get an optimizer going on your function. Then, after you find the optimal $b$, just run NLS starting from that $b$ and its associated optimal $a$ and $c$; it will converge in one iteration.
Best Answer
Your problem seems to stem from a typo/mistake in your equation.
Setup:
Let's see what the initial parameter values look like:
This looks OK.
The easiest and most reliable way to fit the model is to use `SSfpl()` again: among other things, this also provides gradients for the fitting procedure. Your equation was `b1 + ((b2-b1)/(1 + exp(b3*(x-b4))))`: this makes two mistakes: (1) it switches the third (`xmid`) and fourth (`scal`) parameters; (2) it multiplies rather than divides by the `scal` parameter. (It also switches the first and second parameters, the lower and upper asymptotes, but that seems relatively harmless.) This works fine:
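The original R code and output are not reproduced here. As a rough, hypothetical Python analogue, fitting the corrected four-parameter logistic (the same parameterization R's `SSfpl` uses) on made-up data:

```python
import numpy as np
from scipy.optimize import curve_fit

def fpl(x, A, B, xmid, scal):
    # Lower asymptote A, upper asymptote B; note that xmid comes third
    # and that we divide (not multiply) by scal.
    return A + (B - A) / (1.0 + np.exp((xmid - x) / scal))

# Made-up data for illustration.
rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 80)
y = fpl(x, 1.0, 5.0, 4.0, 1.2) + rng.normal(0.0, 0.05, x.size)

popt, _ = curve_fit(fpl, x, y, p0=[y.min(), y.max(), np.median(x), 1.0])
print(popt)  # close to (1.0, 5.0, 4.0, 1.2)
```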