MATLAB: Curve fitting error:nlinfit rank deficient

nlinfit

this code is for T1 fitting for phantom
tr=[32 64 128 256 512 1020 2050];
fun1=@(p,x) p(1).*(1-exp(-x./p(2)))+p(3);
si1=[246.672672672673 395.729729729730 605.585585585586 890.261261261261 1221.54654654655 1529.04204204204 1719.57357357357];
p0= [0,si1(end),50];
options = statset('Funvalcheck','off');
beta0 = p0;
[beta1,R,J,CovB,MSE] = nlinfit(tr,si1,fun1,beta0);%nonlinear fitting
and the answer is Rank deficient, rank = 2, tol = 5.904049e-15.
what's wrong?

Best Answer

Whats wrong? Mainly really poor starting values. Not enough data is another good reason, and a model that is not easily fit on data that varies over multiple orders of magnitude.
Other problems? A better algorithm is often a good choice. You might also consider using a weighted nonlinear regression.
Here is my fit, using a better algorithm for this class of problem.
funlist = {@(c,xdata) 1 - exp(-xdata/c), 1};
[INLP,ILP] = fminspleas(funlist,1719,tr,si1)
INLP =
427.15
ILP =
1528.4
175.6
plot(tr,si1,'o')
trev = 32:2050;
hold on
plot(trev,ILP(1)*(1-exp(-trev/INLP)) + ILP(2),'r-')
grid on
The fit here seems reasonable. Still not nearly sufficient data to do the fit well.
I used my own fminspleas (it lives on the file exchange, for user download.) It is better for this kind of problem because it is more robust to poor starting values.
So what is wrong here? Why did nlinfit fail? Look at the initial parameters you chose,
[0, 1719, 50] % start point
[1528.4, 427.8, 175.6] % estimated values
So when I used fminspleas, it has the ability to think about this as a ONE variable nonlinear optimization. p(1) and p(3) are intrinsically linear parameters, that can be estimated conditionally on the value of p(2). So fminspleas is able to succeed even when nlinfit gave up the ghost. But see how far away it had to move for the nonlinear parameter from your starting values. nlinfit has it much harder, since you gave it crap to start with in the estimation.
Could you have found more intelligent starting values? Probably so. Given this model...
fun1=@(p,x) p(1).*(1-exp(-x./p(2)))+p(3);
we see that at x == 0, fun1 will return exactly p(3).
As a lmit, as x--> inf, the function will approach p(1) + p(3).
If we look at the plot of the data, I might have guessed p(3) should have been roughly 100, or maybe even a little higher.. You tried 50. Regardless, that was not a terrible choice for p(3).
But if I had to guess a value for the asymptote as x approaches inf. I might have guessed roughly 1800, just from the plot.
So if we could have predicted
p(3) ~ 100
p(1) + p(3) ~ 1800
then your starting values were clearly way off. p(1) should be something like 1700, just looking at the data.
How about the starting value for p(2)? Lets see. An exponential rate parameter is easy to estimate. What is the point where a negative exponential has died off by say 90%? That is, can we find the value where
exp(-u) = 0.1
Easy.
-log(.1)
ans =
2.3026
So, if I had to guess a start point for p(2), since the exponential has died by about 90% at x==1500, then we can find an very rough initial value for p(2) as
p2init = -1500/log(0.1)
p2init =
651.44
Lets try nlinfit now.
fun1=@(p,x) p(1).*(1-exp(-x./p(2)))+p(3);
p0= [1700,-1500/log(0.1),100];
options = statset('Funvalcheck','off');
beta0 = p0;
[beta1,R,J,CovB,MSE] = nlinfit(tr,si1,fun1,beta0);%nonlinear fitting
beta1
beta1 =
1528.4 427.15 175.6
Amazing. It worked, and did not even complain. ;-)
Better (more) data would be useful. And the model does not seem to fit terribly well, with not really the correct shape. But it is way better than the sharp stick in the eye that you got when you tried it. The biggest lessons you need to take from this are to use a better algorithm to fit exponential models, or if you are going to use an algorithm that cannot handle poor starting vlaues, then choose more intelligent starting values. Otherwise, expect to see crap out the end far too often.
What could you do to improve this mess? For example, if I were your own consultant, I would tell you to get better data first. The process has not yet gone far enough to know where it will level off. For example, let me use a constrained regression spline model to fit your data really well. I'll let it tell me where the data seeem to want to level off,
slm = slmengine(tr,si1,'knots',[0 50 100 200 400 800 1600 3200 5000],'increasing','on','concavedown','on','leftminvalue',0,'plot','on');
I used my SLM toolbox for the fit, providing only information that seemed logical from what I can see. That is, we have a curve that is a monotone increasing function, always non-negative, but always a negatively curved function.
As you can see, the spline model fits quite well compared to the negaive exponential. But it does not seem to know where to level the curve off on the top end. To me, that tells me you need a bit more data on the top end. A little more data down near zero would help too, in order to better resolve the behavior at both ends, as well as to pin down the rate parameter more accurately. It is the data near each end of the curve that provides the most information about p(1) and p(3) in your model, and the data in the middle that are telling the estimation the value of the rate parameter.
Is there a better choice of model? Perhaps. But I don't really know if you had a good reason to choose that model, or if you just thought it might work well. It is ok, but has too much lack of fit for my eyes. One option might be...
ft = fittype('a + b/(1+ exp(-x/c))')
ft =
General model:
ft(a,b,c,x) = a + b/(1+ exp(-x/c))
mdl = fit(tr',si1',ft,'startpoint',[-1000 2900 600])
mdl =
General model:
mdl(x) = a + b/(1+ exp(-x/c))
Coefficients (with 95% confidence bounds):

a = -1204 (-1513, -893.7)
b = 2871 (2477, 3264)
c = 284.1 (187.6, 380.6)
plot(tr,si1,'o'),hold on
plot(mdl)
Sigh. Not really any better in terms of fitting the roll over. Well, that was only one of several classic forms you might have tried. Here is another:
ft = fittype('a + b*erf(x/c)')
ft =
General model:
ft(a,b,c,x) = a + b*erf(x/c)
mdl = fit(tr',si1',ft,'startpoint',[0 1800 1000])
mdl =
General model:
mdl(x) = a + b*erf(x/c)
Coefficients (with 95% confidence bounds):
a = 244.2 (73.16, 415.3)
b = 1405 (1177, 1633)
c = 664.1 (415.2, 913)
plot(mdl)
hold on
plot(tr,si1,'o')
untitled.jpg
Even more so, just too much lack of fit. What it tells me is your data is not rolling over sufficiently well to pin down the shape of that curve.
Still, I'd just use a spline. AND get better/more data where it counts.