MATLAB: Simple function constrainting a polyfit with lsqlin is giving strange fits

constraintleast-squarelsqlinMATLABOptimization Toolboxpolyfit

function [P] = polyfit_cnstred(t,y,n)
%POLYFIT_CNSTRED OUTPUT: Polynomial coefficients of the polynomial fit that
% is constrainted to pass through the first value of Y. The function LSQLIN
% uses the optimization toolbox.
C       = [ ];
Aeq     = [ ];
for i = n:-1:0
    C   = [ C   t.^i    ];
    Aeq = [ Aeq t(1).^i ];
end
d       =   y;
% x       =   polyfit(t,y,n);
A       = [ ];
b       = [ ];
beq     = [ y(1)    ];
P       =   lsqlin(C,d,A,b,Aeq,beq);
end

But the result seems to be wrong, or maybe I haven't understood how the lsqlin works. The plot i get is shown in the figure. As explained in the code of the function, the polyfit is constrained to pass through the first value of the data. My guess was that the fitted curve would continue to pass through the data points, trying to minimize the distance between the fitted curve and the data points. The result is obviously different from my guess.

Any suggestions or explenations? I don't have extensive experience in matlab so maybe there's something wrong with the code.

Best Answer

Is this homework? It seems more sophisticated than the classical homework assignment, so I will guess not. But, sorry, just throwing lsqlin at this is a bad idea.

My gut tells me a big part of your problem MAY be that you are using too high an order polynomial. What is n here? Actually though, in context, n need not be that very large. Why not? Because you will have MASSIVE numerical problems, even for small n. That is, n is on the order of 7.3e5. Now, look at the matrix you will create for n as small as 3.

You are raising x to the 3rd power in one column of C. But the last column of C is all ones. Something you need to understand about linear algebra and double precision arithmetic, is that things go to hell when your numbers vary immensely. But worse, all values of x are almost identical. So the columns of x all look very much alike. Fairly close to being constant all down the columns.

You have not attached your data. But try computing the rank of C. It will be a seriously large number, even for small n.

So, how can you solve this? Actually, the answer is trivial, as long as you are willing to do some transformations.

And, I see you have attached your data! ... Just a second ...

[t,y]
ans =
     737088                      67.2
     737092                      67.2
     737094                      68.4
     737095                      67.2
     737100                      67.2
     737104                      70.5
     737105                      68.3
     737109                      69.8
     737111                      70.4
     737112                        69
     737113                      70.5
     737115                      68.9
     737116                      68.4
     737117                      70.4
     737119                        71
     737120                        71
     737121                        71
     737122                      69.7
     737123                      70.7
     737125                      70.7

And, it looks like n is 1 in the A.mat file? Gosh, how easy can this be? With n==1, the problem is trivial.

For n==1, the problem is of the form

y = a*t + b

but you wish to constrain it so that y(737088)=67.2, and fit exactly that point with no error. Assuming you have a recent release of MATLAB, do this:

C = t.^[1 0];
Aeq = [t(1),1];
beq = y(1);
coef = lse(C,y,Aeq,beq)
coef =
      0.0994068704175685
       -73204.4113023447
plot(t,y,'ro',t,coef(1)*t + coef(2),'-b')

Yes, I used my own lse code, downloadable from the file exchange. It looks like lsqlin had some numerical problems, probably due to the poor scaling of your variables, and gave up too easily.

Download LSE here

LSE will be less sensitive. But even so, larger values of n will cause serious hiccups. For example, suppose you tried n==3?

n = 3;
C3 = t.^[n:-1:0];

I need only look at the first two rows of C. In fact, every row is almost identical. That will play hell with any least squares fit.

C3(1:2,:)
ans =
    4.00458966738665e+17  543298719744    737088      1
    4.00465486358683e+17  543304616464    737092      1

And you can see that the numerical rank of C3 is only 2.

rank(C2)
ans =
     2

In fact, had you even tried to fit this with a quadratic polynomial, it would fail miserably. The rank of C2 is 2, but it has 3 columns. It looks like lsqlin died even when n==1.

Now, could you have solved a similar problem, for larger n? Well, yes. In fact, the problem is solvable trivially, even without lse or lsqlin.

Suppose we wanted to find the cubic polynomial fit, but one that is forced to pass through (t(1),y(1))?

The trick is to recognize that a polynomial with no constant term passes through the origin. That helps us, because we also need to deal with those HUGE numbers on the t axis. Raising numbers like 7.3e5 to even moderately small powers will cause numerical problems.

So transform the problem. Fit this polynomial model instead:

y - y(1) = a*(t-t(1))^3 + b*(t-t(1))^2 + c*(t-t(1))

Look what happens when t==t(1)? The right hand side is zero, identically so. That means at that point, we will get y==y(1).

So, how will we do that fit?

that = t - t(1);
yhat = y - y(1);
abc = that.^(3:-1:1)\yhat
abc =
   -6.42178429935585e-05
      0.0031802721610219
      0.0638830885069514

Note that MATLAB did not complain at all. That is because the matrix is now well behaved.

that.^(3:-1:0)
ans =
         0           0           0           1
        64          16           4           1
       216          36           6           1
       343          49           7           1
      1728         144          12           1
      4096         256          16           1
      4913         289          17           1
      9261         441          21           1
     12167         529          23           1
     13824         576          24           1
     15625         625          25           1
     19683         729          27           1
     21952         784          28           1
     24389         841          29           1
     29791         961          31           1
     32768        1024          32           1
     35937        1089          33           1
     39304        1156          34           1
     42875        1225          35           1
     50653        1369          37           1
cond(that.^(3:-1:0))
ans =
        94939.1485014821

So not poorly conditioned, as things go. Plot the fit:

plot(t,y,'ro',t,polyval([abc;y(1)],t-t(1)),'-b')

So, I had to change how the model is evaluated. polyval can still do it, but with a twist, that of subtracting off t(1) from t, and by putting the constant term back into the polynomial coefficients.

Related Solutions

MATLAB: How to use lsqlin for linear contraint through points and constrained derivative at those points

A general curve fit, IF the curve is nonlinear, will not always be trivial to enforce a constraint. You would need to use a tool like fmincon.

But if you are asking how to fit a polynomial, as those links talk about, then is there some (ANY?) rational reason why you think you need to use a polynomial for this?

The most common reason is you simply don't know of any other fitting tool. The answer to that is to use my SLM toolbox , which lets you do EXACTLY as you desire, plus, it has many more abilities.

Yes, it is trivial to use lsqlin for this purpose. But a sufficiently high order polynomial to give you any real flexibility will be too high an order to cause problems in the fit. High order polynomials are NEVER a good choice.

Do I really need to show you how to solve this using a polynomial fit? Sigh. It is just a bad idea in general. Use SLM instead. Really. You WILL be happier with the result.

Assume the polynomial has order (highest degree) n, and that at some given point a, f(a)=v, and f'(a)=d.

I'll make up some random data (crap data, but who cares?) This is just an example. I'll do it as a quadratic polynomial.

x = rand(10,1);
y = rand(10,1);
n = 2;
a = 0.5;
v = 0.6;
d = 0.1;
p = n:-1:0;
M = bsxfun(@power,x(:),p);
Aeq = [a.^p;p.*a.^max(0,p-1)];
Beq = [v;d];
coef = lsqlin(M,y(:),[],[],Aeq,Beq);
coef
coef =
     1.0934
   -0.99337
    0.82334

Do ya trust me?

polyval(coef,a)
ans =
        0.6
polyval(polyder(coef),a)
ans =
        0.1

Got lucky I guess. Even so? Use SLM. You WILL be happier with the result.

MATLAB: How do you constrain values in x using lsqlin

 Aeq=zeros(size(x));
 Aeq(1:7)=1;
 beq=0;

Best Answer

Related Solutions

MATLAB: How to use lsqlin for linear contraint through points and constrained derivative at those points

MATLAB: How do you constrain values in x using lsqlin

Related Question