[Math] MATLAB curve fitting – least squares method – wrong “fit” using high degrees

least squaresMATLABregression

Anyone here that could help me with the following problem?
The following code calculates the best polynomial fit to a given data-set, that is; a polynomial of a specified degree.

Unfortunately, whatever the data-set may be, usually at degree 6 or higher, MATLAB gets a totally wrong fit. Usually the fit curves totally away from the data in a sort of exponantial-looking-manner downwards. (see the example: degree = 8).

x=[1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5] % experimental x-values
y=[4.3 6.2 10.1 13.5 19.8 22.6 24.7 29.2] % experimental y-values

degree=8; % specify the degree
A = zeros(length(x),degree);
for exponent=0:degree;
for data=1:length(x);
   A(data,exponent+1)=x(data).^exponent; % create matrix A
end;
end;

a=inv((transpose(A)*A))*transpose(A)*y'; % a are the coëfficients of the polynom
a=flipud(a);
fitpolynoom=polyval(a,x);
error=sum((y-fitpolynoom).^2); % calculates the fit-error using the least-squares method
fitpolynoom=polyval(a,x);

figure;
plot(x,y,'black*',x,fitpolynoom,'g-');

error % displays the value of the fit-error in the Matlab command window

Thanks in advance.

Degree 8 - wrong MATLAB polynomial curve fitting

Best Answer

With the least squares method you try to 'solve' an system of linear equations $Ax = b$ for $x$, but if $A$ is not square (thats why you cannot solve exactly) the least square approach is $A^tAx = A^tb$. If you solve this straight forward this can generate huge numericl errors (everything is finde though if you solve it exactly, but Matlab can't do that.). That's why one normally uses the qr decomposition $A=QR$ where $Q$ is orthogonal and $R$ is a upper triangular matrix, so you can solve the least square problem by $Ax = QRx = b$ Since $Q$ is orthonormal you can solve this by $x = R^{-1} Q^t b$. I just applied it to your code and it seems to work very well: plot

Another thing to consider is that polynomials of 'high' degrees tend to oscilate very much even if the points are almost linear. A rule of thumb is that you should expect heavy oscilations f everythiing above degree 7 or 9.

%%
x=[1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5] % experimental x-values
y=[4.3 6.2 10.1 13.5 19.8 22.6 24.7 29.2] % experimental y-values

degree=8; % specify the degree
A = zeros(length(x),degree);
for exponent=0:degree;
for data=1:length(x);
   A(data,exponent+1)=x(data).^exponent; % create matrix A
end;
end;
[q,r] = qr(A);
a=r \ q'*y'; % a are the coëfficients of the polynom
a=flipud(a);

fitpolynoom=polyval(a,x);
error=sum((y-fitpolynoom).^2); % calculates the fit-error using the least-squares method
xx = min(x):0.01:max(x);
fitpolynoom=polyval(a,xx);

figure;
plot(x,y,'black*',xx,fitpolynoom,'g-');

error % displays the value of the fit-error in the Matlab command window
Related Question