MATLAB: Processing experimental data with polyfit

polyfit

I have this code. It works, but with some warnings, and it doesn't look like the other code I see here because I am not a programmer. It is very helpful, though, and I want to share it on the File Exchange. So would you please share your knowledge to make it look and function better? Thanks.
% this file reads data, extrapolates and generates fit curves with given degree of polyfit.
clear all
clc
D=22;% degree of polyfit
Y_axis=xlsread('F:\KT312',1,'b5:v42');
X_axis=xlsread('F:\KT312',1,'a5:a42');
for i=1:size(Y_axis,2)
Ys0=Y_axis(:,i);
[x,~]=find(isfinite(Ys0));
Xs=X_axis(x);
Ys=Ys0(x);
Xs2=linspace(Xs(1),Xs(end),100)';% extrapolation of the curve for a better precision
Ys2=spline(Xs,Ys,Xs2);
fitYs=polyfit(Xs2,Ys2,D);
Ysfit=zeros(size(Xs2));
for j=1:length(fitYs)-1
Ysfit(:,j)=fitYs(j)*Xs2.^(length(fitYs)-j);
end
Ysfit=sum(Ysfit,2)+fitYs(length(fitYs));
Ys2data(:,i)=Ys2;
Ysfit_data(:,i)=Ysfit;
Xs2_data(:,i)=Xs2;
end
clearvars -except Ys2data Ysfit_data Xs2_data
display 'done'

Best Answer

You asked for feedback...
So what does your tool offer that is innovative that someone might gain some benefit from?
  1. It reads in data. Hard-coded calls to read in data are generally terrible programming style. If someone else has their data in a different place, they need to modify your code, which is a bad idea.
  2. It tests for finite data, discarding infs. A good idea, I suppose. This is something anyone should do and know how to do in advance. Note that isfinite is also false for NaNs, so those are discarded too, which matters because NaNs are commonly used to signify missing data.
  3. It uses linspace to "extrapolate" the data. WRONG. There is no extrapolation done in your code: linspace only generates points between the first and last x values, so what follows is interpolation. And since it uses a fixed number of points (100), the resulting grid may actually be coarser than the original data, in case the real data were sampled more finely.
  4. You interpolate the data with a spline, and then you use polyfit to approximate the spline? This is a really bad idea, since the spline will very possibly introduce artifacts into your data. Splines can often create ringing artifacts. Any noise in the data will be exaggerated by the interpolation.
  5. It fits the curve using polyfit, but it fits a polynomial of degree 22? This is literally insane. A polynomial of that high a degree is almost always going to result in garbage fits, and it is especially bad when the polynomial is fit to arbitrarily scaled data, because the powers of x then span many orders of magnitude and the least-squares problem becomes badly ill-conditioned. polyfit is a useful tool to fit a straight line. But once you get past a quadratic, or maybe a cubic, you may be using the wrong tool, for the wrong reasons.
  6. Finally, it computes something called Ysfit in a loop, raising Xs2 to powers as large as 22. As far as I can tell, that loop just evaluates the fitted polynomial on the 100-point grid, which is what polyval is for. Given how the coefficients were obtained, the result has no mathematical or statistical value.
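To make items 2 through 6 concrete, here is a minimal sketch of the alternative they point toward. The data are synthetic stand-ins, not the poster's spreadsheet, and the variable names and the choice of degree 2 are assumptions made only for illustration.

```matlab
% Synthetic stand-in for one measured column (hypothetical data).
x = (0:0.5:10)';
y = 2 + 0.5*x - 0.03*x.^2;
y([4 9]) = NaN;                      % pretend two readings are missing

% isfinite is false for NaN, Inf and -Inf, so one logical mask removes them all.
keep = isfinite(x) & isfinite(y);
xk = x(keep);
yk = y(keep);

% Fit a low-degree polynomial directly to the measured points (no spline step).
% Requesting the third output makes polyfit center and scale x by
% mu = [mean(xk); std(xk)], which keeps the problem well conditioned.
degree = 2;
[p, ~, mu] = polyfit(xk, yk, degree);

% Evaluate with polyval instead of a hand-written loop over powers.
xq = linspace(xk(1), xk(end), 100)'; % grid inside the data range: interpolation
yq = polyval(p, xq, [], mu);

% For comparison: the unscaled degree-22 design matrix that item 5 warns about.
V22 = xq .^ (22:-1:0);
condV22 = cond(V22);                 % enormous, so those coefficients mean little
```

Because p is expressed in the scaled variable (x - mu(1))/mu(2), it has to be passed to polyval together with mu, as above.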
Some other general comments.
  1. No documentation is provided. So if someone wants to know what you are doing or why, they need to make some sort of educated guess. And if a highly experienced user and writer of code involving splines and other modeling tools has no clue as to what and why you did something, then how in the name of god and little green apples do you expect a novice user to be able to guess what the code does?
  2. You used a script. This alone is bad, since it creates new variables in the base workspace. It steps on variables that may already exist, overwriting the values they had. Of course, since the first line of code clears everything that the user might have had in their workspace, that also is unfriendly. It leaves behind junk in the workspace. LEARN TO USE AND WRITE FUNCTIONS!
  3. The variables in your code have meaningless names, strings of characters that make no sense. Why are meaningful, readable, intelligent variable names important? They make your code easier to read and debug. They make it easier to use. And one day, when someone else in the universe might want to use your code (perhaps you get run over by the crosstown bus and someone else needs to take over and maintain your code base), they can do so. And it may not even be that you are unable to work on your code. Imagine that next month or next year, you need to use and modify this code. Would you have any reason to remember what your code does, and why? I have MATLAB code that is pushing 35 years in age, and is still usable.
I'm sorry, but this script offers essentially no value to others. It does nothing innovative. It chains together many things that are each alone highly suspect. And yes, you will have every right to be upset at my "review" of what you have to offer, but don't kill the messenger just because they tell you something you don't want to hear.
How could you fix it?
  1. First, learn to write and use functions. That would be your most important step in moving beyond novice programming.
  2. Breaking your code into functions means the code is itself easier to use, test, maintain, and modify as needed.
  3. Learn to document what you do. My recommended target is that EVERY line of code is both readable and has a comment attached that explains what it does. Comments are free, but are of incredible value to someone who will maintain the code. Since that target is a difficult one, I would insist that at least every significant block of code (every loop, for example, every test, etc.) has a comment explaining the purpose of the block, as well as an explanation in clear words of how it was done, in case there is anything of significance. Essentially, if it took you more than a minute to write a code fragment, then it should have an explanatory comment attached to that fragment!
  4. Learn to use intelligent variable names. They make your code easy to read and follow.
  5. Learn to write help for the functions you provide. The help should explain what the inputs to those functions are. It should explain what the function returns, what it does.
  6. Advice that I often give out is to PLOT EVERYTHING. You plotted nothing. When you provide a plot, you provide visual feedback to the user. Did this code do something reasonable? Or is it insanity poured into a blender?
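As one possible illustration of the six points above, the script could be turned into a function along the following lines. This is a rough sketch, not a finished File Exchange submission: the name fitColumns, its argument list and its outputs are all hypothetical, and it assumes the data have already been read into a vector x and a matrix Y by the caller.

```matlab
function [coeffs, mu, xGrid, yFit] = fitColumns(x, Y, degree, nGrid)
%FITCOLUMNS Fit a low-degree polynomial to each column of Y (illustrative sketch).
%   [coeffs, mu, xGrid, yFit] = fitColumns(x, Y, degree, nGrid)
%
%   Inputs
%     x      - N-by-1 vector of abscissa values
%     Y      - N-by-M matrix, one measured curve per column; NaN/Inf are skipped
%     degree - polynomial degree (keep it small, e.g. 1 to 3)
%     nGrid  - number of evaluation points per curve
%
%   Outputs
%     coeffs - M-by-(degree+1) coefficients in the scaled variable (x-mu(1))/mu(2)
%     mu     - 2-by-M centering and scaling values returned by polyfit
%     xGrid  - nGrid-by-M evaluation points, inside each curve's data range
%     yFit   - nGrid-by-M fitted values

nCurves = size(Y, 2);
coeffs  = zeros(nCurves, degree + 1);   % preallocate instead of growing in the loop
mu      = zeros(2, nCurves);
xGrid   = zeros(nGrid, nCurves);
yFit    = zeros(nGrid, nCurves);

for k = 1:nCurves
    keep = isfinite(x) & isfinite(Y(:, k));   % drop missing readings
    xk = x(keep);
    yk = Y(keep, k);
    [p, ~, m] = polyfit(xk, yk, degree);      % centered and scaled fit
    coeffs(k, :) = p;
    mu(:, k)     = m(:);
    xGrid(:, k)  = linspace(min(xk), max(xk), nGrid)';
    yFit(:, k)   = polyval(p, xGrid(:, k), [], m);
end

% Plot data and fits together so the result can be checked by eye.
figure; hold on
plot(x, Y, 'o');
plot(xGrid, yFit, '-');
xlabel('x'); ylabel('y');
title(sprintf('Polynomial fits, degree %d', degree));
hold off
end
```

Reading the file is deliberately left to the caller, so the path is never hard-coded inside the function, and all intermediate variables stay in the function's own workspace instead of the base workspace.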
I'll admit there is a lot of similar random stuff posted on the file exchange. But does what you have written improve on anything someone would find? How? Why? What is innovative about what you did? What did you do that would make the life of someone easier? Or would someone need to spend more time trying to figure out what your code does and how to modify it, than it would take to just rewrite the mess from scratch?
Again, I know you won't be happy to hear what I have written. You asked for feedback.