MATLAB: Make code faster for import and array creation

MATLAB, plotting, sorting

Dear All,
Here is a bit of code that I wrote to import a bunch of XY data in CSV form, extract the first column from every file and store it as an array (mydata_X), then extract the second column from every file and store it as an array (mydata_Y). Then I made a third array that will order all the files for plotting (mydata_Z).
It works, but it's unbelievably slow. My question is, how can I make it faster?
% bring the data in from CSV format into a cell array
csvFiles = dir('*.csv');      % works for the current directory only; try to keep only the data you want in the dir
numfiles = length(csvFiles);  % finds the total number of files in the dir
mydata = cell(1,numfiles);    % one cell per file
% save each XY data file into a cell of a cell array called 'mydata'
for k = 1:numfiles
    mydata{k} = importdata(csvFiles(k).name);
end
% Now I need to generate the X, Y, and Z data
for i=1:numfiles
    mydata_X(:,i) = [mydata{i}(:,1)] % get data from the first column from all cells in mydata
end
for i=1:numfiles
    mydata_Y(:,i) = [mydata{i}(:,2)] % get data from the second column from all cells in mydata
end
for i=1:numfiles
    arrayOfZ = 1:numfiles
    mydata_Z(i,:) = arrayOfZ(1,:) % here I assign 1 through 'numfiles' to offset all spectra
end
% from here I can mesh(mydata_X, mydata_Y, mydata_Z)
Any help would be greatly appreciated!!
Jenna

Best Answer

Hi Jenna, I bet this is a pre-allocation issue.
When you build a big matrix one column at a time (such as is being done for mydata_X and mydata_Y), MATLAB needs to do some juggling of memory to make the variable a little bit bigger on each iteration. If you know how big the end result will be before you start building your matrix, you can initialise the matrix to this size just once, and MATLAB no longer has the overhead of juggling memory space inside the loop.
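As a minimal sketch of that pre-allocation idea: the snippet below uses synthetic data in place of the imported CSV files, and it assumes every file holds the same number of rows (here called N, a name not in the original code). The trailing semicolons are also worth having, since without them MATLAB echoes the growing matrix to the Command Window on every pass.

```matlab
% Synthetic stand-in for the imported files (assumption: every file is N-by-2).
numfiles = 5;
N = 100;
mydata = cell(1, numfiles);
for k = 1:numfiles
    mydata{k} = rand(N, 2);
end

% Allocate the full result once, before the loop.
mydata_X = zeros(N, numfiles);
mydata_Y = zeros(N, numfiles);

% Fill the columns in place; the semicolons suppress printing
% the whole matrix on every iteration.
for k = 1:numfiles
    mydata_X(:, k) = mydata{k}(:, 1);
    mydata_Y(:, k) = mydata{k}(:, 2);
end
```

With real files, N could be taken from the first imported matrix, for example size(mydata{1}, 1), provided all files really are the same length.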
That said, there might even be a sneakier way of avoiding the loops you're using. Let's say you have already read in all the files:
% bring the data in from CSV format into a cell array
csvFiles = dir('*.csv');      % works for the current directory only; try to keep only the data you want in the dir
numfiles = length(csvFiles);  % finds the total number of files in the dir
mydata = cell(1,numfiles);    % one cell per file
% save each XY data file into a cell of a cell array called 'mydata'
for k = 1:numfiles
    mydata{k} = importdata(csvFiles(k).name);
end
At this point, is the matrix of every cell in mydata the same size? It seems that it should be, based on the rest of your code... If it is, you can make a 3D matrix with one "sheet" for each cell in mydata like this:
myData3d = cat(3, mydata{:});
And then you can get mydata_X like this:
mydata_X = myData3d(:,1,:);
Now, mydata_X here will be an N-by-1-by-NFILES matrix, whereas I think from your code that your original mydata_X ended up as an N-by-NFILES matrix. If you want to get the same as your original, you can just use reshape or permute:
mydata_X = permute(myData3d(:,1,:), [1 3 2]);
The same can be done for mydata_Y.
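Putting the cat and permute steps together for both columns might look like the sketch below. It again uses synthetic matrices instead of imported files (an assumption made so the snippet runs on its own), with the second column of file k filled with the value k so the result is easy to inspect.

```matlab
% Synthetic stand-in for the imported files (assumption: all are N-by-2).
numfiles = 4;
N = 6;
mydata = cell(1, numfiles);
for k = 1:numfiles
    mydata{k} = [(1:N)', k * ones(N, 1)];
end

% Stack the matrices along the third dimension: N-by-2-by-numfiles.
myData3d = cat(3, mydata{:});

% Pull out each column and move the file index into the second dimension,
% giving N-by-numfiles matrices like the ones the original loops built.
mydata_X = permute(myData3d(:, 1, :), [1 3 2]);
mydata_Y = permute(myData3d(:, 2, :), [1 3 2]);
```

Note that cat(3, ...) will raise an error if the files do not all have the same size, which is a quick way to find out whether the same-size assumption holds.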
Did this get you going? If it works, it will be much faster than the loop that builds matrices one column at a time.
--
And also, the last loop you have:
for i=1:numfiles
    arrayOfZ = 1:numfiles
    mydata_Z(i,:) = arrayOfZ(1,:) % here I assign 1 through 'numfiles' ...
end
Can be replaced by:
mydata_Z = repmat(1:numfiles, numfiles, 1);
Although I'm not quite sure what purpose this variable serves... it will be of a different size to mydata_X and mydata_Y ...
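A small sketch to check both points, using made-up sizes. The first part confirms that the repmat call gives the same matrix as the loop. The second part is only a guess at the intent: if mydata_Z is supposed to line up element by element with mydata_X and mydata_Y (for example as a per-file offset), it would need N rows rather than numfiles rows. N is a hypothetical name for the number of rows per file.

```matlab
numfiles = 4;   % made-up number of files
N = 6;          % hypothetical number of rows per file

% Loop version from the question, pre-allocated and with output suppressed.
Z_loop = zeros(numfiles, numfiles);
for i = 1:numfiles
    Z_loop(i, :) = 1:numfiles;
end

% Vectorised replacement: numfiles-by-numfiles, identical to the loop result.
Z_repmat = repmat(1:numfiles, numfiles, 1);
sameAsLoop = isequal(Z_loop, Z_repmat);

% Assumed variant: N-by-numfiles, the same size as mydata_X and mydata_Y.
mydata_Z = repmat(1:numfiles, N, 1);
```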