MATLAB: For a big matrix, how to accelerate fprintf

fprintf

Hello everyone, I have a 2500-by-1500 matrix and I want to print every column to a text file, 5 numbers per line. Using:
for i = 1:1500
    fprintf(fid, 'This is the %d coefficients\n', i);   % header line for column i
    S = sprintf(' %15.8E %15.8E %15.8E %15.8E %15.8E\n', coeff(:, i));   % 5 values per line
    S(S == 'E') = 'D';   % replace the exponent marker E with D
    fprintf(fid, '%s', S);
end
This takes several seconds. How can I speed it up?

Best Answer

You have a few different speed constraints:
  • the speed of formatting individual numeric items, where you are already using the fastest way
  • the overhead of calling fprintf() and sprintf() multiple times, which could potentially be reduced by formatting everything at one time and then writing it all
  • the cost of substituting 'D' for 'E', which possibly could be done more efficiently (but your current version looks pretty good as-is)
  • the overhead of doing the substitution multiple times, which could potentially be reduced by building the whole output first and then doing the substitution all at once
  • the cost of writing to disk, which you cannot get away from (except to touch up the buffering strategy, perhaps, as Jan shows)
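To make the second and fourth points concrete, here is a minimal sketch of formatting everything first, substituting once, and writing once. It is an illustration rather than a measured improvement: `coeff` is random stand-in data, `coeff_out.txt` is a hypothetical file name, and `parts` and `fmt` are names introduced only for this example.

```matlab
% Sketch: build the whole file content in memory, then write it once.
% coeff is stand-in data; 'coeff_out.txt' is a hypothetical file name.
coeff = rand(2500, 1500);
nCols = size(coeff, 2);
fmt   = ' %15.8E %15.8E %15.8E %15.8E %15.8E\n';

parts = cell(2, nCols);            % row 1: header text, row 2: formatted numbers
for i = 1:nCols
    parts{1, i} = sprintf('This is the %d coefficients\n', i);
    parts{2, i} = sprintf(fmt, coeff(:, i));
end

S = [parts{:}];                    % column-major order: header 1, body 1, header 2, ...
S(S == 'E') = 'D';                 % single pass; safe here because the header has no 'E'

fid = fopen('coeff_out.txt', 'w');
fwrite(fid, S, 'char');            % one write call for the whole file
fclose(fid);
```

For a 2500-by-1500 matrix, `S` holds roughly 60 million characters (about 120 MB at two bytes per char) before counting the intermediate cell array, so whether this beats the original loop is something to time rather than assume.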
You are not calling sprintf() wastefully, for example with just one value at a time, so it is not obvious that formatting everything at once would cut much overhead.
Formatting everything at once is possible, but it drives up your memory use a fair bit, to the point where you have to question whether the cost of allocating the large arrays would exceed the savings from calling sprintf() less often, especially once you make the adjustments needed for columns whose length is not a multiple of 5.
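As an illustration of the multiple-of-5 adjustment, the sketch below formats one column whose length does not divide evenly, so that the last, shorter line still ends with a newline. The 12-element `col` is made-up data, and `nFull` and `rest` are names introduced only for this example.

```matlab
% Sketch: format one column whose length is not a multiple of 5.
% col is stand-in data with 12 values, so the last line holds only 2.
col   = (1:12).' * 1.5;
nFull = 5 * floor(numel(col) / 5);     % number of values that fill complete lines

S = '';
if nFull > 0
    S = sprintf(' %15.8E %15.8E %15.8E %15.8E %15.8E\n', col(1:nFull));
end

rest = col(nFull+1:end);               % leftover values (fewer than 5)
if ~isempty(rest)
    % Put the leftovers on a final, shorter line and close it explicitly.
    S = [S, sprintf(' %15.8E', rest), sprintf('\n')];
end

S(S == 'E') = 'D';                     % same substitution as in the question
```

With the 2500-row columns from the question this branch is never taken, since 2500 is a multiple of 5; it matters only if the column length changes.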
My tests show that regexprep() is roughly 16 times slower than your existing S(S=='E')='D', so you would probably have difficulty making that portion more efficient.
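A comparison along these lines can be reproduced with a simple tic/toc loop. This is only a sketch of one way to measure it, not the test that produced the figure above: the data `x`, the repeat count `nRep`, and the variable names are assumptions, and the ratio will vary by machine and MATLAB release.

```matlab
% Sketch: time two ways of replacing 'E' with 'D' in one formatted column.
x    = rand(2500, 1);                                  % stand-in data
S    = sprintf(' %15.8E %15.8E %15.8E %15.8E %15.8E\n', x);
nRep = 200;                                            % repeats to smooth out noise

tic
for k = 1:nRep
    A = S;
    A(A == 'E') = 'D';                                 % logical-index assignment
end
tIndex = toc / nRep;

tic
for k = 1:nRep
    B = regexprep(S, 'E', 'D');                        % regular-expression replace
end
tRegexp = toc / nRep;

fprintf('indexing: %.3g s, regexprep: %.3g s, ratio: %.1f\n', ...
        tIndex, tRegexp, tRegexp / tIndex);
```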
Since you have already cut down on the overheads and are stuck with the numeric formatting time and the file I/O time, I think you are already about as fast as you can reasonably get for that output format.