MATLAB: I have a cell array (100000000x2 cell). How can I save it to Excel more quickly?

If I use the function csvwrite(), I think it will take a long time. Do you have any good ideas for saving it more quickly? Thanks.

Best Answer

A good way to learn about performance is testing, so I made a small comparison. I used a smaller array to save time; I think elapsed time increases linearly with array size (for large arrays).
>> double2file_performance
Elapsed time is 38.712062 seconds.
Elapsed time is 2.991493 seconds.
Elapsed time is 0.032539 seconds.
Elapsed time is 0.037617 seconds.
where double2file_performance is
M = [
4400002970000003533,8500000190000013093
4400002970000003533,8500000190000045501
4400002970000003533,8500000840000005660
4400002970000003533,8500000840000006008
4400002970000003533,8500090100000000354
4400002970000003533,8500090100000007316
4400002970000003533,8500090100000009112
4400002970000003533,8500090100000010547
8500000190000013093,8500000190000045501
8500000190000013093,8500000840000005660 ];
m1e6 = repmat( M, [1e5,1] );
tic
csvwrite( 'c:tmp\test.csv', m1e6 )
toc
tic
fid = fopen( 'c:tmp\test.txt', 'w' );
% fprintf consumes data in column-major order; transpose to write one matrix row per line
fprintf( fid, '%f,%f\n', m1e6.' );
fclose( fid );
toc
tic
fid = fopen( 'c:tmp\test.bin', 'w' );
% without a precision argument fwrite writes uint8, which saturates these values at 255
cnt = fwrite( fid, m1e6, 'double' );
fclose( fid );
toc
tic
save( 'c:tmp\test.mat', 'm1e6', '-v6' )
toc
Convert from a double to a cell array and back
tic, c1e6 = num2cell( m1e6 ); toc
tic, n1e6 = cell2mat( c1e6 ); toc
returns
Elapsed time is 1.073974 seconds.
Elapsed time is 1.679481 seconds.
and check the sizes of the arrays
>> whos
  Name         Size              Bytes  Class     Attributes

  c1e6      1000000x2        240000000  cell
  m1e6      1000000x2         16000000  double
  n1e6      1000000x2         16000000  double
Comments on this comparison
  • csvwrite and fprintf create text files. fwrite and save create binary files. I used save(...'-v6') because it is a bit faster than the default.
  • csvwrite is slow. That's partly because it is a wrapper of dlmwrite, which in turn is a wrapper of sprintf and fwrite.
  • fprintf is an order of magnitude faster than csvwrite. I think fprintf is the fastest way to write to a text file.
  • Writing to a binary file is two orders of magnitude faster than writing to a text file with fprintf.
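To illustrate the binary route, here is a minimal round-trip sketch. The small matrix A and the temporary file name are stand-ins chosen for this example, not part of the benchmark. It assumes the data should be stored as 8-byte doubles, so the precision is passed explicitly to both fwrite and fread.

```matlab
% Minimal sketch: write a double matrix to a binary file and read it back.
A = [1 2; 3 4; 5 6];                  % small stand-in for the large array
fname = [tempname '.bin'];            % hypothetical temporary file

fid = fopen( fname, 'w' );
cnt = fwrite( fid, A, 'double' );     % 8 bytes per element, column-major order
fclose( fid );

fid = fopen( fname, 'r' );
B = fread( fid, size(A), 'double' );  % the shape is not stored in the file
fclose( fid );

delete( fname )
disp( isequal( A, B ) )
```

A raw binary file carries no size or type information, so the reader has to know both. A MAT-file written with save keeps that metadata.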
WARNING
The content of the text files depends on the precision specification used when writing. Of course it does! However, with csvwrite the precision cannot be specified, and the default is not appropriate in this case. The first lines of the file test.csv are
4.4e+18,8.5e+18
4.4e+18,8.5e+18
4.4e+18,8.5e+18
4.4e+18,8.5e+18
4.4e+18,8.5001e+18
4.4e+18,8.5001e+18
which might not be the expected result. With dlmwrite it is possible to specify precision.
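As a sketch of that option, the call below passes a C-style format string through the 'precision' parameter of dlmwrite. The two-row matrix and the temporary file name are assumptions made for this example only.

```matlab
% Sketch: control the number format that dlmwrite uses.
M = [ 4400002970000003533, 8500000190000013093
      4400002970000003533, 8500090100000000354 ];
fname = [tempname '.csv'];                   % hypothetical temporary file

dlmwrite( fname, M, 'precision', '%.0f' );   % all integer digits, no exponent

txt = fileread( fname );
disp( txt )
delete( fname )
```

Note that these 19-digit values exceed what a double can represent exactly, so the digits written are those of the nearest double, not necessarily those of the literal.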
Furthermore,
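fprintf( fid, '%d,%d\n', m1e6.' );
saves 30% in elapsed time and 25% in file size compared to
fprintf( fid, '%f,%f\n', m1e6.' );
without losing any precision, because the stored values are whole numbers. (The transpose is needed because fprintf consumes its data in column-major order.)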
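Putting the pieces together for the original question, here is a minimal sketch that writes a numeric N-by-2 cell array to a text file. The tiny cell array c and the temporary file name are placeholders for this example, and the sketch assumes every cell holds a scalar whole number.

```matlab
% Sketch: write an N-by-2 numeric cell array to a text file with fprintf.
c = { 1, 2; 3, 4; 5, 6 };        % small stand-in for the large cell array
m = cell2mat( c );               % convert to an ordinary double matrix
fname = [tempname '.txt'];       % hypothetical temporary file

fid = fopen( fname, 'w' );
% fprintf consumes its data in column-major order, so transpose to
% get one row of the matrix per line of the file.
fprintf( fid, '%d,%d\n', m.' );
fclose( fid );

txt = fileread( fname );
disp( txt )
delete( fname )
```

For a cell array of the size in the question, the cell2mat step costs time and memory of its own, as the num2cell/cell2mat timings above suggest, so it is worth keeping the data as a numeric matrix from the start if possible.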