MATLAB: Strings from a text file to a matrix containing double precision floating numbers

Hi
I have a text file containing a text header, and rows containing numeric values, with varying numbers of values, characters and numeric formats:
# Bundle file v0.3
9 2532
6.8302313857e+002 -1.4826175815e-001 8.1715222947e-002
9.3709731863e-001 -2.8772865743e-001 -1.9763814183e-001
194 144 45
5 6 1496 289.0000 199.0000 7 1235 308.0000 125.0000 5 1614 285.0000 163.0000 4 2122 173.0000 142.0000 0 911 148.5000 165.5000
2.4321163035e+000 -9.1469082482e-001 -6.6122261943e+000
219 194 76
I want to remove the header and store each of the numeric values in a matrix (padded with NaNs to make up for the differing row lengths). At present, I am using this code:
% open file and save contents to cell array, c
fid = fopen('C:\transform\bundle.out','r');
c = textscan(fid,'%s','delimiter', '','whitespace','');
fclose(fid);
%create m x 1 cell C and remove the header
C = c{1};
C(1,:)=[];
% convert C to a matrix using cell2mat / cellfun
maxLength=max(cellfun(@(x)numel(x),C));
out = cell2mat(cellfun(@(x)cat(2,x,zeros(1,maxLength-length(x))),C,'UniformOutput',false));
The problem with this approach is that it creates a character array in which each row is a single string, so I cannot use str2num or str2double to convert the numeric values to individual doubles (they return [] / NaN respectively, because a whole row is not a valid number). That is, it produces:
'9 2532 ';
'6.8302313857e+002 -1.4826175815e-001 8.1715222947e-002 ';
'9.3709731863e-001 -2.8772865743e-001 -1.9763814183e-001';
rather than:
'9' '2532';
'6.8302313857e+002' '-1.4826175815e-001' '8.1715222947e-002';
'9.3709731863e-001' '-2.8772865743e-001' '-1.9763814183e-001';
I can work around this by separating each row into its own row vector (e.g. out1, ..., outn) and then using:
splitstring = textscan(out1,'%s');
splitstring = splitstring{1};
I then use str2double and flipdim or similar to return rows of doubles, followed by vertcat and NaN padding to get the desired matrix, but this makes the code very unwieldy. Can anyone suggest a simpler way of getting the desired output? Any suggestions would be appreciated.
Thomas

Best Answer

I have worked out the answer for those with a similar problem:
I use textscan and cellfun to split the strings, de-nest and rearrange the output using vertcat and cellfun/transpose, then convert the single strings to doubles using cellfun/str2double:
fid = fopen('C:\transform\bundle.out','r');
c = textscan(fid,'%s','delimiter', '','whitespace','', 'HeaderLines', 1);
fclose(fid);
C = c{1};
C = cellfun(@(x) textscan(x,'%s','Delimiter', ' ')',C ,'UniformOutput',false);
Y = vertcat(C{:});
X = cellfun(@transpose,Y,'UniformOutput',false);
Z = cellfun(@str2double,X,'UniformOutput',false);
The padded matrix can then be built with cellfun/cell2mat, using the maximum row length (maxLength):
maxLength=max(cellfun(@(x)numel(x),Z));
out = cell2mat(cellfun(@(x)cat(2,x,zeros(1,maxLength-length(x))),Z,'UniformOutput',false));
Note that this code pads the rows with zeros rather than NaNs.
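If NaN padding is wanted, as in the original question, the padding step can probably be adapted by appending NaN(1,n) instead of zeros(1,n). The following is only a minimal sketch: the small cell array Z is a hypothetical stand-in for the Z produced by the code above (one row vector of doubles per line of the file), and padRow is a helper name introduced here for illustration.

```matlab
% Hypothetical stand-in for Z: one row vector of doubles per line of the file
Z = {[9 2532]; [683.02313857 -0.14826175815 0.081715222947]; [194 144 45]};

% Length of the longest row
maxLength = max(cellfun(@numel, Z));

% Append NaNs (instead of zeros) so that every row has maxLength columns
padRow = @(x) [x, NaN(1, maxLength - numel(x))];
out = cell2mat(cellfun(padRow, Z, 'UniformOutput', false));
```

With this sample input, out is a 3-by-3 matrix whose first row is [9 2532 NaN]. Real zeros in the data then stay distinguishable from padding, and the padded entries can be located with isnan.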