I have a strange problem reading space-delimited text files. I’ve been attempting to read in several hundred blocks of data that look like this:
NFL58 23Mar2012 Show 2 1 01 0000000001 Low
001 25.187466 156.162447 21578.188 97.134234 stops AAAAA 1 A10A
1.100000e-01 1.200000e-01 1.300000e-01 1.400000e-01 1.500000e-01 1.600000e-01 1.700000e-01 1.800000e-01 1.900000e-01
2.100000e-01 2.200000e-01 2.300000e-01 2.400000e-01 2.500000e-01 2.600000e-01 2.700000e-01 2.800000e-01 2.900000e-01
3.100000e-01 3.200000e-01 3.300000e-01 3.400000e-01 3.500000e-01 3.600000e-01 3.700000e-01 3.800000e-01 3.900000e-01
4.100000e-01 4.200000e-01 4.300000e-01 4.400000e-01 4.500000e-01 4.600000e-01 4.700000e-01 4.800000e-01 4.900000e-01
5.100000e-01 5.200000e-01 5.300000e-01 5.400000e-01 5.500000e-01 5.600000e-01 5.700000e-01 5.800000e-01 5.900000e-01
6.100000e-01 6.200000e-01 6.300000e-01 6.400000e-01 6.500000e-01 6.600000e-01 6.700000e-01 6.800000e-01 6.900000e-01
7.100000e-01 7.200000e-01 7.300000e-01 7.400000e-01 7.500000e-01 7.600000e-01 7.700000e-01 7.800000e-01 7.900000e-01
8.100000e-01 8.200000e-01 8.300000e-01 8.400000e-01 8.500000e-01 8.600000e-01 8.700000e-01 8.800000e-01 8.900000e-01
9.100000e-01 9.200000e-01 9.300000e-01 9.400000e-01 9.500000e-01 9.600000e-01 9.700000e-01 9.800000e-01 9.900000e-01
002 25.287466 156.162447 21578.288 97.234234 Done BBBBB 2 A10B
1.120000e-01 1.200000e-01 1.300000e-01 1.400000e-01 1.500000e-01 1.600000e-01 1.700000e-01 1.800000e-01 1.900000e-01
2.120000e-01 2.200000e-01 2.300000e-01 2.400000e-01 2.500000e-01 2.600000e-01 2.700000e-01 2.800000e-01 2.900000e-01
3.120000e-01 3.200000e-01 3.300000e-01 3.400000e-01 3.500000e-01 3.600000e-01 3.700000e-01 3.800000e-01 3.900000e-01
4.120000e-01 4.200000e-01 4.300000e-01 4.400000e-01 4.500000e-01 4.600000e-01 4.700000e-01 4.800000e-01 4.900000e-01
5.120000e-01 5.200000e-01 5.300000e-01 5.400000e-01 5.500000e-01 5.600000e-01 5.700000e-01 5.800000e-01 5.900000e-01
6.120000e-01 6.200000e-01 6.300000e-01 6.400000e-01 6.500000e-01 6.600000e-01 6.700000e-01 6.800000e-01 6.900000e-01
7.120000e-01 7.200000e-01 7.300000e-01 7.400000e-01 7.500000e-01 7.600000e-01 7.700000e-01 7.800000e-01 7.900000e-01
8.120000e-01 8.200000e-01 8.300000e-01 8.400000e-01 8.500000e-01 8.600000e-01 8.700000e-01 8.800000e-01 8.900000e-01
9.120000e-01 9.200000e-01 9.300000e-01 9.400000e-01 9.500000e-01 9.600000e-01 9.700000e-01 9.800000e-01 9.900000e-01
My intent is to read all the data blocks (regardless of size) into cell arrays for future processing. Unfortunately, I’ve never tried reading non-rectangular text files until now. I searched the MathWorks website for a sample of how this can be done. What I am trying to do is very similar to one of the TEXTSCAN examples on the MathWorks site (Reading Arbitrary Format Text Files with TEXTSCAN). I ran the sample code with the test80211.txt file and got the same results as the example, so I tried the same approach with the data blocks shown above. But I keep getting the following error:
??? Error using ==> cat
CAT arguments dimensions are not consistent.
Error in ==> cell2mat at 81
            m{n} = cat(2,c{n,:});
Error in ==> TestScript2 at 44
    Data{Block,1}=cell2mat(InputText);
I began stepping through the code and noticed that it ran fine for the first data block, and the Data variable held the 9×9 matrix I expected. However, I keep getting the above error on the second block of data, at Data{Block,1}=cell2mat(InputText);
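One way to narrow this down while stepping through is to look at how many rows TEXTSCAN placed in each column before calling cell2mat. The sketch below is only an illustration: the InputText value is a hypothetical stand-in with columns of unequal length (an assumption about what the failing block might contain, not output from the real file), and rowsPerColumn is a name introduced here.

```matlab
% Hypothetical stand-in for the InputText returned on the failing block.
% Unequal column lengths are an assumption made for illustration only.
InputText = {ones(5,1), ones(5,1), ones(4,1)};

% Count the rows stored in each column of the cell array.
rowsPerColumn = cellfun(@numel, InputText);
disp(rowsPerColumn);

% Convert only when every column has the same number of rows;
% otherwise cell2mat cannot concatenate the columns side by side.
if all(rowsPerColumn == rowsPerColumn(1))
    Data = cell2mat(InputText);
else
    warning('Columns have unequal lengths; cell2mat would fail.');
end
```

Placing the same cellfun call just before the cell2mat line in the real script would show whether the columns of the second block differ in length.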
Here are the commands I’m utilizing (very similar to the example):
% Open text file
fid = fopen('BlockTypeTextTestFile.txt','r');
% Read Header Row (Row 1) as a string delimited by a carriage return
InputText=textscan(fid,'%s',1,'delimiter','\n');
Headerline=InputText{1};
disp(Headerline);
% Initialize The Block Counter
Block = 1;
% Initialize the s variable
s = '';
% Let's read in each data block using TEXTSCAN for each line in the file.
while ~feof(fid)
    % Read row 2 delimited by a carriage return
    InputText=textscan(fid,'%f %f %f %f %f %5.0s %5.0s %f %5.0s',1,'delimiter','\n');
    % Initialize the Number Of Columns Counter
    NumCols=InputText{1};
    % Create format string based on block number
    FormatString=repmat('%f %f %f %f %f %f %f %f %f',1,NumCols);
    % Read 9 X 9 matrix data (all 9 lines) delimited by a carriage return
    InputText=textscan(fid,FormatString,9,'delimiter','\n');
    % Convert to numerical array from cell
    Data{Block,1}=cell2mat(InputText);
    % Size Of Table
    [NumRows,NumCols]=size(Data{Block});
    % Increment the Block Counter
    Block = Block+1;
    % Ensure we're at the end of the file
    % where, isempty determines if next line is empty
    % and strvcat will concatenate strings vertically.
    % Close the figure after it appears as it is no longer needed.
    if isempty(line),break, end
    s = strvcat(s,line);
end
% Display contents of s to the screen
disp(s)
% Close Text File
fclose(fid);
I’m stumped as to what could be causing the error. I've tried changing the delimiter in the TEXTSCAN command to ' ', but it did not help.
The only thing I’ve noticed is that, in the Workspace window, the FormatString value suddenly doubles from '%f %f %f %f %f %f %f %f %f' to '%f %f %f %f %f %f %f %f %f%f %f %f %f %f %f %f %f %f'.
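The doubling can be reproduced in isolation. The sketch below assumes that NumCols takes the value of the first field of each block line (001, then 002), which is what NumCols=InputText{1} appears to do; rowFormat, blockNumber and numFields are names introduced here for illustration.

```matlab
% Nine numeric fields, matching one row of the 9 X 9 matrix.
rowFormat = '%f %f %f %f %f %f %f %f %f';

% Assumed values of the first field on each block line (001 and 002).
for blockNumber = 1:2
    % Same repmat call as in the script, with the block number as the repeat count.
    FormatString = repmat(rowFormat, 1, blockNumber);
    % Count how many %f conversions the resulting format string contains.
    numFields = numel(strfind(FormatString, '%f'));
    fprintf('Block %d: %d fields -> %s\n', blockNumber, numFields, FormatString);
end
```

If that assumption holds, the second pass asks TEXTSCAN for 18 fields per row even though each line of the file holds 9 values, which may be why the columns no longer line up for cell2mat.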
Any ideas as to what the source of the problem could be? Thank you. Brad.
Best Answer