MATLAB: Importing line-by-line from a text file

text filetextscan

Hi I have a problem choosing what kind of tool to import data from a text file. The data looks something like this although the lines are like 4000 and
35, 0401-BA1,1,6,8,00000,0,01,053,50530,'Channel ',,
35, 0401-CB1,1,6,8,00000,0,01,051,50510,'Spillway ',,
35, 0401-EB1,1,3,2,00052,0,01,051,50510,'RioGrande ',,
35, 0401-FS1,1,6,8,00203,0,01,011,27340,'Spillway ',,
35, 0401-MB1,1,6,8,00000,0,01,037,0 ,'Canal ',,
I used the fscanf , textscan but I don't understand how to tackle the spaces. for the second space is it like %15c ?? Also How do I switch to another line should I use /n? Please help me here guys. Thank you

Best Answer

Hint
str = '35, 0401-BA1,1,6,8,00000,0,01,053,50530,''Channel '',,';
fmt = '%d%s%d%d%d%d%d%d%d%d%s%s';
cac = textscan( str, fmt ...
, 'Delimiter' , ',' ...
, 'CollectOutput', true );
  • " how to tackle the spaces" &nbsp depends on how you want the spaces in the resulting string variable values
  • "line-by-line" &nbsp what exactly do you mean?
&nbsp
Addendum I
This illustrates an approach. However, due to various mismatches between the header row and the data rows the result is not useful.
sas = cssm()
returns
1x4000 struct array with fields:
STATE_CODE_001
STRUCTURE_NUMBER_008
RECORD_TYPE_005A
ROUTE_PREFIX_005B
SERVICE_LEVEL_005C
ROUTE_NUMBER_005D
DIRECTION_005E
HIGHWAY_DISTRICT_002
COUNTY_CODE_003
PLACE_CODE_004
....
where
function sas = cssm
ffs = 'c:\m\cssm\NM14.txt';
fid = fopen( ffs );
str = fgetl( fid );
col_head_list = parse_header( str );
sas = repmat( cell2struct( repmat({''},1,112), col_head_list(1:112), 2 ), 1,4000 );
jj = 0;
while not( feof(fid) )
str = fgetl( fid );
buf = parse_data_row( str );
jj = jj + 1;
sas(1,jj) = cell2struct( buf(1:112), col_head_list(1:112), 2 );
end
fclose( fid );
end
function cac = parse_header( str )
cac = strsplit( str, ',' );
cac = strtrim( cac );
end
function cac = parse_data_row( str )
cac = strsplit( str, ',' );
cac = strtrim( cac );
end
&nbsp
Addendum II
I was fooled by the function, strsplit, which as default treats multiple delimiters as one. In my world that only applies to space. Anyhow, now I have fixed the function cssm.
Read the data to a structure
>> sas=cssm;
find all bridges, which have STATUS_WITH_10YR_RULE equal to '2'.
>> is_sw10yrEq2 = strcmp('2',{sas.STATUS_WITH_10YR_RULE});
>> sum(is_sw10yrEq2)
ans =
324
there are 324 of them. List their building year
>> {sas(is_sw10yrEq2).YEAR_BUILT_027}
ans =
Columns 1 through 9
'1915' '1936' '1928' '1931' '1920' '1951' '1933' '1939' '1939'
Columns 10 through 18
'1924' '1931' '1931' '1931' '1932' '1932' '1936' '1937' '1936'
....
where
function sas = cssm
ffs = 'c:\m\cssm\NM14.txt';
fid = fopen( ffs );
str = fgetl( fid );
colhead_list = parse_header( str );
n_colhead = length( colhead_list );
sas = repmat( cell2struct ...
( repmat({''},1,n_colhead), colhead_list, 2 ) ...
, 1,4000 );
jj = 0;
while not( feof(fid) )
str = fgetl( fid );
buf = parse_data_row( str );
jj = jj + 1;
sas(1,jj) = cell2struct( buf, colhead_list, 2 );
end
fclose( fid );
sas(1,jj+1:end) = [];
end
function cac = parse_header( str )
cac = strsplit( str, ',', 'CollapseDelimiters',false );
cac = strtrim( cac );
end
function cac = parse_data_row( str )
cac = strsplit( str, ',', 'CollapseDelimiters',false );
cac = strtrim( cac );
end
&nbsp
Addendum III
Caveat: I hardly used the new (couple of years old), data type table. Nevertheless, table is targeted at problems like yours and I think you should
  • use Import Data interactively to read from a file of format B (one header line, comma separated, varying width) to a table. Make sure to uncheck Multiple Delimiters as One and select table. I've tested; it worked fine.
  • spend time with the documentation of table instead of fighting file formats.
&nbsp
Addendum IV
It is straightforward to read the file of format B (one header line, comma separated, varying width) with textscan
>> cac = cssm
cac =
{3951x134 cell}
where
function cac = cssm
ffs = 'c:\m\cssm\NM14.txt'; % one header line, comma separated, varying width
fid = fopen( ffs );
str = fgetl( fid );
chl = strtrim( strsplit( str, ',', 'CollapseDelimiters',false ) );
len = length( chl );
frm = repmat( '%s', 1, len );
cac = textscan( fid, frm, 'CollectOutput',true );
fclose( fid );
end