MATLAB: Conditional textscan – How to select certain lines from a file

conditional textscan, regex

Hi there, I would like to read information from a file into an array for later use. Only certain rows of that file are supposed to be read in, namely rows for which the second column starts with 'S1' and is followed by any two digits. I'm having trouble with this conditional textscan. Here is the code for reading in the file (note that it starts with 13 lines that are not in column format, hence the "headline" codes at the beginning). I basically want the variables Position, Length, Channel etc. to be read in only for lines that meet the regex condition.
dataFileName=strcat('EEG_Anne_',int2str(pNumber),'.vmrk');
fid = fopen(dataFileName);
headline1=fgets(fid);
headline2=fgets(fid);
headline3=fgets(fid);
headline4=fgets(fid);
headline5=fgets(fid);
headline6=fgets(fid);
headline7=fgets(fid);
headline8=fgets(fid);
headline9=fgets(fid);
headline10=fgets(fid);
headline11=fgets(fid);
headline12=fgets(fid);
headline13=fgets(fid);
C = textscan(fid, '%s%s%d%d%d','Delimiter',',');
Stimulus=C{2};
if regexp(Stimulus{i},'S1\d*'),
Type=C{1};
Position=C{3};
Length=C{4};
Channel=C{5};
end
fclose(fid);

Best Answer

Usually the fastest and easiest way to select from a dataset is to read the complete file into MATLAB and then make the selection inside of MATLAB:
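N = 117; % number used in the file name (example value)
fileName = sprintf('EEG_Anne_%d.vmrk',N);
fid = fopen(fileName);
hdrRows = 13; % number of header lines that are not in column format
% Read the header lines into a cell array of strings:
hdrData = textscan(fid,'%s',hdrRows, 'Delimiter','\n');
% Read all data rows, splitting on both ',' and '=' (three text fields, three integers):
matData = textscan(fid,'%s%s%s%d%d%d', 'Delimiter',{',','='}, 'CollectOutput',true);
fclose(fid);
% Logical index of rows whose third text field is 'S1' followed by exactly two digits:
X = ~cellfun('isempty',regexp(matData{1}(:,3),'^S1\d\d$','once'));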
To read the header data into a cell array, I also replaced the awkward 13 separate calls to fgets with one simple call to textscan. To test this code I used the file that you gave in your other answer (attached here also). The test detects these rows:
>> matData{2}(X,:)
ans =
13127 1 0
17828 1 0
22387 1 0
27429 1 0
31951 1 0
36610 1 0
51258 1 0
56417 1 0
61951 1 0
.... etc
which corresponds exactly to the rows with 'S1xx' in the second column.
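To get the individual variables named in the question, the same logical index can be applied column by column. This is only a minimal sketch: the three marker lines are made-up sample data (passed to textscan as text so that the snippet runs without the file), and the assignment of Position, Length and Channel to the three numeric columns is an assumption based on the order used in the question.

```matlab
% Hypothetical sample lines standing in for the data part of the .vmrk file:
lines = {'Mk2=Stimulus,S111,13127,1,0', ...
         'Mk3=Stimulus,S201,15000,1,0', ...
         'Mk4=Stimulus,S112,17828,1,0'};
txt = strjoin(lines, '\n');

% Same format and options as above, but reading from text instead of a file:
matData = textscan(txt, '%s%s%s%d%d%d', ...
    'Delimiter',{',','='}, 'CollectOutput',true);

% Rows whose third text field is 'S1' followed by exactly two digits:
X = ~cellfun('isempty', regexp(matData{1}(:,3), '^S1\d\d$', 'once'));

% Keep only the selected rows (column meanings assumed from the question):
Type     = matData{1}(X,2);
Stimulus = matData{1}(X,3);
Position = matData{2}(X,1);
Length   = matData{2}(X,2);
Channel  = matData{2}(X,3);
```

Because CollectOutput is true, the three text columns end up together in matData{1} and the three integer columns in matData{2}, so one index X selects matching rows from both.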
Bonus: if you want to practice using regular expressions (i.e. regexp), then you can try my FEX submission:
This tool lets you interactively write and change a regular expression, and updates the outputs as you type, so you can see what effect those changes have on the string parsing. It is a great way to practice using regular expressions, or to adapt a regular expression to your particular requirements.
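As a small illustration of why the exact pattern matters, the sketch below compares the pattern from the question ('S1\d*') with the anchored pattern used in the answer ('^S1\d\d$'). The stimulus codes are invented for the example and are not taken from the attached file.

```matlab
% Hypothetical stimulus codes, chosen to show where the two patterns differ:
codes = {'S111'; 'S1'; 'S1234'; 'XS112'; 'S 11'};

% Pattern from the question: 'S1' anywhere, followed by zero or more digits.
loose  = ~cellfun('isempty', regexp(codes, 'S1\d*', 'once'));

% Pattern from the answer: the whole string must be 'S1' plus exactly two digits.
strict = ~cellfun('isempty', regexp(codes, '^S1\d\d$', 'once'));

disp(table(codes, loose, strict))
```

Here loose flags the first four codes, whereas strict keeps only 'S111', because the anchors ^ and $ force the entire field to match and \d\d requires exactly two digits.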