MATLAB: Import data from multiple .dat files, remove headerlines, and read columns into array – but the number of headerlines differs across each .dat file


I have a large number of .dat files in a folder, named in the format "author_energy_radiationtype_cellline", and I am using the "dir" command to select the files that apply to particular energies, cell lines, etc. Each .dat file has between 2 and 7 header lines that I want to skip. The files have the following format.
"SF(Dose (Gy)) created by Plot Digitizer 2.6.8"
"Date: 1/17/19 8:41:06 AM"
author year mod cell_line energy let
**** 2008 protons *** 6MV -$\mu$m
alpha alpha_err beta beta_err alpha_X alpha_X_err beta_X beta_X_err
0.291 0.000 0.041 0.000 0.291 0.000 0.041 0.000
Dose (Gy) SF Error
0.8312 0.7674 0.9121
1.8470 0.4560 0.5615
2.8600 0.2924 0.3457
4.8985 0.0761 0.1244
6.9425 0.0218 0.0344
I want to read the data under the 'Dose', 'SF', and 'Error' columns into arrays, and I also need to extract the first and third values in the 6th row. Is there any way to do this when the number of header lines changes from file to file?
This is my code so far. I can pick out the files with certain energies, etc., but I can't figure out how to actually extract the data in the way I described above.
% Specify the folder where files are located
myFolder = 'C:\Users\..\Desktop\CellSurvivalData';
% Check to make sure that folder actually exists. Warn user if it doesn't.
if ~isdir(myFolder)
    errorMessage = sprintf('Error: The following folder does not exist:\n%s', myFolder);
    uiwait(warndlg(errorMessage));
    return;
end
% Get a list of all files in the folder with the desired file name pattern.
filePattern = fullfile(myFolder, '*235MeV*HSG*'); % Define desired parameters
theFiles = dir(filePattern); % List the files which satisfy these parameters
for k = 1 : length(theFiles)
    baseFileName = theFiles(k).name;
    fullFileName = fullfile(myFolder, baseFileName);
    fprintf(1, 'Now reading %s\n', fullFileName);
    % Read the data from each file
    file{k} = readtable( fullFileName );
end

Best Answer

If 'Dose' appears as the first four characters of the column-header line only, and similarly 'alpha' as the first five characters of the parameter-header line only, you could do the following:
for k = 1 : length(theFiles)
    baseFileName = theFiles(k).name;
    fullFileName = fullfile(myFolder, baseFileName);
    fprintf(1, 'Now reading %s\n', fullFileName);
    % Read the data from each file
    fid = fopen(fullFileName);
    data = [];       % rows of [Dose SF Error]
    alpha_beta = []; % [alpha beta]
    while true
        l = fgetl(fid);
        if ~ischar(l) % fgetl returns -1 at end of file: break loop
            break
        end
        % Read alpha and beta
        % strncmp is safe for lines shorter than 5 characters, unlike l(1:5)
        if strncmp(l, 'alpha', 5) % If first 5 characters of line are 'alpha'
            l = fgetl(fid); % get next line, which holds the values
            first_data = str2num(l);
            alpha_beta = [first_data(1) first_data(3)]; % store alpha and beta
        end
        % Read matrix of data below 'Dose'
        if strncmp(l, 'Dose', 4)
            while true
                l = fgetl(fid);
                % Stop at end of file or at a blank line
                if ~ischar(l) || isempty(strtrim(l))
                    break
                end
                data(end+1,:) = str2num(l);
            end
        end
    end
    fclose(fid); % release the file handle
    file{k} = data;
    alphaBeta{k} = alpha_beta; % keep alpha and beta for each file
end
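
As an alternative to reading line by line, here is a minimal sketch that first finds the position of the 'Dose' line and then passes it to textscan as the number of header lines, so the count can differ from file to file. It relies on the same assumption as the loop above (only the column-header line starts with 'Dose', and only the parameter-header line starts with 'alpha'). The temporary sample file and the variable names (sampleFile, txtLines, doseIdx, alphaIdx) are made up for illustration; in the real loop, fullFileName would take the place of sampleFile.

```matlab
% Write a small sample file so the sketch is self-contained
% (contents copied from the example file in the question).
sampleLines = {
    '"SF(Dose (Gy)) created by Plot Digitizer 2.6.8"'
    '"Date: 1/17/19 8:41:06 AM"'
    'author year mod cell_line energy let'
    '**** 2008 protons *** 6MV -$\mu$m'
    'alpha alpha_err beta beta_err alpha_X alpha_X_err beta_X beta_X_err'
    '0.291 0.000 0.041 0.000 0.291 0.000 0.041 0.000'
    'Dose (Gy) SF Error'
    '0.8312 0.7674 0.9121'
    '1.8470 0.4560 0.5615'
    '2.8600 0.2924 0.3457'
    '4.8985 0.0761 0.1244'
    '6.9425 0.0218 0.0344'};
sampleFile = [tempname '.dat'];      % hypothetical temporary file
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s\n', sampleLines{:});
fclose(fid);

% Read the whole file and split it into lines (handles \n and \r\n).
txtLines = regexp(fileread(sampleFile), '\r?\n', 'split');

% Locate the two header lines by their leading text.
doseIdx  = find(strncmp(txtLines, 'Dose', 4), 1);   % column-header line
alphaIdx = find(strncmp(txtLines, 'alpha', 5), 1);  % parameter-header line

% Everything up to and including the 'Dose' line is header,
% so doseIdx is the number of lines textscan should skip.
fid = fopen(sampleFile, 'r');
cols = textscan(fid, '%f %f %f', 'HeaderLines', doseIdx);
fclose(fid);
data = [cols{:}];                    % N-by-3 matrix: [Dose SF Error]

% alpha and beta are the 1st and 3rd numbers on the line after 'alpha ...'
params = sscanf(txtLines{alphaIdx + 1}, '%f');
alpha_beta = [params(1) params(3)];

delete(sampleFile);                  % clean up the temporary file
```

Because the header count is computed per file, the same code should work whether a file has 2 or 7 header lines, provided the 'Dose' line is present. If a file has no 'Dose' line, doseIdx is empty, so it is worth checking isempty(doseIdx) before calling textscan.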