MATLAB: Import data from multiple .dat files, remove headerlines, and read columns into array – but the number of headerlines differs across each .dat file

I have a large number of .dat files in a folder, named in the format "author_energy_radiationtype_cellline", and I am using the "dir" command to select the files that apply to particular energies, cell lines, etc. Each .dat file has between 2 and 7 header lines that I want to skip. The files have the following format.

"SF(Dose (Gy)) created by Plot Digitizer 2.6.8"
"Date: 1/17/19 8:41:06 AM"
author  year    mod   cell_line energy     let
****    2008  protons    ***     6MV      -$\mu$m
alpha alpha_err beta  beta_err alpha_X alpha_X_err beta_X beta_X_err
0.291 0.000   0.041  0.000   0.291 0.000   0.041  0.000 
Dose (Gy)    SF     Error
0.8312    0.7674    0.9121
1.8470    0.4560    0.5615
2.8600    0.2924    0.3457
4.8985    0.0761    0.1244
6.9425    0.0218    0.0344

I want to read the data under the 'Dose', 'SF', and 'Error' columns into arrays, and I also need to extract the first and third values in the 6th row. Is there any way to do this when the number of header lines changes from file to file?

This is my code so far. I can pick out the files with certain energies, etc., but I can't figure out how to actually extract the data in the way I described above.

% Specify the folder where files are located 
myFolder = 'C:\Users\..\Desktop\CellSurvivalData';
% Check to make sure that folder actually exists.  Warn user if it doesn't.
if ~isdir(myFolder)
  errorMessage = sprintf('Error: The following folder does not exist:\n%s', myFolder);
  uiwait(warndlg(errorMessage));
  return;
end
% Get a list of all files in the folder with the desired file name pattern.
filePattern = fullfile(myFolder, '*235MeV*HSG*'); % Define desired parameters
theFiles = dir(filePattern); % List the files which satisfy these parameters
for k = 1 : length(theFiles)
  baseFileName = theFiles(k).name;
  fullFileName = fullfile(myFolder, baseFileName);
  fprintf(1, 'Now reading %s\n', fullFileName);
  % Read the data from each file
  file{k} = readtable( fullFileName );
end

for k = 1 : length(theFiles)
  baseFileName = theFiles(k).name;
  fullFileName = fullfile(myFolder, baseFileName);
  fprintf(1, 'Now reading %s\n', fullFileName);
  % Read the data from each file
  fid = fopen(fullFileName);
  c = 0;
  data = [];
  while true
    l = fgetl(fid);
    if ~ischar(l) % fgetl returns -1 at end of file: break loop
      break
    end
    % Read alpha and beta
    if strncmp(l, 'alpha', 5) % If first 5 characters of line are 'alpha'
      l = fgetl(fid); % Get next line
      first_data = str2num(l);
      alpha_beta = [first_data(1) first_data(3)]; % Store alpha and beta
    end
    % Read matrix of data below the 'Dose' header line
    if strncmp(l, 'Dose', 4)
      l = fgetl(fid);
      while ischar(l) && ~isempty(strtrim(l)) % Stop at end of file or blank line
        c = c + 1;
        data(c,:) = str2num(l);
        l = fgetl(fid);
      end
    end
  end
  fclose(fid);
  file{k} = data;
end
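
As an alternative to reading line by line, the same idea (find the 'alpha' and 'Dose' marker lines instead of counting header lines) can be sketched by loading the whole file as text. This is only a sketch under assumptions: it writes a shortened copy of the sample file to a temporary location so it can run on its own, and the names sampleFile, lines, alphaIdx, doseIdx, params, and alphaBeta are illustrative, not part of the original code.

```matlab
% Write a shortened sample file (assumed layout, taken from the question).
sampleFile = [tempname '.dat'];
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s\n', ...
    '"SF(Dose (Gy)) created by Plot Digitizer 2.6.8"', ...
    'alpha alpha_err beta  beta_err alpha_X alpha_X_err beta_X beta_X_err', ...
    '0.291 0.000   0.041  0.000   0.291 0.000   0.041  0.000', ...
    'Dose (Gy)    SF     Error', ...
    '0.8312    0.7674    0.9121', ...
    '1.8470    0.4560    0.5615');
fclose(fid);

% Read the whole file and split it into trimmed lines.
lines = strtrim(regexp(fileread(sampleFile), '\r?\n', 'split'));

% Locate the marker lines, however many header lines come before them.
alphaIdx = find(strncmp(lines, 'alpha', 5), 1);
doseIdx  = find(strncmp(lines, 'Dose', 4), 1);

% The line after the 'alpha' header holds the parameters;
% keep the first and third values.
params    = sscanf(lines{alphaIdx + 1}, '%f')';
alphaBeta = params([1 3]);

% Everything after the 'Dose' header is a three-column numeric block.
dataLines = lines(doseIdx + 1 : end);
dataLines = dataLines(~cellfun(@isempty, dataLines)); % drop blank lines
data = sscanf(strjoin(dataLines, ' '), '%f', [3 Inf])'; % N-by-3 matrix

dose = data(:, 1);
sf   = data(:, 2);
err  = data(:, 3);

delete(sampleFile); % clean up the temporary file
```

Inside the loop over theFiles, sampleFile would be replaced by fullFileName, and alphaBeta and data could be stored per file (for example in cell arrays indexed by k) so that later files do not overwrite earlier results.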
