MATLAB: I need to create a loop that skips data sets that have a newer version, e.g. if there is an M1 that also has an M2, it only reads the M2


I have to graph a massive number of csv files, but only the newest version of each. A lot of the files have M2, M3, and M4 versions and I only want the newest one. Is there any way to get rid of all the M1-M3 files that have newer versions?

Best Answer

Use dir() to get the filenames. Then build a second list of filenames with the last character of the base name (the version digit) chopped off; this assumes versions go up only to 9, not to 10 and beyond. Then use ismember() to see whether a stripped filename occurs more than once. If it does, use the index that ismember() returns to locate the earlier occurrence and keep whichever file has the larger version number. Files whose stripped name occurs only once are kept as they are. Collect the survivors in an output list.

% fileInfo = dir('*.dat');
% fileNames = {fileInfo.name}
% if isempty(fileNames)
%   uiwait(errordlg('No files found'));
%   return;
% end
% Make up sample data for testing.
fileNames = {'file1_m1.dat', 'file1_m2.dat', 'file2_m1.dat', 'file3_m1.dat', 'file4_m1.dat'}
% Create array for filenames without the final character in the base file name.
noVersions = cell(1, length(fileNames));
for k = 1 :length(fileNames)
  % Get base file name without last character.
  [~, thisString, ext] = fileparts(fileNames{k});
  noVersions{k} = thisString(1:end-1);  
end
celldisp(noVersions);
% See if any string is in there more than once.
uniqueStrings = cell(length(fileNames), 1);
numUnique = 0; % Keep track of how many files we collect so we can truncate the array afterwards.
for k = 1 :length(fileNames)
  thisString = noVersions{k};
  fprintf('Checking for multiple occurrences of %s...\n', thisString);
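  % firstIdx is the lowest index at which thisString appears in noVersions.
  [~, firstIdx] = ismember(thisString, noVersions);
  if firstIdx ~= k
    % This string occurs earlier than element k.
    % Overwrite the first occurrence of it with this later version number.
    % (Assumes fileNames is sorted so later entries have higher version numbers.)
    uniqueStrings{firstIdx} = fileNames{k};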
  else
    % This is the first time it appears.  Add it to the list.
    uniqueStrings{k} = fileNames{k};
    numUnique = numUnique + 1;
  end
end
celldisp(uniqueStrings);
% Find out which cells are empty.
emptyCells = find(cellfun(@isempty, uniqueStrings))
% Remove those empty ones to get the final list.
uniqueStrings(emptyCells) = []

The above intuitive brute force method works, though if you wait, I'm sure Andrei will give you a cryptic one-liner (probably using cellfun()) that will do the same thing.
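
For anyone whose version numbers can reach 10 or beyond, here is a minimal sketch of a different approach (not the method above, and not anyone else's one-liner): it pulls the version number out with regexp() and groups the base names with unique(). The sample filenames and the '_m<number>' suffix pattern are assumptions made for illustration, and every name is assumed to match that pattern; adjust the expression to whatever your files actually look like.

```matlab
% Hypothetical sample names; the '_m<number>' suffix pattern is an assumption.
fileNames = {'file1_m1.dat', 'file1_m2.dat', 'file1_m10.dat', ...
             'file2_m1.dat', 'file3_m3.dat', 'file3_m1.dat'};

% Split each name into a base ('file1') and a numeric version (10).
tokens = regexp(fileNames, '^(.*)_m(\d+)\.\w+$', 'tokens', 'once');
baseNames = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);
versions  = cellfun(@(t) str2double(t{2}), tokens);

% Group identical base names; groupIdx(k) is the group number of fileNames{k}.
[~, ~, groupIdx] = unique(baseNames);

% For each group, keep the file whose version number is largest.
newestFiles = cell(1, max(groupIdx));
for g = 1:max(groupIdx)
    members = find(groupIdx == g);
    [~, best] = max(versions(members));
    newestFiles{g} = fileNames{members(best)};
end
disp(newestFiles);
```

Because the versions are compared as numbers rather than by position in the list, this does not depend on the order in which dir() returns the files.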

Related Solutions

MATLAB: How to combine multiple .dat files.

Just call csvread or dlmread five times (once per file), concatenate the results, then call csvwrite.

m1 = csvread(filename1);
m2 = csvread(filename2);
m3 = csvread(filename3);
m4 = csvread(filename4);
m5 = csvread(filename5);
mOut = [m1;m2;m3;m4;m5];
csvwrite(fileNameOut, mOut);
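
If the number of files is not fixed at five, the same idea can be written as a loop. This is only a sketch under stated assumptions: the folder, the 'part*.dat' pattern, and the output name 'combined.csv' are hypothetical, and the sample files are written first purely so the example runs on its own. Every file must have the same number of columns for the vertical concatenation to work.

```matlab
% Hypothetical input files, created here only so the sketch runs end to end.
workDir = tempname;
mkdir(workDir);
for k = 1:3
    csvwrite(fullfile(workDir, sprintf('part%d.dat', k)), k * ones(2, 3));
end

% Read every matching file and collect the matrices in a cell array.
fileInfo = dir(fullfile(workDir, 'part*.dat'));
pieces = cell(numel(fileInfo), 1);
for k = 1:numel(fileInfo)
    pieces{k} = csvread(fullfile(workDir, fileInfo(k).name));
end

% Stack the matrices vertically (same column count required in every file).
mOut = vertcat(pieces{:});

% Write the combined matrix; the output name is an assumption.
fileNameOut = fullfile(workDir, 'combined.csv');
csvwrite(fileNameOut, mOut);
```

Collecting the pieces in a cell array and concatenating once at the end avoids growing mOut inside the loop.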

MATLAB: Fopen error too many arguments

If the file is returned by dir then it does exist unless you're trying to process the directory entries as well as the files.

That's why I suggested using a wildcard in the filename so that only files (not directories) will be included.

If there isn't any filename wildcard that would work (seems unlikely) then use the .isdir field to skip non-files...

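d=dir(dirpath);  
for k=1:length(d)
  if d(k).isdir, continue, end       % skip any that are directory entries

  % build the full path so fopen works even when the files are not in the
  % current folder (the .folder field requires R2016b or later)
  fid=fopen(fullfile(d(k).folder,d(k).name),'rb','ieee-be');
  data=fread(fid, inf, 'uint32');    % read the full file
  fclose(fid);                       % close the file when done
  % do whatever need to do w/ the data here
  ...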

ADDENDUM

d=dir(dirpath);  
for k=1:length(d)
  if d(k).isdir, continue, end       % skip any that are directory entries
  ...

Or, of course, if you need to traverse subdirectories as well, you can do that inside this loop by nesting, or rearrange the order to ensure that all pertinent subdirectories are processed first.
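
As an alternative to nesting loops by hand, here is a rough sketch that lets dir() do the subdirectory traversal with the '**' wildcard, which is available in R2016b and later. The folder tree, the file names, and the '*.dat' pattern are hypothetical and are created here only so the example is self-contained; the loop body just prints each path where the real reading code would go.

```matlab
% Build a small hypothetical folder tree so the sketch is self-contained.
rootDir = tempname;
mkdir(fullfile(rootDir, 'sub'));
fclose(fopen(fullfile(rootDir, 'a.dat'), 'w'));
fclose(fopen(fullfile(rootDir, 'sub', 'b.dat'), 'w'));

% '**' matches rootDir itself and all of its subfolders (R2016b or later).
d = dir(fullfile(rootDir, '**', '*.dat'));
d = d(~[d.isdir]);                 % drop any directory entries

for k = 1:numel(d)
    % .folder holds the folder each entry was found in.
    fullName = fullfile(d(k).folder, d(k).name);
    fprintf('Would process %s\n', fullName);
end
```

On older releases without '**', the nesting approach described above is still needed.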
