Here is some code that reads the complete file (including the header line), identifies the step size, locates any missing measurements, and then creates a numeric array with the missing measurements replaced with NaNs. It uses the following functions:
- fgetl and textscan to read the file data.
- regexp to split the header strings.
- datenum to convert the date strings to serial date numbers.
- diff to calculate all timesteps, and then mode to get the likely sampling timestep.
- ismember to locate the positions of the non-missing measurements.
Note that this code is fully vectorized: it contains no explicit loops, which in MATLAB is usually faster and more compact than a loop-based solution to this kind of problem.
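fid = fopen('1201b.txt','rt');
% Read the header line and extract the double-quoted column names:
str = fgetl(fid);
hdr = regexp(str,'(?<=")\w+(?="(\s|$))','match');
% One quoted date string, then one numeric field per remaining column:
fmt = ['%q',repmat('%f',1,numel(hdr)-1)];
% C{1} holds the date strings, C{2} holds all numeric columns as one matrix:
C = textscan(fid,fmt,'CollectOutput',true);
fclose(fid);
% Convert the dates to whole seconds, so that they can be compared exactly:
fkr = 24*60*60;
dtn = round(fkr * datenum(C{1},'yyyy-mm-dd HH:MM:SS'));
% The most common difference is taken as the sampling timestep:
stp = mode(diff(dtn));
% Complete time vector without gaps, and an output array preallocated with NaN:
dtv = dtn(1):stp:dtn(end);
out = nan(numel(dtv),numel(hdr)-1);
% Copy the measurements into the rows whose timestamps exist in the file:
idx = ismember(dtv,dtn);
out(idx,:) = C{2};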
We can then check the first ten rows in the command window:
>> out(1:10,:)
ans =
276 279 19.00 20.34 20.02 19.97 19.54 18.95 18.50 17.97
274 277 19.39 20.19 19.57 19.76 19.69 19.20 17.99 17.67
279 280 19.39 19.69 19.97 19.86 19.89 19.35 17.84 17.72
277 277 19.30 20.19 19.62 20.12 20.25 19.96 18.65 18.07
278 278 19.39 19.99 19.57 20.37 20.45 20.61 17.94 17.72
278 279 19.00 19.84 19.57 20.07 20.35 19.76 16.98 17.62
271 277 18.52 18.78 19.52 20.27 20.70 19.60 17.99 17.32
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
267 280 19.78 18.28 18.37 18.86 19.29 18.09 16.33 15.57
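The gap-filling step itself can be tried without the data file. The following is only a minimal sketch on made-up values: the vectors t and v and the names tfull and filled are hypothetical and are not taken from the file above.

```matlab
% Hypothetical timestamps in whole seconds: a 10 s step, with the
% samples at 30 s and 40 s missing.
t = [0; 10; 20; 50; 60; 70];
% Hypothetical measurements: one row per timestamp, two channels.
v = [1 11; 2 12; 3 13; 6 16; 7 17; 8 18];

stp   = mode(diff(t));        % most common difference between samples
tfull = t(1):stp:t(end);      % complete time vector without gaps
idx   = ismember(tfull, t);   % true where a measurement exists

filled = nan(numel(tfull), size(v,2));
filled(idx,:) = v;            % rows of missing timestamps remain NaN
disp(filled)
```

Rows 4 and 5 of filled correspond to the missing timestamps and stay NaN, while all other rows receive the measurements in their original order.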
Note that this method uses mode to identify the timestep. This works as long as the most common timestep is the true sampling timestep. If there are too many missing measurements then mode can return the wrong value, and you will have to identify the timestep size in some other way.
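As a rough illustration of "some other way", here are two possible alternatives, shown on made-up timestamps. Both are assumptions rather than part of the method above: they only make sense if the timestamps are whole seconds lying exactly on a regular grid, and the variable names are hypothetical.

```matlab
% Hypothetical timestamps in whole seconds. The true step is 10 s, but so
% many samples are missing that the most common difference is 20 s.
t = [0; 20; 40; 50; 70; 90];
d = diff(t);

stpMode = mode(d);        % the most common difference, here misleading

% Alternative 1: the smallest positive difference. This assumes that at
% least one pair of neighbouring samples is present in the data.
stpMin = min(d(d>0));

% Alternative 2: the greatest common divisor of all differences. gcd takes
% two inputs, so it is accumulated over the vector with a short loop.
stpGcd = d(1);
for k = 2:numel(d)
    stpGcd = gcd(stpGcd, d(k));
end

% Sanity check: every timestamp must lie on the reconstructed time vector.
tfull = t(1):stpGcd:t(end);
assert(all(ismember(t, tfull)), 'Timestep does not fit all timestamps.')
```

For these values mode gives 20, whereas both alternatives give 10. The final check is worth keeping with either approach, because a wrong timestep otherwise only shows up later, when the number of matched rows differs from the number of measurements.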