MATLAB: Faster way to match up dates with data

I wrote some code that matches up observational data (data recorded every 10 minutes for 15 years) that is occasionally missing a timestep, or multiple timesteps, with a complete, ideal time series. The problem is that it takes way too long, and I'm betting there's a better way to do this that I just haven't thought of.

Here's the code:

%Creates an empty array that will be filled in with observed meteorological data
%values
ideal_obs = zeros(766945,3);
%For loop that fills in meteorological data for every timestep in the time series
for m = 1:766945 %This m iterates over every row of date_time_index
    for r = 1:size(time_obs,1) % r iterates over 'n' rows of observed time data
        if isequal(date_time_index(m,1),time_obs(r,1)) == 1
            ideal_obs(m,:) = raw_data(r,1:3);
            break
        elseif isequal(date_time_index(m,1),time_obs(r,1)) == 0
            if r == size(time_obs,1)               
                ideal_obs(m,1:3) = NaN;
            end
            continue
        else
            disp('Derp')
        end
    end
end

The idea is that my program will iterate through my ideal time series, and will attempt to import data from my observational dataset for every time step (every 10-minute period) in the idealized time series.

For missing observations, NaN is used. This way, I have a complete time series and can see where I have missing data, and can later decide how to deal with it.

The code works. I ran it on a small subset of my data. The problem is, I tried running it on my complete dataset, and it ran all night and hasn't finished yet. I'm hoping someone might be able to suggest a faster way to accomplish this task. In Excel I was using Vlookup() and it finished pretty quickly…

Best Answer

Well, yes, your code is extremely inefficient: for each of the 766945 ideal timesteps, it scans time_obs row by row until it finds a match.

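% Pre-fill with NaN so timesteps without an observation stay marked as missing
ideal_obs = NaN(numel(date_time_index), 3);
% isfound(k) is true if date_time_index(k) occurs in time_obs; where(k) is its row there
[isfound, where] = ismember(date_time_index, time_obs);
ideal_obs(isfound, :) = raw_data(where(isfound), 1:3);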

will do the same much faster.
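
To see what the ismember approach does, here is a minimal sketch on made-up data. It assumes the timestamps are stored as datetime values (the question does not say which type is used), and the five-step series and the values in raw_data are invented purely for illustration.

```matlab
% Toy data: an ideal 10-minute series of five timesteps (hypothetical values).
date_time_index = (datetime(2020,1,1,0,0,0) : minutes(10) : datetime(2020,1,1,0,40,0))';
time_obs = date_time_index([1 2 4 5]);       % the 00:20 observation is missing
raw_data = [1 10 100; 2 20 200; 4 40 400; 5 50 500];

% Start from NaN so unmatched timesteps remain marked as missing.
ideal_obs = NaN(numel(date_time_index), 3);

% isfound flags ideal timesteps that have an observation;
% where gives the matching row of time_obs (0 where there is none).
[isfound, where] = ismember(date_time_index, time_obs);
ideal_obs(isfound, :) = raw_data(where(isfound), 1:3);

disp(ideal_obs)
```

The third row of ideal_obs stays NaN because 00:20 has no observation, and the other four rows are copied from raw_data. Note that ismember matches on exact equality, just as isequal does in the original loop.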

Notes:

I've assumed that date_time_index and time_obs are both vectors, despite your use of 2D indexing.
if somecond == 1 ... elseif somecond == 0 ... is more simply written as if somecond ... else ..., since you already know that somecond is 0 if it's not 1. The final else branch can therefore never run.
Have you thought about using timetables? You could have used synchronize or retime to let MATLAB do the filling for you in one line.
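
As a rough sketch of the timetable suggestion, the same filling can be done with retime. This assumes the observation times are datetime values that are sorted and free of duplicates. The toy data and the names obs, ideal_times and ideal are made up for illustration.

```matlab
% Toy observations: 10-minute data with the 00:20 timestep missing (hypothetical values).
time_obs = datetime(2020,1,1,0,0,0) + minutes([0 10 30 40])';
raw_data = [1 10 100; 2 20 200; 4 40 400; 5 50 500];

% Wrap the observations in a timetable keyed by their timestamps.
obs = array2timetable(raw_data, 'RowTimes', time_obs);

% Regular 10-minute target grid spanning the observation period.
ideal_times = (time_obs(1) : minutes(10) : time_obs(end))';

% 'fillwithmissing' leaves NaN wherever the grid has no matching observation.
ideal = retime(obs, ideal_times, 'fillwithmissing');

% Extract the numeric data as a plain matrix.
ideal_obs = ideal.Variables;
```

Here ideal has one row per grid time, and the 00:20 row contains NaN in all three columns. Keeping the result as a timetable rather than a plain matrix also keeps the timestamps attached to the data for later processing.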
