MATLAB: Join several timetables with different variables and times and duplicates

I'm trying to read a bunch of csv files that each have one variable timeseries. Many files can hold the same variable and there are overlaps in the timestamps (duplicates).

My problem is when trying to merge these csv files, which are read to timetables, into a single timetable.

Example code:


times_TT_A = [datetime(2019,01,01) datetime(2019,01,02) datetime(2019,01,03)]';
data_TT_A = [1 2 3]';
TT_A = timetable(times_TT_A, data_TT_A,'VariableNames',{'Variable_ONE'});
times_TT_B = [datetime(2019,01,01) datetime(2019,01,02)]';
data_TT_B = [10 20]';
TT_B = timetable(times_TT_B, data_TT_B,'VariableNames',{'Variable_TWO'});
times_TT_C = [datetime(2019,01,02) datetime(2019,01,03)]';
data_TT_C = [20 30]';
TT_C = timetable(times_TT_C, data_TT_C,'VariableNames',{'Variable_TWO'});

So here there are two different variables and one variable, Variable_TWO, has overlap in timestamps.

I would like to have this result:

    times_TT_A    Variable_ONE    Variable_TWO    
    __________    ____________    _________________  
    2019-01-01         1                  10                 
    2019-01-02         2                  20                     
    2019-01-03         3                 30

I don't mind a solution with duplicate entries, I can sort those out later, for example this is also fine:

   times_TT_A    Variable_ONE    Variable_TWO    
    __________    ____________    _________________  
    2019-01-01         1                  10                 
    2019-01-02         2                  20                     
    2019-01-03         3                 NaN
    2019-01-02        NaN                 20
    2019-01-03        NaN                 30

The csv files are read in a loop and should then one by one be added to the common timetable like this in pseudocode

timetable_alldata = timetable()
for csvfile in csvfiles
    timetable_csv = readtable(csvfile)
    timetable_alldata = join(timetable_alldata, timetable_csv)
end

This is then the same as calling this using the example data – here using outerjoin:

outerjoin(outerjoin(TT_A,TT_B), TT_C)
ans =
  3×3 timetable
    times_TT_A    Variable_ONE    Variable_TWO_left    Variable_TWO_TT_C
    __________    ____________    _________________    _________________
    2019-01-01         1                  10                  NaN       
    2019-01-02         2                  20                   20       
    2019-01-03         3                 NaN                   30

I have tried stacking ([A; B]), vertcat, outerjoin and synchronize…but I seem to be stuck here.

Any suggestions how to solve this?

% Make the header of row times of input timetables to be the same TT_A.Properties.DimensionNames{1} = 'Time'; TT_B.Properties.DimensionNames{1} = 'Time'; TT_C.Properties.DimensionNames{1} = 'Time'; % call outerjoin and merge keys TT_AB = outerjoin(TT_A,TT_B, ... 'Keys',['Time' intersect(TT_A.Properties.VariableNames,TT_B.Properties.VariableNames)], ... 'MergeKeys',true) TT_ABC = outerjoin(TT_AB,TT_C,... 'Keys',['Time' intersect(TT_AB.Properties.VariableNames,TT_C.Properties.VariableNames)], ... 'MergeKeys',true) TT_ABC = 4x2 timetable Time Variable_ONE Variable_TWO ____________________ ____________ ____________ 01-Jan-2019 00:00:00 1 10 02-Jan-2019 00:00:00 2 20 03-Jan-2019 00:00:00 NaN 30 03-Jan-2019 00:00:00 3 NaN

TTcell = {TT_A TT_B TT_C}; TT = timetable(); for i = 1:length(TTcell) TT_i = TTcell{i}; TT(TT_i.Time,TT_i.Properties.VariableNames) = TT_i; end TT = 3x2 timetable Time Variable_ONE Variable_TWO ___________ ____________ ____________ 01-Jan-2019 1 10 02-Jan-2019 2 20 03-Jan-2019 3 30

Best Answer

Hi,

I came up with two solutions.

Solution 1: use outerjoint. It is easy to understand but have duplicated rows that need to be merged.

Solution 2: Make use of assignment. This makes the strong assumption that the duplicate rows are really duplicates. So no need to post-process the result.

Best Answer

Related Solutions

MATLAB: Add zero time in timeseries date time column

MATLAB: Fill a timetable outside the gaps

Related Question