MATLAB: Duplicate x,y in table; get min/max of variables to new table

table

I have a large dataset; a small subset is attached. There are duplicate x,y values. I need a cleaned table with "var1_max, var1_min, var2_max, var2_min" etc. for every duplicate "x,y".

I'm just not sure where to start setting up the problem. Thanks for any pointers to get going. Using MATLAB R2018a, no fancy add-ons/packages.

file_in = 'example.csv';
data_in = readtable(file_in);
%%table_out.Properties.VariableNames = {'id', 'x', 'y', 'var1_max', 'var1_min', ...
%%	'var2_max', 'var2_min', 'var3_max', 'var3_min', 'var4_max', 'var4_min', ...
%%	'var5_max', 'var5_min', 'var6_max', 'var6_min', 'var7_max', 'var7_min'};

Best Answer

One potential solution that is simple to implement is to use groupsummary. You can have your data grouped by x and y, and have it return the within-group min, mean, and max for table variables you specify.

However, you mention wanting NaN for the mid-value if there are min/max values. This won't do that. It will provide values for all 3 statistics. Also note that groupsummary returns only one row for each group. You can use the join function if you want to merge the summary back into the original table.

newData = groupsummary(data_in,{'x','y'},{'min','mean','max'},["var1","var2","var3","var4","var5","var6","var7"])
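As a rough sketch of how this could be adapted to the column names asked for in the question: groupsummary names its outputs min_var1, max_var1 and so on, so the names need to be flipped to get var1_min, var1_max. The small table below is made up for illustration and only stands in for example.csv; the regexprep renaming and the final join are one possible approach, not the only one.

```matlab
% Made-up data standing in for example.csv (values are illustrative only)
x    = [1; 1; 2; 2; 2];
y    = [5; 5; 6; 6; 7];
var1 = [10; 30; 4; 8; 9];
var2 = [0.5; 0.1; 2; 3; 7];
data_in = table(x, y, var1, var2);

% One row per unique (x,y) pair, with the min and max of each listed variable
table_out = groupsummary(data_in, {'x', 'y'}, {'min', 'max'}, {'var1', 'var2'});

% Flip names such as min_var1 to var1_min; x, y and GroupCount do not match the pattern
names = table_out.Properties.VariableNames;
names = regexprep(names, '^(min|max)_(.*)$', '$2_$1');
table_out.Properties.VariableNames = names;
disp(table_out)

% Optional: attach the per-group statistics to every original row
joined = join(data_in, table_out, 'Keys', {'x', 'y'});
disp(joined)
```

Here table_out has one row per (x,y) group, while joined keeps every original row and repeats the group statistics alongside it.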

Related Solutions

MATLAB: How to get average/max/min table of many tables

Have you tried the mean() and max() functions to see if they work with a table? I haven't, though they probably won't, since tables can contain non-numeric data.
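One way to sidestep the non-numeric problem is varfun, which applies a function to each table variable and can be restricted to numeric ones. The sketch below uses a made-up table T; its variable names are hypothetical.

```matlab
% Made-up table with two numeric variables and one text variable
T = table([1; 2; 3], [10; 20; 60], {'p'; 'q'; 'r'}, ...
    'VariableNames', {'a', 'b', 'label'});

% Apply mean and max only to the numeric variables; 'label' is skipped
colMeans = varfun(@mean, T, 'InputVariables', @isnumeric);
colMax   = varfun(@max,  T, 'InputVariables', @isnumeric);
disp(colMeans)   % variables are named mean_a and mean_b
disp(colMax)     % variables are named max_a and max_b
```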

The other option is to just extract each column into an array, and then you can use whatever function you want. Like for 3 tables:

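% Extract the first column of each table as a numeric vector
col11 = table1{:, 1};
col21 = table2{:, 1};
col31 = table3{:, 1};
% Work along dimension 2 so the tables are combined row by row
meanCol = mean([col11, col21, col31], 2);
maxCol = max([col11, col21, col31], [], 2);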

If you have a lot of them, you can put them into a loop where you read each table from a file, extract each column one at a time into a 2-D array, and then do the math. See the FAQ.
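A minimal sketch of the loop idea, assuming every table has the same size and only numeric variables. Instead of handling one column at a time, it stacks whole tables into a 3-D array and reduces along the third dimension. The tables here are built in memory with made-up values; in practice each one would come from readtable on a file.

```matlab
% Three made-up tables of identical size and variable names
tables = cell(3, 1);
tables{1} = table([1; 2], [10; 20], 'VariableNames', {'a', 'b'});
tables{2} = table([3; 4], [30; 40], 'VariableNames', {'a', 'b'});
tables{3} = table([5; 9], [50; 90], 'VariableNames', {'a', 'b'});

% Stack the numeric contents into a rows-by-columns-by-tables array
T1 = tables{1};
stack = zeros(height(T1), width(T1), numel(tables));
for k = 1:numel(tables)
    Tk = tables{k};              % in practice: Tk = readtable(<file k>)
    stack(:, :, k) = Tk{:, :};
end

% Reduce across tables (dimension 3) and convert the results back to tables
varNames  = T1.Properties.VariableNames;
meanTable = array2table(mean(stack, 3),    'VariableNames', varNames);
maxTable  = array2table(max(stack, [], 3), 'VariableNames', varNames);
minTable  = array2table(min(stack, [], 3), 'VariableNames', varNames);
disp(meanTable)
```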

Attach your data in .mat files with the paper clip icon if you need more help.

MATLAB: Correlation between two row matrices

Called that way, each value of "a" is correlated with each value of "b", and by the correlation formula the correlation of two single numbers is NaN. To compute the correlation correctly, transpose the input vectors:

result  = corr(a', b');
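For reference, corr belongs to the Statistics and Machine Learning Toolbox. If that toolbox is not available, corrcoef from base MATLAB gives the same Pearson coefficient and treats two vectors as single variables regardless of orientation. A small sketch with made-up vectors:

```matlab
% Made-up row vectors
a = [1 2 3 4 5];
b = [2 4 5 4 5];

% corrcoef returns a 2-by-2 matrix; the off-diagonal entry is the correlation of a and b
R = corrcoef(a, b);
r = R(1, 2);
disp(r)

% With the Statistics and Machine Learning Toolbox, the equivalent call is:
% r = corr(a', b');
```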
