MATLAB: How to average multiple arrays of doubles from a row of a cell array as long as a criterium holds in that row

arrayaveragecell arraycell array doubles criteriumcriteriumdoubles

I had struggles writing down the question in a right way, but I hope that the point is clear. From measurements that we have performed we created a large data set containing measurement parameters and also the measured values over a large wavelength range. At most locations, multiple measurements were performed which I want to average.
The data set is already filtered for specific parameters, and only this last step is yet necessary. Because of the different types of parameters that were used, all data is stored in a cell array A.
A = {
84 1 1 1 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 2 1 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 3 1 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 4 2 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 5 2 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 6 2 8 1 1 [1x200 double] [1x200 double] [1x200 double]
}
The first three columns A(:,1:3) are ID parameters that are not necessary and then there is A(:,4) which is the location ID. Columns A(:,5:7) are not interesting either. Column 8:end are our measurement arrays. The point is that I want to be able to identify which rows have the same location ID A(:,4) and then average the corresponding arrays of doubles.
Steps:
1. Identify separate locations
A_1 = {
84 1 1 1 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 2 1 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 3 1 8 1 1 [1x200 double] [1x200 double] [1x200 double]
}
A_2 = {
84 1 4 2 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 5 2 8 1 1 [1x200 double] [1x200 double] [1x200 double]
84 1 6 2 8 1 1 [1x200 double] [1x200 double] [1x200 double]
}
2. Average these locations
A_1_average = {
[] [] [] 1 8 1 1 [1x200 double] [1x200 double] [1x200 double]
}
A_2_average = {
[] [] [] 2 8 1 1 [1x200 double] [1x200 double] [1x200 double]
}
3. Combine averaged location in a new set
A_average = {
[] [] [] 1 8 1 1 [1x200 double] [1x200 double] [1x200 double]
[] [] [] 2 8 1 1 [1x200 double] [1x200 double] [1x200 double]
}
I wrote [ ] as these parameters are not needed anymore and don't have to be averaged. Additionally, not all measurements have been performed strictly three times per location (varies between 1-4 measurements per location).
It might be very easy, but I find myself struggling to find the best first steps.

Best Answer

foo = @()randi(9,1,5); % for testing, 1x5 vectors.
A = {
84,1,1,1,8,1,1,foo(),foo(),foo();
84,1,2,1,8,1,1,foo(),foo(),foo();
84,1,3,1,8,1,1,foo(),foo(),foo();
84,1,4,2,8,1,1,foo(),foo(),foo();
84,1,5,2,8,1,1,foo(),foo(),foo();
84,1,6,2,8,1,1,foo(),foo(),foo();
};
% Sort by the fourth column:
[V,X] = sort([A{:,4}]);
B = A(X,:);
% Use column four to identify groups:
Y = logical([1,diff(V)]);
D = diff([find(Y),1+numel(V)]);
% Mean each of the last three columns:
S = cellfun('size',B(:,8:end),2);
assert(all(all(diff(S)==0)),'Vectors in one column must have the same length');
C = mat2cell(cell2mat(B(:,8:end)),D,S(1,:));
C = cellfun(@mean,C,'UniformOutput',false);
% Create output cell array:
Z = [B(Y,1:7),C];
Here is a comparison of some input and output values, showing that is has successfully taken the mean of the three vectors in the first three rows in the ninth column:
>> A{1:3,9}
ans =
8 8 5 2 6
ans =
9 4 3 1 5
ans =
4 8 6 5 6
>> Z{1,9}
ans =
7.0000 6.6667 4.6667 2.6667 5.6667
>> mean(vertcat(A{1:3,9})) % by hand
ans =
7.0000 6.6667 4.6667 2.6667 5.6667