MATLAB: Flatten structure array if values are identical

Dear MATLAB users,
I have a structure array in which most fields have identical values across all elements, while some differ. The values can be numbers, strings, or cells. See the minimal example below for a struct array of size 100 with fields A through F:
% length(mystruct) is 100
% field A: identical number
mystruct(1).A = 5
mystruct(2).A = 5
% ...
mystruct(100).A = 5
% field B: identical string
mystruct(1).B = 'hello'
mystruct(2).B = 'hello'
% ...
mystruct(100).B = 'hello'
% field C: not identical number
mystruct(1).C = 1
mystruct(2).C = 2
% ...
mystruct(100).C = 100
% field D: not identical string
mystruct(1).D = 'x'
mystruct(2).D = 'y'
% ...
mystruct(100).D = 'z'
% field E: identical cell
mystruct(1).E = {'a','b'}
mystruct(2).E = {'a','b'}
% ...
mystruct(100).E = {'a','b'}
% field F: not identical cell
mystruct(1).F = {'a','b'}
mystruct(2).F = {'a','c','d'}
% ...
mystruct(100).F = {'b'}
I would like to "flatten" the structure array (which is actually very large, with thousands of fields) so that fields with common values are stored once and fields with differing values are collected into a cell or vector:
% length(mystruct) is 1
mystruct.A = 5
mystruct.B = 'hello'
mystruct.C = [1,2,..,100] %this can also be a cell if easier
mystruct.D = {'x','y',..,'z'}
mystruct.E = {'a','b'}
mystruct.F = {{'a','b'}, {'a','c','d'}, .., {'b'}}
Is there a straightforward way to do this?
Many thanks,
Ida

Best Answer

You can SERIALIZE the MATLAB values (using a serialize/deserialize tool such as the File Exchange submission linked in the code) before taking UNIQUE.
The output will be slightly different from yours, since everything is stored in a cell (easy to fix, but it requires special-case handling).
clear s
s(1).A = 5;
s(2).A = 5;
s(3).A = 5;
%field B identical string
s(1).B = 'hello';
s(2).B = 'hello';
s(3).B = 'hello';
% field C not identical number
s(1).C = 1;
s(2).C = 2;
s(3).C = 100;
% field D not identical string
s(1).D = 'x';
s(2).D = 'y';
s(3).D = 'z';
% field E identical cell
s(1).E = {'a','b'};
s(2).E = {'a','b'};
s(3).E = {'a','b'};
% field F not identical cell
s(1).F = {'a','b'};
s(2).F = {'a','c','d'};
s(3).F = {'b'};
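clear flats
% Collapse each field to its unique values (first-occurrence order is kept)
for fname = fieldnames(s)'
    flats.(fname{1}) = genunique({s.(fname{1})});
end
flats
%%
function c = genunique(c)
% Return the unique elements of cell array c, compared via their serialized bytes.
% Requires hlp_serialize from the File Exchange:
% https://www.mathworks.com/matlabcentral/fileexchange/34564-fast-serialize-deserialize
str = cellfun(@(x) char(hlp_serialize(x)'), c, 'UniformOutput', false);
[~, i] = unique(str, 'stable');
c = c(i);
end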
Result:
A: {[5]}
B: {'hello'}
C: {[1] [2] [100]}
D: {'x' 'y' 'z'}
E: {{1×2 cell}} % {{'a','b'}}
F: {{1×2 cell} {1×3 cell} {1×1 cell}} % {{'a','b'}, {'a','c','d'} {'b'}}
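If you want output closer to the layout in the question (a single value for identical fields, one entry per element otherwise), the special-case handling could look roughly like the sketch below. This is not the method from the answer above: it swaps the serialize + UNIQUE step for the built-in isequal, so it needs no File Exchange download, and it keeps every element's value for differing fields instead of only the unique ones. The variable names (flat, names, vals) are just for illustration, and it assumes s is non-empty.

```matlab
% Small example input, modelled on the 3-element example above
s = struct('A', {5, 5, 5}, ...
           'C', {1, 2, 100}, ...
           'D', {'x', 'y', 'z'}, ...
           'E', {{'a','b'}, {'a','b'}, {'a','b'}});

flat = struct();
names = fieldnames(s);
for k = 1:numel(names)
    vals = {s.(names{k})};                 % one cell entry per struct element
    if all(cellfun(@(v) isequal(v, vals{1}), vals))
        flat.(names{k}) = vals{1};         % identical everywhere: keep a single copy
    elseif all(cellfun(@(v) isnumeric(v) && isscalar(v), vals))
        flat.(names{k}) = [vals{:}];       % differing numeric scalars: row vector
    else
        flat.(names{k}) = vals;            % anything else: cell, one entry per element
    end
end
disp(flat)
```

Note that isequal treats NaN values as unequal, so a field that is NaN in every element would be reported as differing; isequaln can be used instead if NaNs should compare as equal.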