MATLAB: Unique() on an Nx1 cell array with different length vectors per cell

Hello,
I apologize if this has already been answered, but a quick search of the Answers forum turned up nothing.
I have an Nx1 cell array in which each cell contains a vector, and the vectors vary in length. Some of these vectors are duplicates, which I would like to remove.
Example – From this:
[11 12 23 24 13 14 21 22]
[11 12 23 24 13 14 21 32 31 22]
1x12 double
[11 12 23 24 13 14 21 22]
[11 12 23 24 13 14 21 32 31 22]
[23 24 33 34]
[21 22 31 32]
where cells 4,1 and 5,1 are copies of cells 1,1 and 2,1
I want to get this:
[11 12 23 24 13 14 21 22]
[11 12 23 24 13 14 21 32 31 22]
1x12 double
[23 24 33 34]
[21 22 31 32].
How can I do this efficiently? As far as I know, unique(mycell, 'rows') does not work on cell arrays.
Thanks and best wishes, Florian

Best Answer

You can create your own function to find the unique cells:
function uC = UniqueCell(C)
% Return a cell array with the unique elements of C, independent of the
% type and size of the cell elements.
% Author: Jan Simon, Heidelberg, License: CC BY-SA 3.0
u = true(size(C));              % u(k) is true while C{k} counts as unique
for iC = 1:numel(C)
    if u(iC)                    % Skip cells already marked as duplicates
        aC = C{iC};
        for jC = iC + 1:numel(C)
            % Mark every later cell that equals the current one
            u(jC) = u(jC) && ~isequal(aC, C{jC});
        end
    end
end
uC = C(u);                      % Keep the first occurrence of each element
end
Call it like:
uniqC = UniqueCell(C)
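If C is larger, say more than a few dozen cells, this pairwise comparison becomes inefficient, because the number of isequal calls grows quadratically with the number of cells. Sorting the cells themselves is not trivial, but you can sort their hash values instead. See FEX: GetMD5 (C compiler required):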
function uC = UniqueCell2(C)
nC = numel(C);
H  = cell(nC, 1);
for iC = 1:nC
    % One hash string per cell; equal contents give equal hashes
    H{iC} = GetMD5(C{iC}, 'Array', 'base64');
end
% [~, u] = unique(H);  % Inlined and much leaner UNIQUE:
[Hs, S] = sort(H);
% After sorting, a hash is new if it differs from its predecessor
u(S)    = [true; ~strcmp(Hs(2:nC), Hs(1:nC - 1))];
uC      = C(u);
end
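The two lines that replace UNIQUE can be hard to read at first. Here is a minimal sketch of the same sort-and-compare idea, using short made-up strings in place of real MD5 hashes. The names isFirst and uniqueKeys are only illustrative and not part of the function above, and the mask is preallocated here purely for readability:

```matlab
% Made-up stand-in keys; in UniqueCell2 these would be the hash strings.
H  = {'b'; 'a'; 'b'; 'c'; 'a'};
nC = numel(H);

[Hs, S] = sort(H);    % Hs: sorted keys, S: their positions in H

% In sorted order equal keys are neighbours, so a key starts a new
% group exactly when it differs from the key before it.
isFirst = [true; ~strcmp(Hs(2:nC), Hs(1:nC - 1))];

u    = false(nC, 1);  % Preallocated mask in the original order
u(S) = isFirst;       % Scatter the flags back to the positions in H

uniqueKeys = H(u);    % One representative per distinct key
```

Each distinct key is flagged exactly once, so indexing the original cell with u keeps one cell per group.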
The hashes in the loop can also be computed with DataHash instead of GetMD5:
...
Opt.Method = 'MD5';
Opt.Format = 'base64';
Opt.Input  = 'array';
for iC = 1:nC
    H{iC} = DataHash(C{iC}, Opt);
end
...
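If neither FEX function is at hand, the same "one key per cell, then find unique keys" idea can be sketched with built-in functions only. This is not the method from the answer above but a simplified variant. It assumes every cell holds a real numeric row vector, so it ignores type and shape, unlike isequal or a hash of the whole array. The content of the 1x12 vector is not shown in the question, so 1:12 is a placeholder:

```matlab
% Sample data from the question; 1:12 is a placeholder for the
% 1x12 vector, whose values are not shown.
C = {[11 12 23 24 13 14 21 22]; ...
     [11 12 23 24 13 14 21 32 31 22]; ...
     1:12; ...
     [11 12 23 24 13 14 21 22]; ...
     [11 12 23 24 13 14 21 32 31 22]; ...
     [23 24 33 34]; ...
     [21 22 31 32]};

% Build one text key per cell. Assumption: real numeric row vectors.
% 17 significant digits are enough to tell distinct doubles apart.
keys = cellfun(@(v) sprintf('%.17g,', v), C, 'UniformOutput', false);

% UNIQUE works on cell arrays of character vectors; 'stable' keeps
% the first occurrence of each key in its original order.
[~, idx] = unique(keys, 'stable');
uC = C(idx);
```

For the sample data this keeps cells 1, 2, 3, 6 and 7. Special values such as NaN or -0 may be treated differently than by isequal, so check this for your data before relying on it.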