MATLAB: How to locate similar values in different vectors

a=[1 4 26 27 41 0 0 0 0 0 0 0;
2 22 23 41 0 0 0 0 0 0 0 0;
4 55 77 0 0 0 0 0 0 0 0 0;
3 6 88 91 0 0 0 0 0 0 0 0]
My matrix is 4000×4000, but for simplicity consider the matrix a above. I treat each row as a vector of numbers, usually followed by zeros, which are irrelevant for the task at hand. If any nonzero value recurs in different rows, I want to concatenate those rows. For the matrix a above, I would like to get (after using "unique" on the concatenated vectors):
a=[1 2 4 22 23 26 27 41 55 77 0 0;
3 6 88 91 0 0 0 0 0 0 0 0]
How do I locate recurring values in different rows without using for loops within for loops?

Best Answer

This is the way I would do the merging:
mergedsets = {};
for row = 1:size(a, 1)
    arow = nonzeros(a(row, :))'; %nonzero values of the row, as a row vector
    isinset = cellfun(@(set) any(ismember(arow, set)), mergedsets); %which existing sets share a value with the row
    if ~any(isinset)
        %none of the numbers in the row have been seen before
        %create a new set
        mergedsets = [mergedsets; {unique(arow)}];
    else
        %one or more sets contain some values in the row, merge all together
        newset = unique([mergedsets{isinset}, arow]); %merge all sets with unique
        mergedsets = [mergedsets(~isinset); {newset}]; %merge all sets (without the ones just merged) with new set
    end
end
I don't believe you could do it without a loop. If you want a matrix afterward:
m = zeros(size(mergedsets, 1), max(cellfun(@numel, mergedsets)));
for row = 1:size(mergedsets, 1)
    m(row, 1:numel(mergedsets{row})) = mergedsets{row};
end
Personally, I'd keep it as a cell array.
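
As an aside, and not part of the answer above: the same merging can be phrased as a connected-components problem, which avoids an explicit loop in user code. The sketch below assumes a release that has graph and conncomp (R2015b or later). Rows and distinct nonzero values become nodes, and each nonzero entry becomes an edge between its row and its value. The variable names (edges, bins, groups) are made up for illustration, and I have not timed it against the loop on a 4000×4000 matrix.

```matlab
% Sketch under stated assumptions: rows that share a value end up in the
% same connected component of a row/value graph.
a = [1 4 26 27 41 0 0 0 0 0 0 0;
     2 22 23 41 0 0 0 0 0 0 0 0;
     4 55 77 0 0 0 0 0 0 0 0 0;
     3 6 88 91 0 0 0 0 0 0 0 0];

nrows = size(a, 1);
[r, ~, v] = find(a);                 % row index and value of every nonzero entry
[uvals, ~, vidx] = unique(v);        % distinct values, and a value id per entry
edges = unique([r, vidx], 'rows');   % drop repeated (row, value) pairs

% Nodes 1..nrows are the rows, nodes nrows+1..end are the distinct values.
g = graph(edges(:, 1), nrows + edges(:, 2));
bins = conncomp(g);                  % component label of every node
valbins = bins(nrows+1:end);         % component label of every distinct value

% Collect the values of each component into a sorted row vector.
groups = accumarray(valbins(:), uvals(:), [], @(x) {sort(x).'});
groups = groups(~cellfun(@isempty, groups)); % components made only of all-zero rows
```

For the matrix a from the question, groups should hold two sets, one with the ten merged values from rows 1 to 3 and one with the values of row 4. Like mergedsets, it is a cell array, so the same padding loop can turn it into a matrix if needed.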