This is in no way guaranteed to give you a perfectly balanced output. After that you need to look at optimisation algorithm which is beyond my field. This is also a lot less efficient than my other solution:
C = {[1 2 3 4 5 6], [1 4 5 6 7], [1 5 8 9 12 20]}
[uvals, ~, bin] = unique(cell2mat(C));
origarray = repelem(1:numel(C), cellfun(@numel, C));
dist = table(uvals', accumarray(bin, 1), accumarray(bin, origarray, [], @(x) {x}), 'VariableNames', {'value', 'repetition', 'arrayindices'});
dist = sortrows(dist, 'repetition');
The above builds a table of all the unique values, how many times they're repeated and where they come from originally. You can then iterate through that to distribute the values:
newC = accumarray(cell2mat(dist.arrayindices(dist.repetition == 1)), ...
dist.value(dist.repetition == 1), [], @(x){x'});
dist(dist.repetition == 1, :) = [];
for row = 1:height(dist)
destarrays = dist.arrayindices{row};
[~, destidx] = min(cellfun(@numel, newC(destarrays)));
newC{destarrays(destidx)} = [newC{destarrays(destidx)}, dist.value(row)];
end
celldisp(newC)
I'm using a table here, which is not the most efficient container in matlab, but that make it easier to understand the code.
Best Answer