MATLAB: Nchoose2: save output in chunks

Tags: nchoosek; nchoose2; cell array

Hi everyone, I have a cell array N of size (m,n) with mixed numeric/string content (the first row is a header). I would like to create all combinations without repetition of every row i with every other row j ≠ i. I am doing this with the user-written function nchoose2:
ind = nchoose2(1:size(N, 1)-1);
Unfortunately, my cell array is so large that computing ind produces an out-of-memory error. Can I generate the output of nchoose2 (I wouldn't mind using nchoosek instead) in chunks? For example, generate the first 50k rows of ind, process them, delete them, and then move on to the next 50k?

Best Answer

Neither nchoosek nor nchoose2 lets you return only a portion of the output.
You can always generate the output using a loop and break out whenever you want:
function [rowcombination, nextfirstrow, nextsecondrow] = choose2row(in, maxrows, startfirstrow, startsecondrow)
%CHOOSE2ROW create every combination of 2 rows of a matrix/cell array
%The function can return a portion of the output and be called again to return the next portion.
%The function uses double loops to compute all combinations.
%Outputs:
% rowcombination: matrix/cell array where each row is the concatenation of two distinct rows of the original matrix/cell array.
% nextfirstrow:
% nextsecondrow: parameters to pass back to a subsequent call to CHOOSE2ROW to return the next portion of row combinations. Both are Inf once every combination has been returned.
%Inputs:
% in: input matrix/cell array of size [m, n].
% maxrows: maximum number of rows of output rowcombination. Inf for no limit. Scalar, optional. default Inf.
% startfirstrow: outer loop start index. Scalar, optional. default 1.
% startsecondrow: inner loop start index. Scalar, optional. default startfirstrow + 1.
if nargin < 2 || maxrows == Inf
    maxrows = Inf;
else
    validateattributes(maxrows, {'numeric'}, {'scalar', 'positive', 'integer'}, 2);
end
if nargin < 3
    startfirstrow = 1;
else
    validateattributes(startfirstrow, {'numeric'}, {'scalar', 'positive', 'integer', '<', size(in, 1)}, 3);
end
if nargin < 4
    startsecondrow = startfirstrow + 1;
else
    validateattributes(startsecondrow, {'numeric'}, {'scalar', 'positive', 'integer', '<=', size(in, 1), '>', startfirstrow}, 4);
end
nrows = (size(in, 1) - startfirstrow + 1) * (size(in, 1) - startfirstrow) / 2 - (startsecondrow - startfirstrow - 1); %total size of output still to generate
rowcombination = repmat(in(1, :), min(nrows, maxrows), 2); %initialise output to required size
rowout = 1;
for nextfirstrow = startfirstrow : size(in, 1)-1
    for nextsecondrow = startsecondrow : size(in, 1)
        rowcombination(rowout, :) = [in(nextfirstrow, :), in(nextsecondrow, :)];
        rowout = rowout + 1;
        if rowout > maxrows
            nextsecondrow = nextsecondrow + 1; %#ok<FXSET> exiting the loop
            if nextsecondrow > size(in, 1)
                nextfirstrow = nextfirstrow + 1; %#ok<FXSET>
                nextsecondrow = nextfirstrow + 1; %#ok<FXSET>
                if nextfirstrow == size(in, 1)
                    nextfirstrow = Inf; %#ok<FXSET>
                    nextsecondrow = Inf; %#ok<FXSET>
                end
            end
            return
        end
    end
    startsecondrow = nextfirstrow + 2;
end
nextfirstrow = Inf;
nextsecondrow = Inf;
end
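A minimal sketch of how the function might be called repeatedly, assuming choose2row is saved as choose2row.m on the path. The sample cell array N, the chunk size of 4 and the row counter are illustrative placeholders only; the real processing would go where the comment indicates.

```matlab
% Hypothetical sample data: a header row followed by four data rows.
N = [{'id', 'name'}; {1, 'a'}; {2, 'b'}; {3, 'c'}; {4, 'd'}];
data = N(2:end, :);          % drop the header row before combining
chunksize = 4;               % illustrative; the question suggests 50k
firstrow = 1;                % same defaults that choose2row uses
secondrow = 2;
total = 0;                   % counts the combinations seen so far
while ~isinf(firstrow)       % choose2row returns Inf when nothing is left
    [chunk, firstrow, secondrow] = choose2row(data, chunksize, firstrow, secondrow);
    % ... process chunk here; it is overwritten on the next iteration ...
    total = total + size(chunk, 1);
end
```

Only one chunk is held in memory at a time, because chunk is overwritten on each pass through the loop. The loop has to stop as soon as Inf comes back, since passing Inf as a start index would fail the input validation.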
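Of course, you're trading performance for memory: the double loop will generally be slower than a vectorised nchoosek or nchoose2 call, but only one chunk of the output has to be held in memory at a time.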
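If only the index pairs are needed (as with ind in the question), another option is to compute each chunk of pairs directly from its position in the list, without a loop. The following is a rough sketch of that different technique, not part of the answer above. The function name pairchunk and its arguments are hypothetical, and it assumes the pairs are numbered in the row order of nchoosek(1:n, 2). It relies on floating-point sqrt, so the results should be checked for very large n.

```matlab
function pairs = pairchunk(n, kstart, kcount)
%PAIRCHUNK index pairs number kstart .. kstart+kcount-1 out of all pairs of 1:n
% Hypothetical helper for illustration only.
% Pairs are numbered in the order [1 2], [1 3], ..., [1 n], [2 3], ..., [n-1 n].
total = n * (n - 1) / 2;                               % total number of pairs
k = (kstart : min(kstart + kcount - 1, total)).' - 1;  % zero-based pair numbers in this chunk
% zero-based first index, found by inverting the triangular-number count
i = n - 2 - floor(sqrt(4 * n * (n - 1) - 8 * k - 7) / 2 - 0.5);
% zero-based second index, from the offset of k within the block of pairs starting with i
j = k + i + 1 - total + (n - i) .* (n - i - 1) / 2;
pairs = [i, j] + 1;                                    % convert back to one-based indices
end
```

With this sketch, pairchunk(n, 1, 50000) would give the first 50k index pairs, pairchunk(n, 50001, 50000) the next 50k, and so on. A request that starts past the last pair returns an empty 0-by-2 array.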