MATLAB: Comparing columns of two sparse matrices

sparse matrix

Hello. I want to do the following… Input: two sparse matrices A,B. Output: sparse matrix C, with C(i,j)=1 exactly when the ith column of A is equal to the jth column of B. The quickest code I have so far is this one:
r=[];
c=[];
for i=1:size(A,2)
for j=1:size(B,2)
if isequal(A(:,i),B(:,j))
r=[r,i];
c=[c,j];
end
end
end
C=sparse(r,c,ones(1,length(r)));
But for big matrices it gets very slow. Is there a way to make it run faster?
cheers

Best Answer

Solution One: Simple
Here is a simple solution using ndgrid and arrayfun (but slow):
>> A = [1,1,0,0; 1,0,1,0]
A =
1 1 0 0
1 0 1 0
>> B = [0,1,0,1; 0,0,1,1]
B =
0 1 0 1
0 0 1 1
>> [X,Y] = ndgrid(1:size(A,2),1:size(B,2));
>> arrayfun(@(a,b)isequal(A(:,a),B(:,b)), X, Y)
ans =
0 0 0 1
0 1 0 0
0 0 1 0
1 0 0 0
Solution Two: Fastest
Using two nested loops is usually going to be much faster, as long as the output array is preallocated before the loops. Not preallocating arrays is the number one cause of beginners writing slow code. When arrays are not preallocated, such as in your code, on every loop iteration MATLAB has to check the array size and move the array to a new memory location if it has grown larger. This effect gets even more significant as the arrays get larger, as you have discovered
To solve this I simply preallocated the array before the loop, and so it does not grow inside the loops and the code is much much faster.
out = nan(size(A,2),size(B,2));
for a = 1:size(A,2)
for b = 1:size(B,2)
out(a,b) = isequal(A(:,a),B(:,b));
end
end
Do NOT expand arrays inside loops! Every time you write code like this, by concatenating values onto an array:
Vec = []
for ...
Vec = [Vec,new_value]
end
you are writing slow code. Learn to preallocate arrays!