MATLAB: How to multiply matrices using multiple GPUs

bsxfun, gpu, parfor

Hi All,
I'm new to MATLAB, so apologies for any basic mistakes.
I'm trying to multiply matrices on multiple GPUs and then compare the computation time with running the same code on 1 GPU and on the CPU. The machine has 5 GPUs, and the code is this:
matrixSize = 4000;
gpuDevice([]);
nGPUs = gpuDeviceCount();
parpool('local', nGPUs);
p = gcp;
spmd
    gd = gpuDevice;
    idx = gd.Index;
    disp(['Using GPU ',num2str(idx)]);
end
% 5 GPUs
parfor i = 1:p.NumWorkers
    gd = gpuDevice;
    XGs{i} = rand(matrixSize,'gpuArray');
    XGs_A{i} = XGs{i} * XGs{i};
    XGs_B{i} = XGs{i} / XGs{i};
    XGs_C{i} = @() bsxfun(@times, XGs_A{i}, XGs_B{i});
    wait(gd);
end
time5GPUs = gputimeit(XG_C)
% 1 GPU
parfor i = 1:p.NumWorkers
    XG{i} = rand(matrixSize,'gpuArray');
    XG_A{i} = XG{i} * XG{i};
    XG_B{i} = XG{i} / XG{i};
    XG_C{i} = @() bsxfun(@times, XG_A{i}, XG_B{i});
end
time1GPU = gputimeit(XG_C)
% CPU
for i = 1:p.NumWorkers
    X{i} = rand(matrixSize);
    X_A{i} = X{i} * X{i};
    X_B{i} = X{i} / X{i};
    X_C{i} = @() bsxfun(@times, XG_A{i}, XG_B{i});
end
timeCPU = timeit(X_C)
When I run it, the error I get is
Error: The variable XGs_A in a parfor cannot be classified.
See Parallel for Loops in MATLAB, "Overview".
How can I solve this problem? And is there a better way to do this?

Best Answer

To start with, XGs_C and XG_C are both cell arrays, but gputimeit() accepts a function handle, not a cell array.
Also, the first gputimeit call (line 20 of your code, right after the "% 5 GPUs" loop) uses XG_C, which is not defined until line 26. I think you meant XGs_C.
And on line 34 (the serial loop in the "% CPU" section) you are accessing XG_A and XG_B, where I think you meant X_A and X_B.
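To illustrate the first point, here is a minimal sketch of how gputimeit is called. It assumes Parallel Computing Toolbox and a supported GPU; the names A, B, f, and handles are illustrative only and are not taken from your code. If you do end up with a cell array of handles, each element has to be timed separately, for example with cellfun.

```matlab
% Sketch only: assumes Parallel Computing Toolbox and one supported GPU.
A = rand(1000, 'gpuArray');
B = rand(1000, 'gpuArray');

% gputimeit takes a single function handle...
f = @() A .* B;
t = gputimeit(f);

% ...so a cell array of handles must be timed one element at a time.
handles = {@() A * B, @() A .* B};     % illustrative cell array of handles
times = cellfun(@gputimeit, handles);  % one timing (in seconds) per handle
```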
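You can get your code working by changing it to the following, which uses plain temporary variables inside each loop instead of indexed cell arrays and calls gputimeit inside the loop: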
matrixSize = 4000;
gpuDevice([]);                 % deselect any GPU currently selected on the client
nGPUs = gpuDeviceCount();
% parpool('local', nGPUs);     % uncomment if no pool is open yet (one worker per GPU)
p = gcp;                       % current pool (may start the default pool if none is open)
spmd
    gd = gpuDevice;
    idx = gd.Index;
    disp(['Using GPU ',num2str(idx)]);
end
% 5 GPUs
parfor i = 1:p.NumWorkers
    gd = gpuDevice;
    XGs = rand(matrixSize,'gpuArray');
    XGs_A = XGs * XGs;
    XGs_B = XGs / XGs;
    XGs_C = @() bsxfun(@times, XGs_A, XGs_B);   % a function handle, as gputimeit expects
    wait(gd);                                    % let pending GPU work finish before timing
    time5GPUs{i} = gputimeit(XGs_C);             % one timing per iteration
end
% 1 GPU
parfor i = 1:p.NumWorkers
    XG = rand(matrixSize,'gpuArray');
    XG_A = XG * XG;
    XG_B = XG / XG;
    XG_C = @() bsxfun(@times, XG_A, XG_B);
    time1GPU{i} = gputimeit(XG_C);
end
% CPU
for i = 1:p.NumWorkers
    X = rand(matrixSize);
    X_A = X * X;
    X_B = X / X;
    X_C = @() bsxfun(@times, X_A, X_B);          % uses the CPU arrays X_A and X_B
    timeCPU{i} = timeit(X_C);
end
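One caveat: the "% 1 GPU" loop above is still a parfor, so if the pool has one worker per GPU, its iterations are spread over all the GPUs rather than run on a single one. If you want a true single-GPU baseline, one option is to run the same work serially on the client. The following is only a sketch under that assumption; nRuns is a hypothetical name chosen to match the 5 workers in your setup, and .* is used in place of bsxfun(@times, ...) since the two operands have the same size.

```matlab
% Sketch only: assumes Parallel Computing Toolbox and at least one supported GPU.
matrixSize = 4000;
nRuns = 5;                      % hypothetical: same number of runs as workers above

gpuDevice(1);                   % select GPU 1 on the client
time1GPU = zeros(1, nRuns);
for k = 1:nRuns                 % plain for loop, so everything runs on the one GPU
    XG = rand(matrixSize, 'gpuArray');
    XG_A = XG * XG;
    XG_B = XG / XG;
    time1GPU(k) = gputimeit(@() XG_A .* XG_B);
end
total1GPU = sum(time1GPU);      % total of the per-run timings, in seconds
```

Keep in mind that gputimeit only times the function handle you pass it, so in both versions the matrix multiply and divide themselves are outside the measurement.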
Hope that solves your problem.