Hi,
I am trying to run a binary image dilation on the GPU for a large image. I need to repeat this operation several thousand times and save each result. I am not sure why it takes significantly longer (about 10x) on the GPU than on the CPU. Below is the sample code I ran, followed by the timings I obtained. Please advise whether this is the expected behavior, and why. Also, if I only need to save the result right away, do I need to "gather" it before saving?
My machine has two 10-core CPUs and 128GB of RAM. My GPU is a Tesla K20c. I am using MATLAB R2013a.
Thanks.
% Sample code.
% ------------
useGPU = 1;
if useGPU
    gpu = gpuDevice(1);
    reset(gpu);
end
boston = geotiffread('boston.tif');
edgeImage = edge(rgb2gray(boston), 'canny');
% Creating a large image.
bwImage = repmat(edgeImage, 3, 2);
for i = 1 : 10
    % Create mockup binary data.
    mask = rand(501) > 0.99;
    if useGPU
        maskOnGPU = gpuArray(mask);
        bwImageOnGPU = gpuArray(bwImage);
        t1 = clock;
        bwResultOnGPU = imdilate(bwImageOnGPU, maskOnGPU, 'same');
        wait(gpu);
        t2 = clock;
        bwResult = gather(bwResultOnGPU);
        duration = sprintf('%2.2f', etime(t2, t1));
        msg = strcat(['GPU Iteration #', int2str(i),': ''imdilate'' took ', duration, ' seconds']);
    else
        t1 = clock;
        bwResult = imdilate(bwImage, mask, 'same');
        t2 = clock;
        duration = sprintf('%2.2f', etime(t2, t1));
        msg = strcat(['CPU Iteration #', int2str(i),': ''imdilate'' took ', duration, ' seconds']);
    end
    disp(msg);
    [~, fileName] = fileparts(tempname); % Mockup file name
    save(fileName, 'bwResult', '-v7.3');
end

% CPU Iteration #1: 'imdilate' took 24.79 seconds
% CPU Iteration #2: 'imdilate' took 27.06 seconds
% CPU Iteration #3: 'imdilate' took 31.47 seconds
% CPU Iteration #4: 'imdilate' took 29.31 seconds
% CPU Iteration #5: 'imdilate' took 32.48 seconds
% CPU Iteration #6: 'imdilate' took 32.05 seconds
% CPU Iteration #7: 'imdilate' took 32.79 seconds
% CPU Iteration #8: 'imdilate' took 31.57 seconds
% CPU Iteration #9: 'imdilate' took 32.72 seconds
% CPU Iteration #10: 'imdilate' took 27.76 seconds
%
% GPU Iteration #1: 'imdilate' took 253.57 seconds
% GPU Iteration #2: 'imdilate' took 256.04 seconds
% GPU Iteration #3: 'imdilate' took 258.49 seconds
% GPU Iteration #4: 'imdilate' took 255.58 seconds
% GPU Iteration #5: 'imdilate' took 257.73 seconds
% GPU Iteration #6: 'imdilate' took 252.24 seconds
% GPU Iteration #7: 'imdilate' took 260.87 seconds
% GPU Iteration #8: 'imdilate' took 254.85 seconds
% GPU Iteration #9: 'imdilate' took 254.65 seconds
% GPU Iteration #10: 'imdilate' took 251.99 seconds
Best Answer