MATLAB: Why does gather() take only about 0.001 s in the Command Window but about 1 s inside a loop?

gpu gather() function

Hi everyone,
I wrote an algorithm to solve a set of PDEs. There are some very long equations in the middle of a while loop, and I tried to calculate them on the GPU to speed things up; the results of these equations are then used in later calculations. It runs successfully; however, the gather() function takes almost 1/5 of the total time (3000 s). When I run "[dk1,dk2,dk3,dk4,dk5,dk6,dk7,dk8,dk9] = dkfunction(S,T)" in the Command Window, it costs just 0.001 s. I am confused about why this code performs so badly inside a big loop. Is there a way to speed up the gather() function, or is there an alternative function? By the way, does the size of an array in a MEX file have to be constant or not?
(Note: my code is very long, so I attach the profiler results and the core part (simplified) of my question. dkfunction is CUDA-related.)
while t < 24*3600*250
    %%%
    % Omitted code for S and T calculation.
    %%%
    [dk1,dk2,dk3,dk4,dk5,dk6,dk7,dk8,dk9] = dkfunction(S,T);
    [dkvTdT,dkvTdS,dkvhdT,dkvhdS,dphaivdT,dphaivdS,dkhdS,dkdS,dphaidS] = gather(dk1,dk2,dk3,dk4,dk5,dk6,dk7,dk8,dk9);
    %%%
    % Omitted code that uses dkvTdT..... to calculate other variables and the t steps.
    %%%
end

Best Answer

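gather waits for the GPU to finish. GPU work is queued asynchronously, so a call such as dkfunction can return before its results have actually been computed. When you were working on the command line, you had already started the GPU work, and it probably took you a couple of seconds to type the timed gather command; in the meantime the GPU kept working and already had the answer, so gather only had to copy it back. Inside the loop, gather is called immediately after dkfunction, so it has to wait for the computation to finish, and the profiler charges that waiting time to gather.
Also, you should use gputimeit() for GPU timing.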
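To see this effect yourself, here is a minimal sketch. It assumes Parallel Computing Toolbox and a supported GPU. The function f is a hypothetical stand-in for dkfunction (the real one is not shown in the question), and the 2000-by-2000 array size is arbitrary. It compares naive tic/toc timing with synchronized timing using gputimeit and wait(gpuDevice).

```matlab
% Assumes Parallel Computing Toolbox and a supported GPU.
% f is a hypothetical stand-in for dkfunction, not the poster's code.
f = @(S, T) S .* exp(-T) + sqrt(abs(S - T));

S = rand(2000, 'gpuArray');   % arbitrary test size
T = rand(2000, 'gpuArray');

% Naive timing: toc may fire before the GPU has finished f,
% so the leftover compute time can end up inside the gather timing.
tic; R = f(S, T); tLaunch = toc;
tic; Rhost = gather(R); tGatherNaive = toc;

% Synchronized timing of the computation alone.
tCompute = gputimeit(@() f(S, T));

% Synchronized timing of the transfer alone: finish all queued GPU
% work first, then time only the device-to-host copy.
R = f(S, T);
wait(gpuDevice);
tic; Rhost = gather(R); tTransfer = toc;

fprintf('naive: launch %.4g s, gather %.4g s\n', tLaunch, tGatherNaive);
fprintf('synchronized: compute %.4g s, transfer %.4g s\n', tCompute, tTransfer);
```

If the synchronized transfer time is small compared with the naive gather time, most of what the profiler reports under gather is really the GPU computation, and speeding up gather itself would not help much.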