MATLAB: Is it normal for the first computation on the GPU to take longer than later runs? (Running the code for the first time on the GPU, or after resetting the device, takes longer, but after 2 or 3 runs it shows a short time.)

gpu computational time

% vector algebra on CPU
t = tic();
v1 = Vector(X,Y,Z,Vx,Vy,Vz);
v11 = Vector(X,-Y,Z,-Vx,Vy,Vz);
v111 = v1+v11;
v1111 = v1*v11;
cpuTime = toc( t )
% vector algebra on GPU
t = tic();
v2 = Vector(gpuArray(X),gpuArray(Y),gpuArray(Z),gpuArray(Vx),gpuArray(Vy),gpuArray(Vz));
v22 = Vector(gpuArray(X),gpuArray(-Y),gpuArray(Z),gpuArray(-Vx),gpuArray(Vy),gpuArray(Vz));
v222 = v2+v22;
v2222 = v2*v22;
v222 = gather( v222 ); % Fetch the data back from the GPU
v2222 = gather( v2222 );
gpuTime = toc( t )

Best Answer

Firstly, yes, it's perfectly normal. The first time you call a GPU function, the GPU libraries are loaded into MATLAB, which takes several seconds. The first time you create data on a newly reset device, memory has to be allocated, which takes time; the next time, MATLAB reuses device memory from its allocation pool, so it won't take as long. And when you do a group of element-wise operations such as +, some just-in-time compilation goes on that won't need to happen again the second time around.
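The warm-up effect described above can be observed with a small experiment. This is only a sketch: the array size N and the element-wise expression are illustrative choices, not taken from the question, and the exact timings depend on the hardware and MATLAB release.

```matlab
% Sketch: time the same GPU work three times in a row.
% N and the expression below are illustrative assumptions.
N = 1e6;
times = zeros(1, 3);
for k = 1:3
    t = tic();
    a = rand(N, 1, 'gpuArray');   % create data directly on the device
    b = a + 2 .* a;               % a group of element-wise operations
    wait(gpuDevice);              % block until the GPU has finished
    times(k) = toc(t);
end
disp(times)   % the first entry is typically the largest
```

Calling reset(gpuDevice) before the loop should bring back part of the first-run cost, since device memory has to be allocated again.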
Secondly, make sure you are measuring time on the GPU correctly, either by calling wait(gpuDevice) before and after your GPU code or by using gputimeit. GPU operations run asynchronously, so this ensures that the operations being timed are exactly the ones you intended to time. See the MATLAB documentation on measuring GPU performance.
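As a rough illustration of both timing approaches, here is a minimal sketch. The input X is a random placeholder rather than the data from the question, and the expression being timed is an arbitrary element-wise example.

```matlab
% Sketch: two ways to time GPU code. X is placeholder data (an assumption).
X = rand(1e6, 1);
dev = gpuDevice();
gX = gpuArray(X);          % transfer once, outside the timed region

% Option 1: tic/toc with explicit synchronization
wait(dev);                 % make sure earlier GPU work has finished
t = tic();
gY = gX + gX .* gX;
wait(dev);                 % make sure the timed work has finished
gpuTime = toc(t);

% Option 2: gputimeit handles synchronization and repeated runs itself
f = @() gX + gX .* gX;
gpuTimeIt = gputimeit(f);

fprintf('tic/toc: %g s, gputimeit: %g s\n', gpuTime, gpuTimeIt);
```

Because gputimeit runs the function several times, its result mostly reflects the steady-state cost rather than the one-time startup overhead.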
Thirdly, I don't know what X, Y, Z, etc. are in your code, but your use of this function (or class?) Vector implies that they may be scalars. If they are scalars and Vector represents some kind of numeric array, then I'd like to know more about what you're doing, because on the face of it you should be using a normal MATLAB array ( [X, Y, Z, Vx, Vy, Vz] ). For instance, why are you calling gpuArray on individual scalar values rather than sending all your data to the GPU in a single array ( v2 = gpuArray(v1) )? The GPU is only worth using if you are processing arrays with many thousands of values.
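If the inputs can be stored as plain numeric columns, the single-transfer idea might look like the sketch below. Everything here is an assumption: the inputs are random placeholders, the Vector class from the question is not used, and its * operator is stood in for by an element-wise product because its real meaning is unknown.

```matlab
% Sketch: one host-to-device transfer instead of many small ones.
% All inputs are placeholders; element-wise .* is an assumed stand-in for Vector's *.
N = 1e5;
X = rand(N, 1);  Y = rand(N, 1);  Z = rand(N, 1);
Vx = rand(N, 1); Vy = rand(N, 1); Vz = rand(N, 1);

data  = [X, Y, Z, Vx, Vy, Vz];      % N-by-6 array on the host
gData = gpuArray(data);             % single transfer to the GPU

signs    = [1, -1, 1, -1, 1, 1];    % negate the Y and Vx columns
gFlipped = gData .* signs;          % implicit expansion across rows

gSum  = gData + gFlipped;           % element-wise sum on the GPU
gProd = gData .* gFlipped;          % element-wise product on the GPU

result = gather(gSum);              % fetch the result back to the host
```

With many thousands of rows per column, the transfer and kernel launch overheads are spread over much more work than in the scalar case.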