I have a (real) 2D gpuArray that I am using as part of a larger code, and I am now trying to integrate this array with the composite Simpson's rule inside my main loop (at least several tens of thousands of iterations). A MWE looks like the following:
%%%%%%%%%%%%%%%%%% MAIN CODE %%%%%%%%%%%%%%%%%%
Ny = 501; % Dimensions of matrix M
Nx = 501; %
dx = 0.1; % Grid spacings
dy = 0.2; %
M = rand(Ny, Nx, 'gpuArray'); % Initialise a matrix
for k = 1:10000
    % M = function1(M) % Apply some other functions to M
    % ... etc ...
    I = simpsons_integration_2D(M, dx, dy, Nx, Ny); % Now integrate M
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%% INTEGRATOR %%%%%%%%%%%%%%%%%%
function I = simpsons_integration_2D(F, dx, dy, Nx, Ny)
% Integrate the 2D function F with Nx columns and Ny rows, and grid spacings
% dx and dy using Simpson's rule.
% Integrate along x direction (vertically) --> IX is a vector afterwards
sX = sum( F(:,1:2:Nx-2) + 4*F(:,2:2:(Nx-1)) + F(:,3:2:Nx) , 2);
IX = dx/3 * sX;
% Integrate along y direction --> I is a scalar afterwards
sY = sum( IX(1:2:Ny-2) + 4*IX(2:2:(Ny-1)) + IX(3:2:Ny) , 1);
I = dy/3 * sY;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
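For reference, the sums in the integrator above amount to a weighted sum of M with the composite Simpson weights 1, 4, 2, 4, ..., 2, 4, 1 in each direction. The following is only a minimal sketch of that equivalence, assuming Nx and Ny are odd (as they are here); `simpson_weights`, `wx`, `wy`, `I_weights`, `I_ref` and `rel_err` are hypothetical names introduced for illustration. It runs on the CPU, and whether this form is any faster on a gpuArray has not been measured here.

```matlab
% Sketch: composite Simpson's rule written as a weighted sum.
% Assumes Nx and Ny are odd (an even number of intervals per direction).
Ny = 501; Nx = 501;            % Dimensions of matrix M
dx = 0.1; dy = 0.2;            % Grid spacings
M = rand(Ny, Nx);              % CPU array; pass 'gpuArray' to rand to use the GPU

% Hypothetical helper: column vector of weights 1,4,2,4,...,2,4,1 of length N
simpson_weights = @(N) [1; repmat([4; 2], (N-3)/2, 1); 4; 1];

wx = dx/3 * simpson_weights(Nx);   % Nx-by-1 weights for the x direction
wy = dy/3 * simpson_weights(Ny);   % Ny-by-1 weights for the y direction

% Integrate along x (M*wx is Ny-by-1), then along y (result is a scalar)
I_weights = wy.' * (M * wx);

% Reference value: the same slicing-based sums as in simpsons_integration_2D
IX = dx/3 * sum( M(:,1:2:Nx-2) + 4*M(:,2:2:(Nx-1)) + M(:,3:2:Nx) , 2);
I_ref = dy/3 * sum( IX(1:2:Ny-2) + 4*IX(2:2:(Ny-1)) + IX(3:2:Ny) , 1);

% Relative difference between the two forms (expected to be at round-off level)
rel_err = abs(I_weights - I_ref) / abs(I_ref);
```

Because `wx` and `wy` depend only on the grid, they could be built once outside the main loop, leaving two matrix-vector products per iteration.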
The integration takes around 850 µs per call, which is currently a significant part of my code's run time. This was measured using
f = @() simpsons_integration_2D(M, dx, dy, Nx, Ny);
t = gputimeit(f)
Is there a way to reduce the execution time for integrating the gpuArray matrix?
(The graphics card is the Nvidia Quadro P4000)
Best Answer