MATLAB: Speeding up code which integrates a 2D gpuArray matrix using Simpson’s Rule

I have a (real) 2D gpuArray, which I am using as part of a larger code, and now am trying to also integrate the array using the Composite Simpson Rule inside my main loop (several 10000 iterations at least). A MWE looks like the following:

%%%%%%%%%%%%%%%%%% MAIN CODE %%%%%%%%%%%%%%%%%%
Ny = 501;    % Dimensions of matrix M
Nx = 501;    %

dx = 0.1;    % Grid spacings
dy = 0.2;    %
M = rand(Ny, Nx, 'gpuArray'); % Initialise a  matrix
for k = 1:10000
    % M = function1(M)  % Apply some other functions to M
    % ... etc ...
    I = simpsons_integration_2D(M, dx, dy, Nx, Ny); % Now integrate M
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%% INTEGRATOR %%%%%%%%%%%%%%%%%%
function I = simpsons_integration_2D(F, dx, dy, Nx, Ny)
% Integrate the 2D function F with Nx columns and Ny rows, and grid spacings
% dx and dy using Simpson's rule.
% Integrate along x direction (vertically) --> IX is a vector afterwards
sX = sum( F(:,1:2:Nx-2) + 4*F(:,2:2:(Nx-1)) + F(:,3:2:Nx) , 2);
IX = dx/3 * sX;
% Integrate along y direction --> I is a scalar afterwards
sY = sum( IX(1:2:Ny-2) + 4*IX(2:2:(Ny-1)) + IX(3:2:Ny) , 1);
I = dy/3 * sY;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The operation of performing the integration is around 850 µs, which is currently a significant part of my code. This was measured using

f = @() simpsons_integration_2D(M, dx, dy, Nx, Ny);
t = gputimeit(f)

Is there a way to reduce the execution time for integrating the gpuArray matrix?

(The graphics card is the Nvidia Quadro P4000)

function I = simpsons_integration_2D_v2(F, dx, dy, Nx, Ny) % Avoid summation of large intermediate arrays: sX = F(:,1) + 2 * sum(F(:, 3:2:(Nx-2)), 2) + ... 4 * sum(F(:, 2:2:(Nx-1)), 2) + F(:,Nx); sY = sX(1) + 2 * sum(sX(3:2:(Ny-2))) + ... 4 * sum(sX(2:2:(Ny-1))) + sX(Ny); I = dx * dy / 9 * sY; end function I = simpsons_integration_2D_v3(F, dx, dy, Nx, Ny) % Perform summation as dot product: v = [1, repmat([4, 2], 1, (Nx - 3) / 2), 4, 1]; sX = F * v'; if Nx ~= Ny v = [1, repmat([4, 2], 1, (Ny - 3) / 2), 4, 1]; end sY = v * sX; I = dx * dy / 9 * sY; end

Best Answer

These are faster at least on CPU - I do not have graphics hardware in my laptop which supports GPU arrays:

Compared timeit output on my mobil i5:

 0.00111591015   original
 0.00066621015   without large intermediate array
 0.00012221015   as matrix multiplication

So this is faster on my old 2 core i5 than on your GPU. Maybe this gives a speedup on your GPU also. I assume, in version v3 the vector v must be created on the GPU also.

Best Answer

Related Solutions

MATLAB: How much additional memory is needed to perfrom a 3D FFT other than matrix to be transformed? GPU application.

MATLAB: How to evaluate a double integral using the trapezoidal rule equation

Related Question