Hello. I'm trying to understand how much memory is needed to perform an FFT, and whether the requirement differs when the FFT runs on a GPU.
For instance, it appears I can only use up to about 67% of my GPU memory before an error is thrown. I can't seem to go above this value:
clear all
Nx = 256;
Ny = 256;
Nz = 512;
A = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); A = gpuArray(A); A = fftn(A);
B = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); B = gpuArray(B); B = fftn(B);
C = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); C = gpuArray(C); C = fftn(C);
D = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); D = gpuArray(D); D = fftn(D);
E = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); E = gpuArray(E); E = fftn(E);
F = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); F = gpuArray(F); F = fftn(F);
G = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); G = gpuArray(G); G = fftn(G);
H = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); H = gpuArray(H); H = fftn(H);
I = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); I = gpuArray(I); I = fftn(I);
J = rand(Nx,Ny,Nz)+1i*rand(Nx,Ny,Nz); J = gpuArray(J); J = fftn(J);

bytes = 16;               % Bytes used for one complex double
Tbytes = Nx*Ny*Nz*bytes;  % Total number of bytes per array
NoTran = 10;              % Number of FFT transforms in memory
GPUmem = 8e9;             % 8 GBytes of GPU memory
% Theoretical percentage of GPU memory used with all transforms
percent = (Tbytes/GPUmem)*NoTran*100;
ans = 67.1089
If I add another matrix, let's say 'K', in the same way the other matrices were constructed, an error is thrown.
If I query the GPU, I obtain a different answer than my calculation:
gpuDevice
ans = CUDADevice with properties:
    Name: 'GeForce RTX 2070 with Max-Q Design'
    Index: 1
    ComputeCapability: '7.5'
    SupportsDouble: 1
    DriverVersion: 1.0200e+01
    ToolkitVersion: 1.0100e+01
    MaxThreadsPerBlock: 1024
    MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
    MaxGridSize: [2.1475e+09 65535 65535]
    SIMDWidth: 32
    TotalMemory: 8.5899e+09
    AvailableMemory: 1.5127e+09

TotalMemory = 8.5899e+09
AvailableMemory = 1.5127e+09
% Percentage of GPU memory used
percent = (1 - AvailableMemory/TotalMemory)*100
ans = 82.390
This answer is somewhat confusing as I made sure to only enable my computer's integrated graphics rather than the GPU. Making changes to this setting in NVIDIA control panel does not appear to change 'AvailableMemory' if I rerun all the matrices and check available memory.
So my calculation for 'Tbytes' is wrong as it appears more memory is being used. Additionally, it appears there are 8.6 GBytes of total memory available on the GPU – I'm not going to complain about that.
So, how much additional memory is needed to perform a 3D FFT in MATLAB beyond the starting matrix, and does performing it on a GPU make a difference?
That is, some matrix A of complex numbers with size (Nx*Ny*Nz) should theoretically require (Nx*Ny*Nz)*16 bytes of memory. However, to do a 3D FFT on that matrix, I believe it should require at least double that amount of memory once the transform matrix is counted (including the zeros of that transform matrix). But it seems even more memory than that is required.
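One way to check this directly is to read 'AvailableMemory' before and after each step for a single array, rather than inferring usage from ten arrays at once. The following is only a minimal sketch under some assumptions: it builds the array on the GPU with rand(...,'gpuArray') instead of copying it from the host, it calls reset on the device (which invalidates any gpuArray data already there), and the variable names are illustrative. The reported figures may include FFT workspace and memory the runtime keeps in reserve, so they need not match the theoretical size exactly.

```matlab
% Sketch: compare the expected size of one array with what the GPU reports.
Nx = 256; Ny = 256; Nz = 512;
bytesPerElement = 16;                        % complex double: 2 x 8 bytes
bytesPerArray = Nx*Ny*Nz*bytesPerElement;    % 536870912 bytes = 0.5 GiB

if gpuDeviceCount > 0
    g = gpuDevice;                           % currently selected GPU
    reset(g);                                % WARNING: clears all gpuArray data on the device
    memStart = g.AvailableMemory;

    % Build the complex array directly on the GPU (assumption: no host copy needed)
    A = complex(rand(Nx,Ny,Nz,'gpuArray'), rand(Nx,Ny,Nz,'gpuArray'));
    wait(g);                                 % block until queued GPU work has finished
    memAfterAlloc = g.AvailableMemory;

    A = fftn(A);                             % same in-place style as the code above
    wait(g);
    memAfterFFT = g.AvailableMemory;

    fprintf('Expected per array : %.3f GiB\n', bytesPerArray/2^30);
    fprintf('Used after creation: %.3f GiB\n', (memStart - memAfterAlloc)/2^30);
    fprintf('Used after fftn    : %.3f GiB\n', (memStart - memAfterFFT)/2^30);
end
```

If the "used after fftn" figure is noticeably larger than the expected 0.5 GiB per array, the difference is a rough indication of the extra memory held after the transform on this particular device and MATLAB release.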
Best Answer