I have a GPU with about 2 GB of available memory:
```matlab
CUDADevice with properties:

                      Name: 'Quadro K1100M'
                     Index: 1
         ComputeCapability: '3.0'
            SupportsDouble: 1
             DriverVersion: 6.5000
            ToolkitVersion: 6.5000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 2.1475e+09
           AvailableMemory: 2.0154e+09
       MultiprocessorCount: 2
              ClockRateKHz: 705500
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
          CanMapHostMemory: 1
           DeviceSupported: 1
            DeviceSelected: 1
```
However, I'd like to load a sparse array into it (R2015a, which supports sparse gpuArray):
```matlab
>> whos('pxe')
  Name         Size                    Bytes  Class     Attributes

  pxe      5282400x5282400        1182580904  double    sparse, complex
```
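As a sanity check on that `Bytes` figure: assuming MATLAB stores sparse matrices in compressed sparse column (CSC) form with 8-byte row indices, `8*(ncols+1)` bytes of column pointers, and 16 bytes per complex double value (the layout is my assumption, not something stated in the post), the reported byte count implies an integer nonzero count, and the array should indeed fit in well under 60% of the available device memory:

```python
# Sanity check on the whos() numbers. Assumption (mine): MATLAB stores sparse
# matrices in CSC form, with 8-byte row indices, 8*(ncols+1) bytes of column
# pointers, and 16 bytes per complex double value.
n = 5282400                # pxe is n-by-n
reported_bytes = 1182580904
available = 2.0154e9       # AvailableMemory from gpuDevice

col_ptrs = 8 * (n + 1)             # column pointer array
bytes_per_nnz = 16 + 8             # complex value + row index
nnz = (reported_bytes - col_ptrs) // bytes_per_nnz
print(nnz)                                               # -> 47513404 nonzeros
print(col_ptrs + bytes_per_nnz * nnz == reported_bytes)  # -> True, layout fits exactly
print(reported_bytes / available)                        # -> ~0.59, under 60% as stated
```

The byte count decomposing exactly under this layout suggests the host-side size really is what `whos` reports, so the failure is not a host bookkeeping issue.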
I get an error upon trying to copy it to the GPU, though:
```matlab
>> gpxe = gpuArray(pxe);
Error using gpuArray
An unexpected error occurred on the device. The error code was: UNKNOWN_ERROR.
```
I'm not sure what the problem is here. Trying it with smaller sparse arrays works, but I'm still well within the memory limits. Is there some kind of hidden maximum size, or are we simply not allowed to use most of the GPU memory? This array would theoretically take up less than 60% of GPU memory.
Edit: trying smaller arrays and loading several of them into GPU memory:
```matlab
Trial>> gpu = gpuDevice;
Trial>> mem1 = gpu.FreeMemory;
Trial>> gpxe = gpuArray(pxet.');
Trial>> mem2 = gpu.FreeMemory;
Trial>> gpye = gpuArray(pyet.');
Trial>> mem3 = gpu.FreeMemory;
Trial>> gpxi = gpuArray(pxit.');
Trial>> mem4 = gpu.FreeMemory;
Trial>> gpyi = gpuArray(pyit.');
Trial>> mem5 = gpu.FreeMemory;
```
Sizes of these arrays are theoretically:
```matlab
>> whos('pxet','pyet','pxit','pyit')
  Name        Size                 Bytes  Class     Attributes

  pxet      211600x211600       47266024  double    sparse, complex
  pxit      211600x211600       47266024  double    sparse, complex
  pyet      211600x211600       47266024  double    sparse, complex
  pyit      211600x211600       47266024  double    sparse, complex
```
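Applying the same CSC accounting (again, the layout is my assumption) to these smaller matrices, the reported byte count again decomposes exactly, giving the nonzero count per matrix:

```python
# Same CSC accounting applied to the smaller test matrices: each is
# 211600-by-211600 complex double sparse with a reported 47266024 bytes.
n_small = 211600
reported_small = 47266024

col_ptrs = 8 * (n_small + 1)                     # column pointer array
nnz_small = (reported_small - col_ptrs) // 24    # 16-byte complex value + 8-byte row index
print(nnz_small)                                 # -> 1898884 nonzeros per matrix
print(col_ptrs + 24 * nnz_small == reported_small)  # -> True
```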
Sequential memory footprints on the GPU:
```matlab
Trial>> mem1-mem2
ans =
   147456000
Trial>> mem2-mem3
ans =
    39059456
Trial>> mem3-mem4
ans =
    39059456
Trial>> mem4-mem5
ans =
    39059456
```
So the very first transfer grabs a huge chunk of memory, and the subsequent ones take up less space than expected? It seems like I need enough free GPU memory to cover that initial allocation, which is about three times as large as the array itself.
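One way to read those deltas, under a hedged assumption of mine (that the device holds the matrix in CSR form with 32-bit integer indices; the post does not establish this): the repeated ~39 MB delta is close to what one CSR copy would need, which would make the extra cost of the first transfer a fixed one-time allocation of roughly 103 MB rather than every array costing three times its size:

```python
# Rough model of the observed FreeMemory deltas. Assumption (mine): on the
# device the sparse matrix is held in CSR form with 32-bit integer indices,
# i.e. 16 bytes per complex value + 4 bytes per column index + 4*(nrows+1)
# bytes of row pointers.
n = 211600
nnz = 1898884                      # nonzeros per small matrix (implied by the whos bytes)

csr_estimate = 16 * nnz + 4 * nnz + 4 * (n + 1)
print(csr_estimate)                # -> 38824084, close to the measured 39059456

first_delta = 147456000            # mem1 - mem2
steady_delta = 39059456            # mem2 - mem3, mem3 - mem4, mem4 - mem5
print(first_delta - steady_delta)  # -> 108396544 bytes, ~103 MB of one-time overhead
```

The small gap between the estimate and the measured 39059456 bytes would then be allocator padding or alignment, though that part is speculation on my end.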
Best Answer