MATLAB: How does `gpuArray` store a sparse matrix when running preconditioned conjugate gradient?

gpuarray, MATLAB, pcg, sparse matrix

Hi, I am using CUDA in MATLAB to accelerate the preconditioned conjugate gradient solution of "Ax = b". I was glad to find that pcg without any preconditioner on the GPU runs about 6 to 7 times faster than ichol-preconditioned pcg on the CPU. I would like to know how gpuArray stores the sparse matrix on the GPU: in CSR, ELL, or some other format? I have heard that the storage format influences the computation speed, so I would like to compare these formats on my matrix to optimize my code. I found no option for setting the format in the gpuArray function. My tentative guess is that gpuArray may allocate the sparse matrix dynamically. Could you give some suggestions or a documentation link on this?
Thank you.

Best Answer

gpuArray currently stores sparse matrices internally in CSR format. This matches the NVIDIA cuSPARSE routines that are used for basic algebra.
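To make the difference between the two layouts concrete, here is a small sketch that builds the CSC and CSR arrays of a toy matrix on the CPU. It is only an illustration of the formats: MATLAB does not expose the arrays held on the device, the matrix and all variable names are made up for the example, and the index arrays are MATLAB's 1-based indices with the pointer arrays written as running offsets.

```matlab
% Toy 4-by-4 sparse matrix (hypothetical example data).
A = sparse([5 0 0 1; 0 2 0 0; 0 0 0 0; 3 0 4 0]);
[m, n] = size(A);

% CSC view (the CPU layout): find() walks the nonzeros column by column.
[cscRow, cscCol, cscVal] = find(A);
cscColPtr = [0; cumsum(accumarray(cscCol, 1, [n 1]))];   % offsets per column

% CSR view: the CSR arrays of A are the CSC arrays of A.',
% so transpose and walk the nonzeros the same way.
[csrCol, csrRow, csrVal] = find(A.');
csrRowPtr = [0; cumsum(accumarray(csrRow, 1, [m 1]))];   % offsets per row

% CSC: values 5 3 2 4 1, row indices 1 4 2 4 1, column pointers 0 2 3 4 5
% CSR: values 5 1 2 3 4, column indices 1 4 2 1 3, row pointers 0 2 3 3 5
disp([csrVal.'; csrCol.']);
disp(csrRowPtr.');
```

The same nonzeros appear in a different order in the two layouts, which is why moving between them means reordering the data rather than just copying it.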
I don't know quite what you mean by dynamic allocation. All MATLAB variables are allocated dynamically in some sense, because they are not defined before the application is run. However, MATLAB uses a variety of pooling techniques to ensure that actual dynamic allocations (such as calls to cudaMalloc) happen as infrequently as possible. If you notice delays when data is copied to the device, the conversion between CSC (the CPU storage format) and CSR is sometimes responsible.
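If you want to check whether the copy to the device matters for your case, one rough approach is to time the transfer separately from the solve. The sketch below assumes Parallel Computing Toolbox and a supported GPU; the Poisson test matrix, tolerance, and iteration limit are placeholders for your own problem, and tic/toc timings like these are only indicative.

```matlab
% Placeholder problem: sparse SPD matrix from the 2-D Poisson stencil.
A = gallery('poisson', 300);        % 90000-by-90000, sparse
b = ones(size(A, 1), 1);

dev = gpuDevice;                    % selects the GPU so start-up cost is not timed

% Time the host-to-device copy, which includes any format conversion.
tic;
Ag = gpuArray(A);
bg = gpuArray(b);
wait(dev);                          % block until the GPU has finished
tTransfer = toc;

% Time the unpreconditioned solve on the GPU.
tic;
[xg, flag, relres, iter] = pcg(Ag, bg, 1e-6, 2000);
wait(dev);
tSolve = toc;

fprintf('transfer %.3f s, solve %.3f s (%d iterations, flag %d)\n', ...
        tTransfer, tSolve, iter, flag);
```

If the matrix is copied once and reused across many solves, the transfer time is paid only once, so it usually matters most when the matrix changes between solves.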