Hi, I am using CUDA in MATLAB to accelerate the preconditioned conjugate gradient (PCG) solution of "Ax = b". I'm glad to find that pcg without any preconditioner on the GPU runs about 6 to 7 times faster than ichol-preconditioned pcg on the CPU. I would like to know how gpuArray stores a sparse matrix on the GPU: in CSR, ELL, or some other format? I have heard that the storage format influences the speed of the computation, so I would like to compare these formats on my matrix to optimize my code. However, I could not find any option for choosing the format in the gpuArray function. My guess, though I am not sure, is that gpuArray chooses the storage format for the sparse matrix dynamically. Could you give me some suggestions, or a link to documentation on this?
Thank you.
Best Answer