MATLAB: Sparse( ix, jx, sx, rx, cx ) implausibly slow

sparse

My understanding was that the five-argument sparse call was designed to be a fast way of creating sparse matrices. But in some Kronecker product code I'm using (full code at the bottom, code indirectly derived from here), the profiler consistently reports it as the bottleneck.

Bizarrely, even doing sortrows( [ix,jx,sx] ) first does not speed up the call to sparse, despite the fact that this is basically the internal storage representation for sparse matrices used by Matlab.

Do you have any suggestions for speeding up the call to sparse?

Thanks in advance,

Tom

----------------------------------------

Suggested test:

 n=300;A=sprandn(n,n,0.1);B=sprandn(n,n,0.1);
 profile on;X=spkron(A,B);profile off;profile viewer

Required functions:

 function X = spkron( A, B )
    global spkron_use_mex
    [I, J] = size(A);
    [K, L] = size(B);
    [ia,ja,sa] = find( A );
    [ib,jb,sb] = find( B );
    a = double( [ia,ja,sa] );
    b = double( [ib,jb,sb] );
    if isempty( spkron_use_mex )
        [ ix, jx, sx ] = spkron_internal( K,a, L,b );
    else
        [ ix, jx, sx ] = spkron_internal_mex_mex( int32(K),a, int32(L),b );
    end
    X = sparse( ix, jx, sx, I*K, J*L );
 end
 function [ ix, jx, sx ] = spkron_internal( K,a, L,b )
    % derived from alt_kron.m
    ma = max( abs( a(:,3) ) ) * eps;
    mb = max( abs( b(:,3) ) ) * eps;
    a( abs(a(:,3))<mb, : ) = [];
    b( abs(b(:,3))<ma, : ) = [];
    ix = bsxfun(@plus,b(:,1),K*(a(:,1)-1).');
    jx = bsxfun(@plus,b(:,2),L*(a(:,2)-1).');
    sx = bsxfun(@times,b(:,3),a(:,3).');
    sx( abs( sx ) < eps ) = 0;
 end

Best Answer

Using 5 input arguments with the sparse command is indeed an efficient way to construct a sparse matrix. However, you most likely will not be able to speed up the call to sparse, because no matter how the matrix is constructed there is some overhead cost in storing it.

Specifically, MATLAB uses an efficient storage format for sparse matrices. Rather than store the column indices of the non-zero elements, sparse matrices instead store a vector that holds, for each column, the total number of nonzero elements in all preceding columns. The purpose of this is to reduce the overall storage, as in most cases the total number of non-zeros is greater than the number of columns. However, it also means that sparse has to compute the entries of this vector, which introduces overhead.
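To make that per-column vector concrete, here is a minimal sketch on a made-up 3x3 example. The names T, counts and jc are illustrative only (they are not MATLAB's internal names), and jc is written 0-based to match the description above. It sorts the triplets into column-major order and then builds the running count that sparse has to compute:

```matlab
% Hypothetical 3x3 example; the names T, counts and jc are illustrative only.
ix = [3; 1; 1; 2; 3];       % row indices, deliberately unsorted
jx = [1; 3; 1; 2; 3];       % column indices
sx = [20; 40; 10; 30; 50];  % values

% Sort by column first, then by row within each column.
T = sortrows([jx, ix, sx]);

% Count the nonzeros in each column, then accumulate the counts.
nCols  = 3;
counts = accumarray(T(:,1), 1, [nCols 1]);  % nonzeros per column
jc     = [0; cumsum(counts)];               % nonzeros in all preceding columns

% Column k occupies entries jc(k)+1 : jc(k+1) of the sorted triplets.
X = sparse(ix, jx, sx, 3, 3);
[r, c, v] = find(X);        % find returns entries in the same column-major order
```

Note that the ordering here is column first, then row, whereas sortrows( [ix,jx,sx] ) orders by row first.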

The suggested test for your code demonstrates quite well why this system is beneficial. When I run the test, I produce the following sparse matrix X:

>> whos X
  Name          Size                    Bytes  Class     Attributes
    X         90000x90000            1175152232  double    sparse

The memory used by X is given by the formula

>> int32(2*8*nnz(X) + (size(X,2)+1)*8)
ans =
  1175152232

The first term is the cost of storing the values of the entries and their row indices, and the second is the cost of storing the column data. Compare this with the cost of storing both the column and row indices:

>> int32(3*8*nnz(X))
ans =
  1761648336

So MATLAB's format saves 586 MB of memory, but it comes at the cost of a more time-consuming initialization.

Related Solutions

MATLAB: Why this matrix is sparse

You can define anything you want to be sparse if you so desire. So sparse(ones(1000)) will produce a sparse matrix. ;-)

But seriously, this matrix really is reasonably sparse. Just read what is shown on the page you direct to!

We see that it is shown to be an 1899x1899 square matrix. There are 20296 non-zeros in the matrix.

20296/1899
ans =
       10.688

So on average, roughly 10.7 non-zeros per row of a matrix, where each row has 1899 elements.

20296/1899/1899
ans =
    0.0056281

Total density of around 0.6% non-zeros. Is that matrix sparse? Well, yes, it is. Not massively so, but it is sparse. Since I don't have that matrix, I'll do a little memory comparison on a random one with the same density.

As = sprand(1899,1899,0.0056);
Af = full(As);
whos As Af
  Name         Size                 Bytes  Class     Attributes
  Af        1899x1899            28849608  double              
  As        1899x1899              337568  double    sparse

So we see that storing the matrix as sparse gives me a decrease in memory of roughly a factor of 85. (Sparse matrix storage also requires that we store the locations of those elements, so it is not quite as efficient as we might like.)

b = rand(1899,1);
timeit(@() Af*b)
ans =
    0.0023482
    
timeit(@() As*b)
ans =
   6.1858e-05

And a matrix-vector multiply with the sparse matrix does give significant savings, roughly a factor of 38 here.

Much of the time, when you work with a sparse matrix, you will perform a decomposition of some sort. Odds are that will not produce a speed increase here, because the matrix itself is not a tightly banded matrix. I think much of the time, people think of a sparse matrix in the form of something like a tridiagonal matrix. A matrix factorization of a tridiagonal matrix will produce no fill-in at all. On the matrix you show, if we tried to do a Cholesky or LU factorization, for example, the result would be a virtually completely full triangular matrix.

tic,[L,U] = lu(Af);toc
Elapsed time is 0.205769 seconds.
tic,[L,U] = lu(As);toc
Elapsed time is 3.032499 seconds.

As you see, not all computations on such a matrix will see a gain in speed. The factors end up as nearly full triangular matrices, as the counts below show. So treating that matrix as sparse, if you intend to do a decomposition, will be a time loss, because the overhead of the sparse algorithms is too costly.

nnz(L)
ans =
     1177842
nnz(U)
ans =
     1208969
numel(L)
ans =
     3606201

That is to be expected on this matrix. It does not mean the original was not sparse. It is indeed sparse. Just perhaps not as massively sparse as you may want a matrix to be. Sometimes sparse must be measured by the eye of the beholder, in proper context.

spy(As)
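
For contrast, the tridiagonal case mentioned above can be checked directly. This is only a sketch: the test matrix T (symmetric, diagonally dominant, so positive definite) and the size n are my own choices, not anything from the question. The Cholesky factor of a tridiagonal matrix is bidiagonal, so it has exactly as many non-zeros as the upper triangle of T:

```matlab
% Hypothetical test matrix: symmetric, diagonally dominant tridiagonal.
n = 1000;
e = ones(n, 1);
T = spdiags([-e 4*e -e], -1:1, n, n);

% Sparse Cholesky factorization, T = R'*R with R upper triangular.
R = chol(T);

% No fill-in: R has only a main diagonal and one superdiagonal.
fillIn = nnz(R) - nnz(triu(T));
```

Here fillIn comes out as zero, which is the opposite of what happens with the random matrix As above.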

MATLAB: Tensor product of three matrices

M = kron(A,kron(B,C))

M = kron(kron(A,B),C)
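
The two groupings above give the same result, because the Kronecker product is associative. A quick check, using small made-up matrices (A, B and C here are placeholders, chosen with integer entries so the comparison is exact):

```matlab
% Hypothetical 2x2 inputs with integer entries.
A = [1 2; 3 4];
B = [0 1; 1 0];
C = [2 0; 0 3];

M1 = kron(A, kron(B, C));   % group B and C first
M2 = kron(kron(A, B), C);   % group A and B first

same = isequal(M1, M2);     % both are 8x8 and identical
```

With floating-point inputs the two results may differ by rounding, so a tolerance-based comparison would be the safer check there.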
