MATLAB: Faster alternative to find or logical indexing for specific problem

findlinear indexinglogical indexingMATLABspeed

I have a massively iterative algorithm in which this code sits so I am profiling it whenever I add something to the algorithm and currently the find or indexing in this piece of code is taking quite a long time so I am wondering if I am missing some simpler approach given the simple nature of my initial array.

I have a predefined array of contiguous indices from 1, I also have an array of indices from these to ignore (note, mine are not random, they are predefined, but change each iteration) and an array of evaluation values corresponding to each of the original indices

I then choose a random 10% sample out of those indices that are not on the ignore list, find the one giving the best evaluation and then wish to return the original index corresponding to this value. So at the moment I do similar to this (albeit with predefined originalIndices, evaluations, indicesToIgnore – they are random here for testing purposes) :

function chosenLinearIdx = chooseSample( )
   originalIndices = 1:300000;
   evaluations = rand( size( originalIndices ) );
   rng( 77 ) % Just in test code for repeatability
   indicesToIgnore = randperm( 300000, 100 );
   ignoreLogicalIdx = ismember( originalIndices, indicesToIgnore );
   validIdx = find( ~ignoreLogicalIdx ); % Slow!
   % validIdx = originalIndices( ~ignoreLogicalIdx ); % Alternative approach, but twice as slow as find
   n = numel( validIdx );
   k = round( 0.1 * n );
   subIndices = ceil( rand( k, 1 ) * n );
   [~, bestIdx] = max( evaluations( validIdx( subIndices ) ) );
   chosenLinearIdx = validIdx( subIndices( bestIdx ) );
end

It is the calculation of those validIdx in the middle that is taking a long time, but I am not seeing a way to map back to my original indices, having chosen an index from the smaller list, without doing this.

It feels as though there should be some quicker method, but either there isn't or I am just missing it. Obviously there are options of using mex or some other C++ type of optimisation if I really want, but I want to make sure I have the best Matlab implementation before I consider those.

Note, my originalIndices are defined just once and stored in a class so they are not being reallocated every time this function is called, which can be of the order of a million times.

Also my original version of this code was using setdiff, but that was far slower so that is not an option I plan to explore again. I have also found cumsum to glow red on the profiler whenever I use that in this massively iterative code so I didn't bother trying a solution using that, which may still need a find anyway.

Best Answer

In case you go the mex route, here is a routine for you for the validIdx calculation:

/* validIdx_mex.c
 *
 * B = validIdx_mex(n,A)
 *
 * Creates a vector of indexes 1:n but excluding those indexes in A.
 *
 * Programmer: James Tursa
 */
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize i, j, n, m;
    double *Bpr, *Apr, *bpr;
/* Argument checks */
    if( nrhs != 2 || !mxIsNumeric(prhs[0]) || !mxIsDouble(prhs[1]) ||
        mxGetNumberOfElements(prhs[0]) != 1 || mxIsSparse(prhs[1]) ||
        mxIsComplex(prhs[0]) || mxIsComplex(prhs[1]) ) {
        mexErrMsgTxt("Invalid inputs");
    }
    n = mxGetScalar(prhs[0]);
    if( n <= 0 ) {
        mexErrMsgTxt("Invalid 1st argument");
    }
    m = mxGetNumberOfElements(prhs[1]);
    Apr = mxGetPr(prhs[1]);
/* Create the output */
    plhs[0] = mxCreateDoubleMatrix( 1, n, mxREAL );
    Bpr = mxGetPr(plhs[0]);
/* Fill the output with default indexing */
    bpr = Bpr;
    for( i=1; i<=n; i++ ) {
        *bpr++ = i;
    }
/* Put 0's in the ignore spots */
    for( i=0; i<m; i++ ) {
        j = *Apr;
        if( j != *Apr || j < 1 ) {
            mexErrMsgTxt("Invalid 2nd argument value");
        } else if( j <= n ) {
            Bpr[j-1] = 0;
        }
        Apr++;
    }
/* Shift all the non-zero indexes to the front */
    bpr = Bpr;
    j = 0;
    for( i=0; i<n; i++ ) {
        if( *bpr ) {
            *Bpr++ = *bpr;
            j++;
        }
        bpr++;
    }
/* Set the new size based on the number of actual non-zeros */
    mxSetN(plhs[0],j);
}

Related Solutions

MATLAB: Building Mex file with Fortan

Your main problem is that you need to create the output array PLHS(1) before you access it. Also your NLHS and NRHS types are wrong. Try this:

#include <fintrf.h>
C
C Gateway routine
C
    SUBROUTINE MEXFUNCTION(NLHS, PLHS, NRHS, PRHS)
    implicit none
C mexFunction arguments:
    MWPOINTER PLHS(*), PRHS(*)
    INTEGER NLHS, NRHS
    mwPointer, external :: mxGetPr, mxGetPi
    mwPointer, external :: mxCreateDoubleMatrix, mxCreateNumericArray
    mwSize, external :: mxGetM, mxGetN
    MWPOINTER YPP 
    mwSize M, N
    integer*4 ComplexFlag
    REAL *8 RYPP(1)
    RYPP(1) = 10
    M = 1
    N = 1
    ComplexFlag = 0
    PLHS(1) = mxCreateDoubleMatrix( M, N, ComplexFlag )
    YPP = MXGETPR(PLHS(1))
    CALL MXCOPYREAL8TOPTR(RYPP, YPP, M*N)
    RETURN
    END

MATLAB: A sparse matrix vector multiplication in C mex programming

Here is the bare bones C code for this. Most of the code consists of argument checking etc. The actual computation part of the code is only a few lines and is fairly trivial. It simply accumulates the individual element multiplies into the appropriate resulting element. The j for-loop loops over the columns of M, and the while loop does the individual element multiplies within each column.

/* Bare bones sparse matrix * full vector                               */
/* Produces a full vector result                                        */
/* Assumes all input elements are finite                                */
/* E.g., does not account for NaN or 0*inf etc.                         */
/* Programmer:  James Tursa                                             */
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize m, n, j, nrow;
    double *Mpr, *Vpr, *Cpr;
    mwIndex *Mir, *Mjc;
/* Argument checks */
    if( nlhs > 1 ) {
        mexErrMsgTxt("Too many outputs");
    }
    if( nrhs != 2 ) {
        mexErrMsgTxt("Need exactly two inputs");
    }
    if( !mxIsDouble(prhs[0]) || !mxIsSparse(prhs[0]) || mxIsComplex(prhs[0]) ) {
        mexErrMsgTxt("1st argument must be real double sparse matrix");
    }
    if( !mxIsDouble(prhs[1]) || mxIsSparse(prhs[1]) || mxIsComplex(prhs[1]) ||
        mxGetNumberOfDimensions(prhs[1])!=2 || mxGetN(prhs[1])!=1 ) {
        mexErrMsgTxt("2nd argument must be real double full column vector");
    }
    m = mxGetM(prhs[0]);
    n = mxGetN(prhs[0]);
    if( mxGetM(prhs[1])!=n ) {
        mexErrMsgTxt("Inner matrix dimensions must agree.");
    }
/* Create output */
    plhs[0] = mxCreateDoubleMatrix( m, 1, mxREAL );
/* Get data pointers */
    Mpr = mxGetPr(prhs[0]);
    Vpr = mxGetPr(prhs[1]);
    Cpr = mxGetPr(plhs[0]);
    Mir = mxGetIr(prhs[0]);
    Mjc = mxGetJc(prhs[0]);
/* Calculate result */
    for( j=0; j<n; j++ ) {
        nrow = Mjc[j+1] - Mjc[j];  /* Number of row elements for this column */
        while( nrow-- ) {
            Cpr[*Mir++] += *Mpr++ * Vpr[j];  /* Accumulate contribution of Vpr[j] */
        }
    }
}

Best Answer

Related Solutions

MATLAB: Building Mex file with Fortan

MATLAB: A sparse matrix vector multiplication in C mex programming

Related Question