MATLAB: Faster alternative to find or logical indexing for specific problem

findlinear indexinglogical indexingMATLABspeed

I have a massively iterative algorithm in which this code sits so I am profiling it whenever I add something to the algorithm and currently the find or indexing in this piece of code is taking quite a long time so I am wondering if I am missing some simpler approach given the simple nature of my initial array.
I have a predefined array of contiguous indices from 1, I also have an array of indices from these to ignore (note, mine are not random, they are predefined, but change each iteration) and an array of evaluation values corresponding to each of the original indices
I then choose a random 10% sample out of those indices that are not on the ignore list, find the one giving the best evaluation and then wish to return the original index corresponding to this value. So at the moment I do similar to this (albeit with predefined originalIndices, evaluations, indicesToIgnore – they are random here for testing purposes) :
function chosenLinearIdx = chooseSample( )
originalIndices = 1:300000;
evaluations = rand( size( originalIndices ) );
rng( 77 ) % Just in test code for repeatability
indicesToIgnore = randperm( 300000, 100 );
ignoreLogicalIdx = ismember( originalIndices, indicesToIgnore );
validIdx = find( ~ignoreLogicalIdx ); % Slow!
% validIdx = originalIndices( ~ignoreLogicalIdx ); % Alternative approach, but twice as slow as find
n = numel( validIdx );
k = round( 0.1 * n );
subIndices = ceil( rand( k, 1 ) * n );
[~, bestIdx] = max( evaluations( validIdx( subIndices ) ) );
chosenLinearIdx = validIdx( subIndices( bestIdx ) );
end
It is the calculation of those validIdx in the middle that is taking a long time, but I am not seeing a way to map back to my original indices, having chosen an index from the smaller list, without doing this.
It feels as though there should be some quicker method, but either there isn't or I am just missing it. Obviously there are options of using mex or some other C++ type of optimisation if I really want, but I want to make sure I have the best Matlab implementation before I consider those.
Note, my originalIndices are defined just once and stored in a class so they are not being reallocated every time this function is called, which can be of the order of a million times.
Also my original version of this code was using setdiff, but that was far slower so that is not an option I plan to explore again. I have also found cumsum to glow red on the profiler whenever I use that in this massively iterative code so I didn't bother trying a solution using that, which may still need a find anyway.

Best Answer

In case you go the mex route, here is a routine for you for the validIdx calculation:
/* validIdx_mex.c
*
* B = validIdx_mex(n,A)
*
* Creates a vector of indexes 1:n but excluding those indexes in A.
*
* Programmer: James Tursa
*/
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mwSize i, j, n, m;
double *Bpr, *Apr, *bpr;
/* Argument checks */
if( nrhs != 2 || !mxIsNumeric(prhs[0]) || !mxIsDouble(prhs[1]) ||
mxGetNumberOfElements(prhs[0]) != 1 || mxIsSparse(prhs[1]) ||
mxIsComplex(prhs[0]) || mxIsComplex(prhs[1]) ) {
mexErrMsgTxt("Invalid inputs");
}
n = mxGetScalar(prhs[0]);
if( n <= 0 ) {
mexErrMsgTxt("Invalid 1st argument");
}
m = mxGetNumberOfElements(prhs[1]);
Apr = mxGetPr(prhs[1]);
/* Create the output */
plhs[0] = mxCreateDoubleMatrix( 1, n, mxREAL );
Bpr = mxGetPr(plhs[0]);
/* Fill the output with default indexing */
bpr = Bpr;
for( i=1; i<=n; i++ ) {
*bpr++ = i;
}
/* Put 0's in the ignore spots */
for( i=0; i<m; i++ ) {
j = *Apr;
if( j != *Apr || j < 1 ) {
mexErrMsgTxt("Invalid 2nd argument value");
} else if( j <= n ) {
Bpr[j-1] = 0;
}
Apr++;
}
/* Shift all the non-zero indexes to the front */
bpr = Bpr;
j = 0;
for( i=0; i<n; i++ ) {
if( *bpr ) {
*Bpr++ = *bpr;
j++;
}
bpr++;
}
/* Set the new size based on the number of actual non-zeros */
mxSetN(plhs[0],j);
}
Related Question