MATLAB: Efficient method for finding index of closest value in very large array for a very large amount of examples

arrayfindspeed

I have two very large one-dimensional arrays: 'aRef', which has around 11,000,000 elements, and 'aTest', which has around 10,000,000 elements. For every element of 'aTest', I need to find the index of the closest element in 'aRef'. 'aRef' is sorted, and 'aTest' can be sorted if that will help performance.
Method 1: Returns an out-of-memory error because the arrays are far too large
diff = abs(bsxfun(@minus,aRef,aTest'));
[~, I] = min(diff);
Method 2: Takes around 0.03 seconds per iteration (but varies greatly) and therefore around 300000 seconds in total
for k = 1:n
    diff = abs(aRef - aTest(k));
    [~, I(k)] = min(diff);
end
Method 3: Takes around 0.013 seconds per iteration and therefore 130000 seconds in total
for k = 1:n
    i_lower = find(aRef <= aTest(k), 1, 'last');
    i_higher = find(aRef >= aTest(k), 1, 'first');
end
Is there a more efficient method for this that won't exhaust the memory or take so long to run?
Thanks for your help.

Best Answer

Note: Using diff as a variable name is not a good idea as it shadows the very useful diff function. Also, for method 2, your code does not show the preallocation of I. If you don't preallocate I, it will seriously slow down the code.
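To illustrate those two points, here is a minimal sketch of method 2 with I preallocated and the temporary renamed. The small aRef and aTest values and the name absDiff are made up for the example; they are not from the original post.

```matlab
% Hypothetical small inputs standing in for the real arrays.
aRef  = [1 4 9 16 25];   % sorted reference values (row vector)
aTest = [0 5 12 30];     % values to look up

n = numel(aTest);
I = zeros(1, n);         % preallocate so I is not regrown on every iteration
for k = 1:n
    absDiff = abs(aRef - aTest(k));   % renamed so the built-in diff is not shadowed
    [~, I(k)] = min(absDiff);         % index of the closest reference value
end
disp(I)
```

This only removes the overhead of growing I and the name clash; each iteration still scans all of aRef, so it stays slow for arrays with millions of elements.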
Anyway, for two vectors of around 10,000 elements, the following is around 200 times faster than your method 1 on my machine.
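% assumes aRef is a sorted row vector; the inner edges are the midpoints between neighbouring elements
edges = [-Inf, mean([aRef(2:end); aRef(1:end-1)]), +Inf];
I = discretize(aTest, edges); % I(k) is the index of the element of aRef closest to aTest(k)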
Basically, it constructs an edge vector with an edge halfway between each pair of neighbouring elements of aRef, and uses MATLAB's histogram functions to get the index of the bin each element of aTest falls in. Since bin k surrounds aRef(k), the bin index is the index of the closest element. discretize is new in R2015a. On R2014b, you can use the third return value of histcounts. On even older versions, use the second return value of histc (although histc behaves slightly differently with regard to the last bin).
% R2014b
[~, ~, I] = histcounts(aTest, edges); % probably slower than discretize
% before R2014b
[~, I] = histc(aTest, edges); % I has the same size as aTest; only the counts output (first return value) has an extra element for the +Inf edge, so nothing needs trimming
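As a sanity check, the midpoint-edges idea can be tried on a tiny example and compared against a brute-force search. This is only a sketch: the input values and the names mid and Icheck are invented for illustration, and it assumes aRef is a sorted row vector with distinct values.

```matlab
% Hypothetical small inputs; aRef is a sorted row vector with distinct values.
aRef  = [1 4 9 16 25];
aTest = [0 5 12 30];

% Midpoints between neighbouring reference values: [2.5 6.5 12.5 20.5]
mid   = (aRef(1:end-1) + aRef(2:end)) / 2;
edges = [-Inf, mid, Inf];

I = discretize(aTest, edges);   % requires R2015a or later

% Brute-force check, feasible only because the example is tiny.
Icheck = zeros(size(aTest));
for k = 1:numel(aTest)
    [~, Icheck(k)] = min(abs(aRef - aTest(k)));
end
disp(isequal(I, Icheck))
```

One difference to be aware of: a test value lying exactly on a midpoint (6.5 here, for instance) falls in the upper bin, because each bin includes its left edge, whereas min returns the first, lower index when there is a tie.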