I am wondering if anyone knows of a toolbox function that can perform this task quickly. I am not familiar with the Stats toolbox, but I have a feeling something could be there. Or maybe even the Image Processing Toolbox?
Here is the basic problem and a naive solution. For each element of a vector, I want a running count of how many times its value has appeared so far, in order of appearance: the first occurrence of a value gets 1, the second gets 2, and so on.
N = 5e5; % Typical _minimum_ size, max N = 5e7.
C = 50; % Best case shown, worst case: C = 1.5;
A = randi(floor(N/C),1,N); % Data looks like this.
% Now to produce our results. Need to find B.
Au = unique(A);
H = histc(A,Au);
B = zeros(size(A));
for ii = 1:length(Au)
    B(A==Au(ii)) = 1:H(ii);
end
As a simple example for clarity, for N=10, C=3:
[A(:) B(:)]
ans =
     3     1
     2     1
     3     2
     3     3
     3     4
     2     2
     3     5
     3     6
     3     7
     1     1
I have written code that reduces the run-time by a large factor with the worst case C and minimum N, but it is a bit convoluted. After doing so I thought there might already be something like this out there, perhaps in a toolbox, but I couldn't find it. Perhaps there is a way to make further improvements for larger N.
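For reference, here is a minimal sketch of one loop-free way to get the same B, using a stable sort and then each element's position within its run of equal values. This is not the convoluted code mentioned above, just an illustration. The variable names (As, idx, isStart, startPos, runId, cnt) are made up for the sketch, and it assumes A is a row vector. I have not timed it at the sizes given above.

```matlab
% Sketch only: running occurrence count via a stable sort.
% Assumes A is a row vector; names below are illustrative.
A = [3 2 3 3 3 2 3 3 3 1];          % the N=10 example from above

[As, idx] = sort(A);                % sort keeps equal elements in original order
isStart   = [true, diff(As) ~= 0];  % true where a new run of equal values begins
startPos  = find(isStart);          % sorted index of the first element of each run
runId     = cumsum(isStart);        % run number of every sorted element
cnt       = (1:numel(As)) - startPos(runId) + 1;  % position within its run

B = zeros(size(A));
B(idx) = cnt;                       % scatter the counts back to original order
```

The cost should be dominated by the sort, so there is no pass over A per unique value as in the loop version. Whether that wins for the worst case C = 1.5 would need to be measured.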
Thanks!
EDIT: I have not found any other solution to this, so I may delete the question…
Best Answer