MATLAB: How to speed up Bag of Visual Words Calculation

bag of words, feature vectors, MATLAB, sift, speed, Statistics and Machine Learning Toolbox

I am building an image retrieval system: given an input image, it should return the most similar images from the dataset. I have created a bag of visual words (BoVW) from my dataset with 2000 words, using the following process:

1. Extracted SIFT features from the images in the dataset and performed k-means clustering with k = 2000. Then I stored the centroids in a mat file.

2. For an input image, I create its BoVW representation as follows:

indices_f=[];
for i=1:size(features,1) % this loop is taking too much time.
    for j=1:size(centroids,1)
        d(j)=pdist2(features(i,:),centroids(j,:)); % Compute distance from cluster centroids
    end
    [~,indices_f(i)]=min(d); %Compute Minimum Distance
end
sum=1;
feature_hist_local=zeros(1,2000);
temp=indices_f;   
%Compute histogram
for m=1:length(temp)
    feature_hist_local(temp(m))=feature_hist_local(temp(m))+1;
end

In my case, the features matrix contains 3127 keypoints of dimension 128 each, and the code converts it into a BoVW representation of size 1×2000. However, it takes approximately 40 minutes to compute this representation on a machine with 32 GB RAM and 6 cores.

Is there a better way to do this? Also, are there alternative approaches that improve upon bag of visual words, since I plan to work with large datasets now?

Best Answer

Looks like I glanced over the code too quickly and missed the key point, perhaps...

for i=1:size(features,1)
  for j=1:size(centroids,1)
     d(j)=pdist2(features(i,:),centroids(j,:));
     ...

In the above you're calling pdist2 once for every feature/centroid pair, each time on just two 1×128 rows; for 3127 features and 2000 centroids that is over six million function calls. d is also overwritten on every pass of the outer loop, and only the index of its minimum is kept for each i. If you're really looking for the nearest centroid for each feature, you get the same distances at far better performance with

d=pdist2(features,centroids);

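and since you only need the minimum later on, the optional 'Smallest' parameter mentioned in the previous comment is useful. Note that with 'Smallest', pdist2 returns one column for each row of its second argument, so the centroids go first to get one index per feature:

 [d,idxMin]=pdist2(centroids,features,'euclidean','Smallest',1);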
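Putting the pieces together, a minimal sketch of the whole computation might look like the following. The random data, the small sizes, and the name idxFull are placeholders for illustration, not the poster's data; the use of accumarray for the histogram is a suggestion that does not appear in the original code. It assumes the Statistics and Machine Learning Toolbox is available for pdist2.

```matlab
% Stand-in data (assumed sizes; the case above is 3127x128 and 2000x128)
rng(0);
features  = rand(50, 128);
centroids = rand(20, 128);
k = size(centroids, 1);

% Option 1: full distance matrix, then the minimum along each row
d = pdist2(features, centroids);        % 50x20, one row per feature
[~, idxFull] = min(d, [], 2);           % 50x1, nearest centroid per feature

% Option 2: let pdist2 keep only the nearest centroid.
% With 'Smallest', the output has one column per row of the SECOND
% argument and holds row indices into the FIRST, so centroids go first.
[~, idxMin] = pdist2(centroids, features, 'euclidean', 'Smallest', 1);  % 1x50

% Histogram of visual words without a loop:
% count how many features were assigned to each centroid
feature_hist_local = accumarray(idxMin(:), 1, [k 1]).';   % 1xk
```

For the sizes in the question, the full 3127×2000 distance matrix in double precision is about 50 MB, so either option should fit comfortably in memory; Option 2 simply avoids keeping distances that are never used.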
