I am building an image retrieval system: given an input image, it should return the most similar images from the dataset. I have created a bag of visual words (BoVW) vocabulary with 2000 words from my dataset, using the following process:
1. Extracted SIFT features from the images in the dataset and performed k-means clustering with k = 2000. Then I stored the centroids in a mat file.
2. For an input image, I compute its BoVW histogram as follows:
    indices_f = [];
    for i = 1:size(features,1) % this loop is taking too much time.
        for j = 1:size(centroids,1)
            d(j) = pdist2(features(i,:), centroids(j,:)); % Compute distance from cluster centroids
        end
        [~, indices_f(i)] = min(d); % Compute minimum distance
    end
    sum = 1;
    feature_hist_local = zeros(1,2000);
    temp = indices_f; % Compute histogram
    for m = 1:length(temp)
        feature_hist_local(temp(m)) = feature_hist_local(temp(m)) + 1;
    end
In my specific use case, the features matrix contains 3127 keypoints of dimension 128 each. The code converts it into a BoVW representation of size 1×2000, but it takes approximately 40 minutes to do so on a machine with 32 GB RAM and 6 cores.
Is there a faster way to do this? Also, are there alternative approaches that improve upon the bag of visual words approach, since I now plan to work with large datasets?
Best Answer
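One way to speed this up is to replace the two nested loops in the question with a single pdist2 call on the full matrices, followed by a row-wise min and an accumarray count. The following is a minimal sketch rather than a tuned implementation. It assumes features is an N×128 matrix and centroids is a K×128 matrix, as described in the question, and that Euclidean distance (the pdist2 default) is the intended metric. The random matrices are only stand-ins so that the snippet runs on its own.

```matlab
% Stand-in data (assumption): replace with the real SIFT descriptors
% and the centroids loaded from the .mat file.
rng(0);
features  = rand(3127, 128);   % N x 128 descriptors of the query image
centroids = rand(2000, 128);   % K x 128 k-means cluster centres

K = size(centroids, 1);

% All pairwise Euclidean distances in one call: D is N x K.
D = pdist2(features, centroids);

% For each descriptor (row), the index of the nearest centroid.
[~, indices_f] = min(D, [], 2);

% Count how many descriptors fall on each visual word (1 x K histogram).
feature_hist_local = accumarray(indices_f, 1, [K 1])';
```

For these sizes the distance matrix holds 3127 × 2000 doubles, roughly 50 MB, so it fits comfortably in memory. For much larger inputs, the descriptors could be processed in row blocks and the per-block histograms summed.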