Your approach is along the lines of the popular histogram of oriented gradients (HOG) approach. See here and the corresponding Wikipedia entry. Now, unless you already have some labelled data, training such a system is quite laborious. If possible, I would start by experimenting with an available implementation, such as the one offered by scikit-image.
There are other features, such as Local Binary Patterns (LBP), but they are not as powerful as HOG. See the corresponding module of scikit-image for a list of features and their implementations.
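To make the core HOG idea concrete, here is a minimal NumPy sketch (the function name and parameters are my own, purely illustrative; for real work use scikit-image's implementation). It histograms gradient orientations, weighted by gradient magnitude:

```python
import numpy as np

def gradient_orientation_histogram(img, n_bins=9):
    """Toy HOG-like descriptor: an n_bins histogram of unsigned gradient
    orientations (0..180 degrees), weighted by gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))  # gradients along rows, columns
    mag = np.hypot(gx, gy)                   # gradient magnitude per pixel
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 180), weights=mag)
    # Normalize so the descriptor is invariant to overall contrast
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# A vertical step edge produces mostly horizontal gradients,
# so the histogram mass concentrates in the first orientation bin.
img = np.zeros((16, 16))
img[:, 8:] = 1.0
h = gradient_orientation_histogram(img)
```

Note this is only the heart of the idea: the full HOG descriptor computes such histograms over small cells and then normalizes them over overlapping blocks, which is what gives it robustness to local illumination changes.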
As for CNNs, you should not need to extract any features: the system learns the features automatically. That is one of the nice properties of deep architectures. A huge number of papers show that these systems learn edge-oriented filter features (along the same lines as the idea you are considering).
Note that these features do not consider color, which may be an interesting feature for you to add. Alternatively, you could extract the features from each color channel separately.
Hope this helps.
This is similar to the difference between Pearson correlation and cosine similarity.
As explained here, for example, the Pearson correlation is the cosine similarity between two demeaned vectors. So the normalized cross-correlation that you show is related to the Pearson correlation, while your proposal is related to the more general cosine similarity.
The advantage of demeaning is that it removes the influence of overall levels. To illustrate with a simple example, generate two (ideally) uncorrelated vectors from a standard normal distribution (mean 0, standard deviation 1) in R.
set.seed(101)
f0 <- rnorm(100)
t0 <- rnorm(100)
Define a function to compute the cosine similarity (no demeaning), and compare it against the Pearson correlation (the cor() function):
cossim <- function(x, y) sum(x*y) / sqrt(sum(x^2) * sum(y^2))
cor(f0,t0)
# [1] 0.1078112
cossim(f0,t0)
# [1] 0.1093093
These aren't exactly 0, due to random sampling.
Now just add 4 units to both of these poorly correlated vectors (by either measure) and see what happens.
f4 <- f0 + 4
t4 <- t0 + 4
cor(f4,t4)
# [1] 0.1078112
cossim(f4,t4)
# [1] 0.9499962
Demeaning keeps the Pearson correlation at its original value despite the shift in overall levels, but without demeaning you now find an almost perfect cosine similarity.
Best Answer
For two vectors $v_i$ and $v_j$ of length $n$:

1. When the two vectors are normalized to zero mean and unit length ($v \leftarrow \frac{v-\bar{v}}{||v-\bar{v}||_2}$), their Pearson correlation coefficient $r (= corr(v_i, v_j))$ relates to their Euclidean distance $d (= ||v_i-v_j||_2)$ by $r = 1 - d^2/2$, or equivalently $r = v_i^T v_j$. Reference: http://t.cn/RL5JcKt.

2. When the two vectors are normalized to zero mean and unit standard deviation (or unit variance) ($v \leftarrow \frac{v-\bar{v}}{std(v-\bar{v})}$), then $r = 1 - d^2/(2(n-1))$, or equivalently $r = v_i^T v_j/(n-1)$.
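Both identities are easy to verify numerically. A quick NumPy sketch (the random vectors and seed are my own, arbitrary choices; note that `ddof=1` gives the sample standard deviation with the $n-1$ denominator that the second identity assumes):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
vi = rng.standard_normal(n)
vj = rng.standard_normal(n)
r = np.corrcoef(vi, vj)[0, 1]  # Pearson correlation coefficient

# Case 1: normalize to zero mean and unit length
a = vi - vi.mean(); a /= np.linalg.norm(a)
b = vj - vj.mean(); b /= np.linalg.norm(b)
d1 = np.linalg.norm(a - b)
# r = 1 - d^2/2 and r = v_i^T v_j
check1 = (1 - d1**2 / 2, float(a @ b))

# Case 2: normalize to zero mean and unit sample standard deviation
c = (vi - vi.mean()) / vi.std(ddof=1)
e = (vj - vj.mean()) / vj.std(ddof=1)
d2 = np.linalg.norm(c - e)
# r = 1 - d^2/(2(n-1)) and r = v_i^T v_j / (n-1)
check2 = (1 - d2**2 / (2 * (n - 1)), float(c @ e) / (n - 1))
```

All four quantities in `check1` and `check2` agree with `r` up to floating-point error.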