Is there any well-founded way of calculating the Euclidean distance between two images?

image processing, machine learning, similarities

I need to determine the distance between two images.

Supposing that we are dealing with images of the same size, I think we can reduce this problem to computing the square root of the sum of the squared distances between the pixel $\tt{p1[i,j]}$ of image 1 and the pixel $\tt{p2[i,j]}$ of image 2, with $\tt{i}$ ranging over the height of the images and $\tt{j}$ over their width.

Thus, the underlying problem would be finding the distance between two pixels, and for that I think we can simply compute the Euclidean distance between the RGB components of each pixel.
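
Written out, with $H$ and $W$ standing for the image height and width and $c$ ranging over the three colour channels, I think the combined distance would be:

$$
d(\mathtt{p1}, \mathtt{p2}) \;=\; \sqrt{\sum_{i=0}^{H-1} \sum_{j=0}^{W-1} \sum_{c \in \{R,G,B\}} \bigl(\mathtt{p1}[i,j,c] - \mathtt{p2}[i,j,c]\bigr)^2}
$$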

Does this make sense? Is there a better method?

Best Answer

Consider that when we have two matrices of identical shape, we sometimes take the Frobenius norm of their difference, which is essentially the kind of distance you describe. This would be adequate for black-and-white images, where two indices are enough to specify a pixel.

But many images are colour images. They have three indices: two for height and width, and one for the colour channel. Fortunately, the Euclidean norm for vectors and the Frobenius norm for matrices generalize easily to the square root of the sum of squared differences between corresponding entries of two multidimensional arrays. I've even seen this generalization simply called the Euclidean norm or the Frobenius norm for arrays/tensors. It essentially is the Euclidean norm, up to a composition of pairing functions on the index set of the array.
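
For concreteness, here is a minimal NumPy sketch of that array norm, assuming both images are already loaded as arrays of identical shape (height, width, 3); the function name `image_distance` is just my own label for it:

```python
import numpy as np

def image_distance(img1: np.ndarray, img2: np.ndarray) -> float:
    """Euclidean/Frobenius-style distance between two same-shaped image arrays."""
    if img1.shape != img2.shape:
        raise ValueError("images must have identical shape")
    # cast to float to avoid overflow/wrap-around with uint8 pixel values
    diff = img1.astype(np.float64) - img2.astype(np.float64)
    # square root of the sum of squared entrywise differences,
    # summed over rows, columns, and colour channels
    return float(np.sqrt(np.sum(diff ** 2)))
```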

All the usual properties of a metric still hold, and you could even cluster your images using the resulting distance matrix.
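
As a sketch of that last point, assuming a list of equally sized image arrays and reusing the hypothetical `image_distance` helper above, you could feed the pairwise distances to hierarchical clustering:

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_images(images, n_clusters=2):
    """Cluster images by their pairwise Euclidean/Frobenius distances."""
    n = len(images)
    dist = np.zeros((n, n))
    for a in range(n):
        for b in range(a + 1, n):
            d = image_distance(images[a], images[b])
            dist[a, b] = dist[b, a] = d
    # scipy's linkage expects the condensed (upper-triangular) form
    Z = linkage(squareform(dist), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```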


Note that I am casually ignoring the fact that arrays and tensors are not the same. Tensors are multilinear maps, while arrays don't by themselves have any such algebra. But in either case you can still compute this sort of norm when desired.