MATLAB: Alternative matrix multiplication which does Euclidean distance instead of dot product

matrix multiplication, euclidean distance

Of course, a great plus of MATLAB is being able to do a matrix multiplication without writing three loops, and with plenty of syntactic sugar (imagine having to write C = A.mult(B) every time). Moreover, it is fast, because the interpreter only has to read three symbols (namely A*B) and hand the work off to assembly or whatever low-level code MATLAB uses underneath.
However, what I often have is a set of data points of several dimensions, for example 3, and another set of data points, again with 3 dimensions, and I want to compare each data point of one set to each data point of the other using a distance measure. If my distance measure is the dot product, then of course a matrix product suffices:
A : m x 3 matrix
B : 3 x n matrix
C = A*B, so that C(i,j) is the distance between data point i in set A and data point j in set B.
But what if my distance measure is Euclidean? Then I would like to be able to do the same thing. Wherever the * operation multiplies, it should instead subtract and square, and wherever it sums, it should still sum, so that C = A*B yields a similar matrix as above, but with C(i,j) = sqrt( sum_k (A(i,k) - B(k,j))^2 ), the Euclidean distance, instead of the dot product.
Of course, I can simulate this by writing a function, maybe with one for-loop and some matrix magic. However, I want (1) the syntactic sugar and (2) a fast C/assembly implementation. Moreover, I think this use case is quite realistic: I often want to find, for each x out of M data points, which point of some reference database of N data points lies closest to x (in Euclidean distance), for example in k-means, when you decide to which centroid each data point belongs.
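To be concrete, here is a rough sketch of the loop-free workaround I have in mind today, with the points of A in rows and the points of B in columns as above (the sizes and variable names are just illustrative):
A = rand(500,3);                               % 500 data points, one per row
B = rand(3,10);                                % 10 reference points (e.g. centroids), one per column
% C(i,j) = Euclidean distance between A(i,:) and B(:,j): subtract, square, sum, sqrt
C = sqrt(sum(bsxfun(@minus, permute(A,[1 3 2]), permute(B,[3 2 1])).^2, 3));
[~, nearest] = min(C, [], 2);                  % index of the closest reference point for each row of A
It works, but it is neither as readable nor, presumably, as fast as a dedicated built-in operation would be.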
So, does this exist? Would this be a nice feature? Or am I the only one who sees this as a 'nice to have'? Should I write a C file for this and MEX it?

Best Answer

You can get something like what you're after using this, together with my INTERDISTS utility below. For example
>> A=rand(2,3), B=rand(3,2),
A =
0.9448 0.4893 0.9001
0.4909 0.3377 0.3692
B =
0.1112 0.2417
0.7803 0.4039
0.3897 0.0965
>> Ops.mtimes=@(a,b)interdists(a.',b);
>> A=DataObj(A,'Ops',Ops); B=DataObj(B,'Ops',Ops);
>> Euclidean = A*B
Euclidean =
1.0198 1.0712
0.5834 0.3753
My overall experience, though, is that syntactic sugar like this is rarely worth it. Better just to use the interdists function directly.
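For instance, the nearest-centroid assignment from the question might look like this with a direct call (points stored one per column, as the help text below describes; the sizes are just an example):
X = rand(3,500);                    % 500 data points in 3D, one per column
C = rand(3,10);                     % 10 candidate centroids, one per column
D = interdists(X,C);                % 500 x 10 matrix of Euclidean distances
[~, labels] = min(D, [], 2);        % index of the nearest centroid for each data point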
function Graph=interdists(A,B)
% Finds the graph of Euclidean distances between sets of point coordinates
%
% (1) Graph=interdists(A,B)
%
% in:
%
% A: matrix whose columns are coordinates of points, for example
%    [[x1;y1;z1], [x2;y2;z2], ..., [xM;yM;zM]],
%    but the columns may be points in a space of any dimension, not just 3D.
%
% B: A second matrix whose columns are coordinates of points in the same
% Euclidean space. Default B=A.
%
%
% out:
%
% Graph: The MxN matrix of separation distances in l2 norm between the coordinates.
% Namely, Graph(i,j) will be the distance between A(:,i) and B(:,j).
%
%
% (2) interdists(A,'noself') is the same as interdists(A), except that the diagonal
% of the output will be NaN instead of zero. Hence, for example, operations
% like min(interdists(A,'noself')) will ignore self-distances.
%
% See also getgraph
noself=false;
if nargin<2
    B=A;                                       % default: distances within A itself
elseif ischar(B)&&strcmpi(B,'noself')
    noself=true;
    B=A;
end
N=size(A,1);                                   % dimension of the space
B=reshape(B,N,1,[]);                           % N x 1 x (number of points in B)
Graph=sqrt(sum(bsxfun(@minus,A,B).^2,1));      % Euclidean norms along the coordinate dimension
Graph=squeeze(Graph);                          % M x N matrix of distances
if noself
    n=length(Graph);
    Graph(1:n+1:n^2)=nan;                      % set the diagonal (self-distances) to NaN
end
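As a quick sanity check, assuming the Statistics Toolbox is available, the output should agree with pdist2, which uses a rows-as-points convention instead:
A = rand(3,5); B = rand(3,7);       % 5 and 7 points in 3D, one per column
D1 = interdists(A,B);               % 5 x 7, columns-as-points convention
D2 = pdist2(A.',B.');               % same distances, rows-as-points convention
max(abs(D1(:)-D2(:)))               % should be at rounding-error level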