Solved – Compute probability from distance-score

Tags: distance, probability, similarities, softmax

I compute Euclidean distances between a point I want to analyze and a set of points I already have. I want to sort those points by decreasing "similarity".

I used to compute a "score" by inverting the distance ($s_i = 1/d_i$) and then use the normalized value $\cfrac{s_i}{\sum_k s_k}$ as a similarity that lies between $0$ and $1$.
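For concreteness, here is a minimal sketch of that normalized inverse-distance score in plain Python. The function name `inverse_distance_scores` and the example distances are illustrative assumptions, not part of the original question, and the sketch assumes every distance is strictly positive, since $1/d$ is undefined at $d = 0$.

```python
def inverse_distance_scores(distances):
    """Return s_i / sum_k s_k with s_i = 1 / d_i (nonnegative, sums to 1)."""
    if any(d <= 0 for d in distances):
        # Assumption: a zero distance (exact match) must be handled by the caller.
        raise ValueError("distances must be strictly positive")
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [s / total for s in inverse]


# Hypothetical distances from the query point to three stored points.
distances = [1.0, 2.0, 4.0]
scores = inverse_distance_scores(distances)

# Indices of the stored points, sorted by decreasing similarity.
ranking = sorted(range(len(distances)), key=lambda i: scores[i], reverse=True)
```

Because every score is divided by the same total, the ranking is the same as simply sorting by increasing distance. The normalization only changes how the scores are spread between $0$ and $1$.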

I have seen that the softmax function can also be used; the difference is that it uses $e^{1/d}$ as the score, giving $\cfrac{e^{1/d_i}}{\sum_k e^{1/d_k}}$.
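The softmax variant could be sketched as follows. The name `softmax_inverse_distance` and the example distances are assumptions made for illustration. Subtracting the largest exponent before exponentiating is a standard stability trick that leaves the result unchanged. It is added here because $e^{1/d}$ overflows quickly for small $d$.

```python
import math


def softmax_inverse_distance(distances):
    """Return exp(1/d_i) / sum_k exp(1/d_k), computed in a numerically stable way."""
    if any(d <= 0 for d in distances):
        # Assumption: distances are strictly positive, as 1/d requires.
        raise ValueError("distances must be strictly positive")
    exponents = [1.0 / d for d in distances]
    largest = max(exponents)
    # Shifting every exponent by the same constant cancels in the ratio.
    shifted = [math.exp(z - largest) for z in exponents]
    total = sum(shifted)
    return [e / total for e in shifted]


# Hypothetical distances from the query point to three stored points.
distances = [1.0, 2.0, 4.0]
softmax_scores = softmax_inverse_distance(distances)
```

As with the normalized inverse, the scores are nonnegative, sum to one, and preserve the ordering by distance.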

Which of the two functions is closer to computing a kind of "probability"? Apologies if I am mixing up terms…

Best Answer

Credits to @Dougal for his comment-answer:

Both give nonnegative scores that sum to one. Neither is particularly anything like a probability. Softmax will decay quickly beyond the closest point, while the sum one will distribute scores more widely.

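The decay part is interesting: $e^{1/d}$ is a lot bigger than $1/d$, so the softmax normalizer $\sum_k e^{1/d_k}$ will be a lot bigger than $\sum_k 1/d_k$ taken over the same points. When the nearest distances are small, the exponential terms of those points dominate the sum, which is what makes the softmax scores fall off so quickly beyond them.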
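To see the decay numerically, the sketch below applies both normalizations to the same hypothetical distances at two scales. All names and values are assumptions chosen for illustration. It also shows a point worth keeping in mind: the normalized inverse depends only on the ratios between distances, whereas the softmax of $1/d$ changes with their absolute scale.

```python
import math


def inverse_normalized(distances):
    """s_i / sum_k s_k with s_i = 1 / d_i."""
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [s / total for s in inverse]


def softmax_of_inverse(distances):
    """exp(1/d_i) / sum_k exp(1/d_k), shifted by the max exponent for stability."""
    exponents = [1.0 / d for d in distances]
    largest = max(exponents)
    shifted = [math.exp(z - largest) for z in exponents]
    total = sum(shifted)
    return [e / total for e in shifted]


# Same distance ratios (1 : 2 : 4) at two hypothetical scales.
small = [0.1, 0.2, 0.4]
large = [10.0, 20.0, 40.0]

inverse_small = inverse_normalized(small)
inverse_large = inverse_normalized(large)
softmax_small = softmax_of_inverse(small)
softmax_large = softmax_of_inverse(large)

for name, inv, soft in (
    ("small", inverse_small, softmax_small),
    ("large", inverse_large, softmax_large),
):
    print(name, [round(v, 4) for v in inv], [round(v, 4) for v in soft])
```

With the small distances, the softmax gives the nearest point about 0.99 of the total, against about 0.57 for the normalized inverse, which matches the quick decay described above. With the large distances the normalized inverse is unchanged, while the softmax flattens to roughly 0.35 for the nearest point, so the sharpness of the softmax depends on the units in which the distances are measured.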
