Reproducing the content of the answer on Quora, in case you don't have a Quora account.
Question: Why does the RBF (radial basis function) kernel map into infinite dimensional space?
Answer: Consider the polynomial kernel of degree 2 defined by $$k(x, y) =
(x^Ty)^2$$ where $x, y \in \mathbb{R}^2$ and $x = (x_1, x_2), y = (y_1, y_2)$.
Expanding the square, the kernel function can be written as $$k(x, y) =
(x_1y_1 + x_2y_2)^2 = x_{1}^2y_{1}^2 + 2x_1x_2y_1y_2 +
x_{2}^2y_{2}^2$$ Now, let us try to come up with a feature map
$\Phi$ such that the kernel function can be written as
$k(x, y) = \Phi(x)^T\Phi(y)$.
Consider the following feature map, $$\Phi(x) = (x_1^2,
\sqrt{2}x_1x_2, x_2^2)$$ This feature map sends points in $\mathbb{R}^2$ to points in $\mathbb{R}^3$. Also, notice that $$\Phi(x)^T\Phi(y) = x_1^2y_1^2 + 2x_1x_2y_1y_2 + x_2^2y_2^2$$ which is exactly our kernel function.
This means that our kernel function is actually computing the
inner/dot product of points in $\mathbb{R}^3$. That is, it
is implicitly mapping our points from $\mathbb{R}^2$ to
$\mathbb{R}^3$.
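As a quick numeric sanity check of the identity above (a sketch in NumPy; the names `poly_kernel` and `phi` are mine, not from the answer):

```python
import numpy as np

def poly_kernel(x, y):
    # Degree-2 polynomial kernel: k(x, y) = (x^T y)^2.
    return float(x @ y) ** 2

def phi(x):
    # Explicit feature map Phi: R^2 -> R^3 from the derivation above.
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
print(poly_kernel(x, y))   # 1.0
print(phi(x) @ phi(y))     # 1.0 -- same value, computed as a dot product in R^3
```

The kernel evaluates the $\mathbb{R}^3$ inner product without ever forming $\Phi(x)$ explicitly; the explicit map is only built here to verify the equality.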
Exercise Question: If your points are in $\mathbb{R}^n$, a
polynomial kernel of degree 2 will implicitly map them to some
vector space $F$. What is the dimension of this vector space $F$? Hint:
everything I did above is a clue.
Now, coming to the RBF kernel.
Let us consider the RBF kernel again for points in $\mathbb{R}^2$, taking $\gamma = 1$. Then the kernel can be written as
$$k(x, y) = \exp(-\|x - y\|^2) = \exp(-(x_1 - y_1)^2 - (x_2 - y_2)^2)$$
$$= \exp(-x_1^2 + 2x_1y_1 - y_1^2 - x_2^2 + 2x_2y_2 - y_2^2)$$
$$= \exp(-\|x\|^2) \exp(-\|y\|^2) \exp(2x^Ty)$$
Using the Taylor series of the exponential, you can write this as
$$k(x, y) = \exp(-\|x\|^2) \exp(-\|y\|^2) \sum_{n = 0}^{\infty} \frac{(2x^Ty)^n}{n!}$$ Now,
if we were to construct a feature map $\Phi$ just as we did for the
polynomial kernel, we would find that it maps every point in
$\mathbb{R}^2$ to an infinite-dimensional vector. Thus, the RBF kernel
implicitly maps every point to an infinite-dimensional space.
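The truncated Taylor expansion can be checked numerically: summing only finitely many terms already recovers the kernel value to high precision (a sketch; `rbf_via_series` and its `n_terms` cutoff are my own names, not from the answer):

```python
import numpy as np
from math import factorial

def rbf_kernel(x, y, gamma=1.0):
    # Standard RBF kernel: exp(-gamma * ||x - y||^2).
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

def rbf_via_series(x, y, n_terms=20):
    # gamma = 1 RBF kernel via the expansion derived above:
    # exp(-||x||^2) exp(-||y||^2) * sum_n (2 x^T y)^n / n!, truncated.
    prefactor = np.exp(-x @ x) * np.exp(-y @ y)
    s = sum((2 * (x @ y)) ** n / factorial(n) for n in range(n_terms))
    return float(prefactor * s)

x = np.array([0.3, -0.7])
y = np.array([1.1, 0.4])
print(rbf_kernel(x, y))       # direct evaluation
print(rbf_via_series(x, y))   # truncated series -- agrees to many decimals
```

Each power $(2x^Ty)^n$ corresponds to a block of feature-map coordinates (as in the degree-2 case), and since $n$ runs to infinity, the full feature map has infinitely many coordinates.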
Exercise Question: Find the first few components of the feature
map for the RBF kernel in the case above.
Best Answer
Haasdonk and Bahlmann, Learning with distance substitution kernels (2004; doi, pdf) give us the following result, as a subset of their Proposition 1:
(Incidentally, this is what I've usually seen called the generalized RBF kernel, with the squared distance. Hat-tip to this related answer for the pointer.)
Thus if $\sqrt d$ is a pseudosemimetric, the kernel $\exp\left( - \gamma d(x, y) \right)$ is positive definite for all $\gamma > 0$ iff $\sqrt d$ is isometrically embeddable in $L_2$. (Of course, if $d$ is a pseudosemimetric, so is $\sqrt d$.)
Two facts about $L_2$ embeddability:
A metric $d$ is $L_2$-embeddable iff $-d^2$ is conditionally positive definite. (Schoenberg, Metric spaces and positive definite functions, 1938, Trans. Am. Math. Soc. 44.3 pp 522–536; pdf)
$-d^\beta$ is conditionally positive definite for all $\beta \in [0, 2]$ iff $d$ is $L_2$-embeddable. (Haasdonk and Bahlmann, Proposition 1)
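As a numerical illustration of the positive-definiteness claim (a sketch; all names here are mine): take $d(x, y) = \|x - y\|^2$, so that $\sqrt d$ is the ordinary Euclidean metric, which embeds isometrically in $L_2$. The result then predicts that the Gram matrix of $\exp(-\gamma\, d(x, y))$ is positive semidefinite for every $\gamma > 0$, which we can spot-check via its eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # 20 random points in R^3

# d(x, y) = ||x - y||^2, so sqrt(d) is the Euclidean metric and is
# L2-embeddable; exp(-gamma * d) should then be PSD for all gamma > 0.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
for gamma in [0.1, 1.0, 10.0]:
    K = np.exp(-gamma * sq_dists)
    min_eig = np.linalg.eigvalsh(K).min()
    # Small negative values of order machine epsilon are numerical noise.
    print(gamma, min_eig >= -1e-10)
```

This checks positive semidefiniteness only for one sample of points and a few values of $\gamma$, of course; the proposition is the actual guarantee.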
So:
If $d$ is not $L_2$-embeddable, however, fact 2 does not imply that $-d$ is not cpd – only that there is some $\beta$ for which $-d^\beta$ is not cpd. I don't know of a nice condition on $d$ that shows $\sqrt d$ is not $L_2$-embeddable.