A kernel is called positive definite (p.d) if its Gram matrix is p.d., i.e. all eigenvalues of the Gram matrix are positive for all possible input vectors in the feature space. My understanding of conditionally positive definite (c.p.d) kernels is that the Gram matrix has to be p.d. not for all but only for a subset of the input vectors from the feature space.
If this definition of conditionally positive definiteness is correct then how do I prove that a kernel is c.p.d.? Would it suffice to show that the Gram Matrix is p.d. for some arbitrary subset of input vectors?
Best Answer
If I'm reading it correctly, your definition of cpd would include any kernel where $k(x, x) > 0$ for any point $x$, by just considering the subset $\{ x \}$.
One way to phrase a matrix being positive definite is that $c^T A c > 0$, for all vectors $c$; a kernel is positive definite if that holds for all Gram matrices $A$. The conditioning in cpd refers to reducing the "for all vectors $c$" part, not the "for all Gram matrices $A$" part. Specifically, the most common definition is reducing it to "for all vectors $c$ with $1^T c = 0$.
These are useful because, just as pd kernels correspond to inner products in some Hilbert space, cpd kernels are closely connected to distances in some Hilbert space. SVMs and kernel PCA both work with any cpd kernel, though a particular optimization algorithm might not.
Here's the full definition from Schölkopf and Smola's Learning with Kernels (2002). Here $\mathbb K$ refers to either $\mathbb R$ or $\mathbb C$.
I don't know a simple eigenvalue-based definition.
To prove a kernel is cpd, you can either prove it directly from the definition above (usually difficult), or you can try to build it out of other cpd kernels using various properties that preserve cpd-ness. Here are some useful connections to pd kernels from that book:
[42] is Berg, Christensen, and Ressel's Harmonic Analysis on Semigroups (1984).
[465] is the classic paper Metric spaces and positive definite functions of Schoenberg (Transactions of the American Mathematical Society 44:522-536, 1938).