Solved – Prove that this kernel is a valid kernel

kernel trick, machine learning

How would you argue or prove that this is a valid kernel:

$$ K_a(x, t) = \prod_{i=1}^{n} (1 + x_it_i + (1-x_i)(1-t_i)). $$

I know that there are two conditions that a kernel must satisfy to be a valid kernel: symmetry and positive-semidefiniteness. Clearly the first condition is satisfied. But I am not sure how to prove the second.

I have re-arranged the expression as:

$$ K_a(x, t) = \prod_{i=1}^{n} \left[ 2\left(x_i - \frac{1}{2}\right)\left(t_i - \frac{1}{2}\right) + \frac{3}{2} \right],$$

but I am unsure if this is of any help. Thanks.

Best Answer

Let's build this kernel up piece by piece, using a handful of properties that let you do that. Either form works for this, but let's use the second one. (A quick numerical sanity check of these properties follows the list.)

  • Scaling: If $k$ is a psd kernel, then so is $\gamma k$ for $\gamma \ge 0$.

  • Sum: If $k_1$ and $k_2$ are psd kernels, then $k_1 + k_2$ is psd.

  • Product: If $k_1$ and $k_2$ are psd kernels, then so is $f(x, y) = k_1(x, y) k_2(x, y)$.

(These are proved e.g. in this answer.)

  • Shift: If $k(x, y)$ is a valid kernel, so is $k_\delta(x, y) = k(x + \delta, y + \delta)$ for any constant $\delta$.

    Proof: if $\varphi(x)$ is the feature map for $k$, then $x \mapsto \varphi(x + \delta)$ is the feature map for $k_\delta$.

  • Projection: if $k(x, y)$ is a valid kernel on $\mathcal X$, then $k'(\vec x, \vec y) = k(x_\ell, y_\ell)$ is a valid kernel on $\mathcal X^d$ for any dimension $\ell \in \{1, \dots, d\}$.

    Proof: For any points $\{\vec x_i \}_{i=1}^N$ and constants $\{a_i\}_{i=1}^N$, $$\sum_{i,j} a_i \, a_j \, k'(\vec x_i, \vec x_j) = \sum_{i,j} a_i \, a_j \, k(x_{i,\ell}, x_{j,\ell}) \ge 0$$ since $k$ is a psd kernel.

  • Constants: $k(x, y) = \gamma$ is psd for any $\gamma \ge 0$, on any input set.

    Proof: $\sum_{i,j} a_i k(x_i, x_j) a_j = \gamma \sum_{i,j} a_i a_j = \gamma \left( \sum_i a_i \right)^2 \ge 0.$
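As a quick numerical sanity check of these closure properties (not a proof, and the helper names below are just illustrative), you can evaluate them on small random Gram matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_gram(n=8, d=3):
    """Gram matrix of the linear kernel on random points; psd by construction."""
    X = rng.normal(size=(n, d))
    return X @ X.T

def min_eig(K):
    """Smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(K).min()

K1, K2 = random_gram(), random_gram()
tol = -1e-10  # allow tiny negative eigenvalues from floating-point error

print(min_eig(3.0 * K1) >= tol)              # scaling: gamma * k for gamma >= 0
print(min_eig(K1 + K2) >= tol)               # sum: k1 + k2
print(min_eig(K1 * K2) >= tol)               # product: entrywise (Schur) product
print(min_eig(np.full((8, 8), 1.5)) >= tol)  # constants: gamma = 3/2
```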

Now:

  • The linear kernel on $\mathbb R$, $(x, t) \mapsto x t$, is psd.
  • By shift, so is $(x, t) \mapsto (x - \frac12) (t - \frac12)$.
  • By projection, so is $(\vec x, \vec t) \mapsto (x_i - \frac12) (t_i - \frac12)$ for each $i$.
  • By scaling, so is $(\vec x, \vec t) \mapsto 2 (x_i - \frac12) (t_i - \frac12)$.
  • By constants and sum, so is $(\vec x, \vec t) \mapsto 2 (x_i - \frac12) (t_i - \frac12) + \frac32$ for each $i$.
  • By repeated application of product, so is $(\vec x, \vec t) \mapsto \prod_{i=1}^n \left[ 2 (x_i - \frac12) (t_i - \frac12) + \frac32 \right]$. Each factor expands to $1 + x_i t_i + (1 - x_i)(1 - t_i)$, so this is exactly $K_a$, and we're done. (An explicit feature map is sketched below.)
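For concreteness, one can also read off an explicit finite-dimensional feature map from the factored form. (This is a supplementary sketch, not part of the argument above.) Each factor is an inner product in $\mathbb R^2$:

$$ 2\left(x_i - \frac12\right)\left(t_i - \frac12\right) + \frac32 = \left\langle \varphi^{(i)}(\vec x),\, \varphi^{(i)}(\vec t) \right\rangle, \qquad \varphi^{(i)}(\vec x) = \left( \sqrt{2}\left(x_i - \frac12\right),\ \sqrt{\tfrac32} \right), $$

and since $\langle u_1 \otimes \cdots \otimes u_n,\, v_1 \otimes \cdots \otimes v_n \rangle = \prod_i \langle u_i, v_i \rangle$, the map $\varphi(\vec x) = \varphi^{(1)}(\vec x) \otimes \cdots \otimes \varphi^{(n)}(\vec x) \in \mathbb R^{2^n}$ satisfies $K_a(\vec x, \vec t) = \langle \varphi(\vec x), \varphi(\vec t) \rangle$.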
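Finally, here is a minimal numerical spot check of the conclusion itself (again a sketch with illustrative names, assuming points drawn from $[0, 1]^n$): the Gram matrix of $K_a$ on random points should be symmetric with no significantly negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)

def K_a(x, t):
    """K_a(x, t) = prod_i (1 + x_i * t_i + (1 - x_i) * (1 - t_i))."""
    return np.prod(1 + x * t + (1 - x) * (1 - t))

n, N = 4, 50
X = rng.uniform(size=(N, n))
G = np.array([[K_a(x, t) for t in X] for x in X])

print(np.allclose(G, G.T))                   # symmetry
print(np.linalg.eigvalsh(G).min() >= -1e-8)  # numerically psd
```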