I'm aware that a Gaussian process is equivalent to Bayesian linear regression for the kernel $K(x_i, x_j) = x_i x_j$ (assume scalar $x$ here). However, the proof itself didn't give me much intuition.
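(For concreteness, the proof I mean is the one-line covariance computation, assuming the standard setup with no intercept and a zero-mean, unit-variance prior on the slope: if $f(x) = wx$ with $w \sim \mathcal{N}(0, 1)$, then
$$\operatorname{Cov}\big(f(x_i), f(x_j)\big) = x_i x_j \operatorname{Var}(w) = x_i x_j,$$
which is exactly the kernel above.)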
If I imagine sampling a function from the linear GP as sampling one point from an uncountable-dimensional Gaussian RV with covariance matrix $K$ (I'm aware this isn't mathematically rigorous, but bear with me), it is very unintuitive to me why all the points should lie on a line, i.e. why the function should be linear.
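For concreteness, here is a minimal finite-dimensional sketch of what I mean, with a small grid standing in for "all the reals" (the grid, the jitter, and the seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 7)                # finite stand-in for the index set
K = np.outer(x, x)                       # K[i, j] = x_i * x_j
jitter = 1e-12 * np.eye(len(x))          # K is singular, so regularize slightly
f = rng.multivariate_normal(np.zeros(len(x)), K + jitter, size=3)

# every sampled "function" is a line through the origin: f = w * x
for sample in f:
    w = sample[-1] / x[-1]               # recover the slope from one point
    print(np.allclose(sample, w * x, atol=1e-4))
```

Every sample does come out as a line through the origin, which is exactly what I can't square with the reasoning below.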
All I know about this "covariance matrix" $K = xx^T$ (where $x$ is a vector containing all the real numbers) is that it is symmetric and has rank 1. I should be able to diagonalize it as $K = Q^T\Lambda Q$ with all the eigenvalues on the diagonal of $\Lambda$. Since the rank is 1, it should have exactly one non-zero eigenvalue, which I can force into the top-left entry simply by permuting rows/columns of $Q$ and $\Lambda$. So now I can imagine sampling with covariance $\Lambda$, and then applying the rotation $Q^T$.
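Again on a finite grid, this decomposition is easy to check numerically (note that numpy's `eigh` uses the convention $K = Q\Lambda Q^T$, the transpose of mine, and sorts the eigenvalues in ascending order rather than putting the non-zero one top-left, but that's just the permutation I described):

```python
import numpy as np

x = np.linspace(-3, 3, 7)
K = np.outer(x, x)                        # rank-1 "covariance matrix" on the grid
eigvals, Q = np.linalg.eigh(K)            # K = Q @ np.diag(eigvals) @ Q.T
print(np.round(eigvals, 6))               # six (near-)zero eigenvalues, one non-zero
print(np.allclose(Q @ np.diag(eigvals) @ Q.T, K))  # True: reconstructs K
```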
If I fix $f(0)$ as the "first" dimension of our uncountable Gaussian, this means I can sample $f(0)$ from some univariate Gaussian, and then $f(x) = 0$ for all other $x$, since all the other entries of $\Lambda$ are 0 and the mean is 0.
This definitely doesn't look like a linear function to me; it looks like a constant function with a discontinuity at 0. Furthermore, I'm not sure how the rotation $Q$ affects the function (surely it doesn't correspond to rotating a plot of the function in the 2D plane).
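Here is a numerical version of the sampling scheme from the last two paragraphs (same hypothetical grid as above), which only deepens my confusion:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 7)
K = np.outer(x, x)
eigvals, Q = np.linalg.eigh(K)            # numpy's convention: K = Q @ diag @ Q.T

# sample independent coordinates with variances from Lambda
# (clip tiny negative eigenvalues that arise from floating point)
z = rng.normal(size=len(x)) * np.sqrt(np.clip(eigvals, 0, None))
print(np.round(z, 3))   # one non-zero entry: the "spike" picture I described
f = Q @ z               # in numpy's convention the rotation back is Q, not Q^T
print(np.round(f, 3))   # yet this comes out as a straight line through the origin
```

So the rotation evidently does all the work of turning the spike into a line, but I don't see how to interpret applying $Q$ as an operation on functions.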
I think I've gone wrong with the math somewhere, so the question is: is there a way to show that a rank-1 kernel for a GP corresponds to linear functions? And what about rank 2: does it correspond to quadratic functions?