If all covariance functions give you a singular matrix, it may be that some of your data points are identical, which produces two identical rows/columns in the matrix. To regularise the matrix, add a small ridge to the principal diagonal (as in ridge regression); in Gaussian process regression this is exactly the role played by the noise term.
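For concreteness, here is a minimal numpy sketch of the idea; the squared-exponential kernel, the input values, and the jitter size are my own illustrative choices, not something from your setup:

```python
import numpy as np

def sq_exp_kernel(X, length_scale=1.0):
    """Squared-exponential covariance matrix for 1-D inputs X (illustrative choice)."""
    d2 = (X[:, None] - X[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

# Two identical inputs give two identical rows/columns, so K is singular
X = np.array([0.0, 1.0, 1.0, 2.5])
K = sq_exp_kernel(X)
print(np.linalg.matrix_rank(K))                 # 3, not 4
# np.linalg.cholesky(K)                         # would typically fail with LinAlgError

# Add a ridge (noise variance) to the principal diagonal, as in ridge regression
sigma2_noise = 1e-6
K_reg = K + sigma2_noise * np.eye(len(X))
L = np.linalg.cholesky(K_reg)                   # now succeeds
```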
Note that a composition or additive combination of covariance functions has more hyper-parameters, and can therefore over-fit the marginal likelihood in evidence-based model selection. As a result it can give worse results than a more basic covariance function, even if the basic one is less well suited to modelling the data.
Here is a geometric interpretation.
First, take two vectors in $\mathbb{R}^2$
$$\vec{\mathbb{z}}=[x,y] \,, \vec{\mathbb{w}}=[u,v]$$
For these vectors, there are two standard types of "products", the dot product
$$\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}=xu+yv$$
and the cross product*
$$\vec{\mathbb{z}}\times\vec{\mathbb{w}}=xv-yu$$
which can be interpreted as
$$\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}_\perp$$
where $\vec{\mathbb{w}}_\perp=[v,-u]$ has the same magnitude as $\vec{\mathbb{w}}$ but is orthogonal.
(*Technically this "2D cross product" is defined as $[0,0,\vec{\mathbb{z}}\times\vec{\mathbb{w}}]\equiv[x,y,0]\times[u,v,0]$.)
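As a quick numerical sanity check of these two products and of the $\vec{\mathbb{w}}_\perp$ identity (the vector values below are arbitrary examples):

```python
import numpy as np

z = np.array([2.0, 1.0])          # [x, y], arbitrary example values
w = np.array([0.5, 3.0])          # [u, v]

dot = z @ w                                    # xu + yv
cross = z[0] * w[1] - z[1] * w[0]              # xv - yu, the "2D cross product"

w_perp = np.array([w[1], -w[0]])               # [v, -u]: same magnitude, orthogonal to w
print(np.isclose(cross, z @ w_perp))           # True: the cross product equals dot(z, w_perp)
print(np.isclose(w @ w_perp, 0.0))             # True: w_perp really is orthogonal to w
```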
In terms of geometric intuition, the dot product between two vectors measures how well they align (think correlation), but also their relative magnitudes (think standard deviations), i.e.
$$\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}=||\vec{\mathbb{z}}||\,||\vec{\mathbb{w}}||\,\cos[\theta]$$
where $\theta$ is the angle between them (compare to $\sigma_{xy}=\sigma_x\sigma_y\rho_{xy}$).
Note that the dot product can also be written as $\mathbb{z}^T\mathbb{w}$, where $\mathbb{z}$ and $\mathbb{w}$ are just $\vec{\mathbb{z}}$ and $\vec{\mathbb{w}}$ written as column vectors.
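A quick check of the cosine formula with the same example vectors, where the angle is computed independently of the dot product via `arctan2`:

```python
import numpy as np

z = np.array([2.0, 1.0])
w = np.array([0.5, 3.0])

# Angle between the two vectors, computed without using the dot product
theta = np.arctan2(w[1], w[0]) - np.arctan2(z[1], z[0])

lhs = z @ w                                                    # same as z.T @ w for 1-D arrays
rhs = np.linalg.norm(z) * np.linalg.norm(w) * np.cos(theta)
print(np.isclose(lhs, rhs))                                    # True
```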
Now let us do the same thing with two scalars in the complex plane ($z,w\in\mathbb{C}$), i.e.
$$z=x+iy \,, w=u+iv$$
What is the equivalent to the "dot product" here? It is actually the same as above, but now using the conjugate transpose, i.e. $z^*\equiv\bar{z}^T$ (also written as $z^\dagger$).
Since the transpose of a scalar is just that same scalar, the complex dot product is then
$$z^{\dagger}w=\bar{z}w=(x-iy)(u+iv)=(xu+yv)+i(xv-yu)$$
We can immediately notice two things. First, the complex dot product is equivalent to
$$z^{\dagger}w=(\vec{\mathbb{z}}\cdot\vec{\mathbb{w}})+i(\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}_\perp)$$
i.e. it is a complex number whose real component is the dot product of the corresponding 2-vectors, and whose imaginary component is their cross product. Second, since $\bar{x}=x$ for $x\in\mathbb{R}$, the vector dot product we started with can be written as $\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}=\mathbb{z}^{\dagger}\mathbb{w}$ (i.e. we were really using the conjugate transpose all along).
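A one-line check of this decomposition, reusing the example numbers from above:

```python
import numpy as np

x, y, u, v = 2.0, 1.0, 0.5, 3.0           # same example numbers as above
z, w = complex(x, y), complex(u, v)       # z = x + iy, w = u + iv

zw = np.conj(z) * w                       # the complex "dot product"  z^dagger w
dot = x * u + y * v                       # dot product of the 2-vectors
cross = x * v - y * u                     # cross product of the 2-vectors

print(np.isclose(zw.real, dot), np.isclose(zw.imag, cross))   # True True
```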
Now for the covariance matrix.
For simplicity, let us assume that all random variables have zero mean. Then the covariance is defined as
$$\mathrm{Cov}[z,w]\equiv\mathbb{E}[\bar{z}w]$$
so we have
\begin{align}
\mathrm{Cov}[z,w] &= \mathrm{Re}[\sigma_{z,w}]+i\,\mathrm{Im}[\sigma_{z,w}] \\
&= \,\mathbb{E}[\,\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}\,]+i\,\mathbb{E}[\,\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}_\perp]
\end{align}
The real (imaginary) component of $\sigma_{z,w}$ is the expected value of the dot (cross) product of the associated vectors $\vec{\mathbb{z}}$ and $\vec{\mathbb{w}}$.
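Before moving on, a small Monte Carlo check of this statement; the particular construction of $z$ and $w$ below is just an arbitrary way to get dependent, zero-mean complex variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Zero-mean, dependent complex random variables (arbitrary illustrative construction)
a, b, c = rng.standard_normal((3, n))
z = a + 1j * b
w = (a + 0.5 * b) + 1j * (b + c)

cov_zw = np.mean(np.conj(z) * w)          # sample version of E[conj(z) w] (zero-mean case)

# The same quantity from the 2-vector point of view
Z = np.column_stack([z.real, z.imag])     # rows are the vectors [x, y]
W = np.column_stack([w.real, w.imag])     # rows are the vectors [u, v]
W_perp = np.column_stack([W[:, 1], -W[:, 0]])

E_dot = np.mean(np.sum(Z * W, axis=1))          # E[dot product]
E_cross = np.mean(np.sum(Z * W_perp, axis=1))   # E[cross product]

print(np.allclose([cov_zw.real, cov_zw.imag], [E_dot, E_cross]))   # True
```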
This is the main intuition. The rest of this answer is just for completeness.
If our random variable is a column vector $\mathbb{z}=[z_1,\ldots,z_n]\in\mathbb{C}^n$ with covariance matrix $\boldsymbol{\Sigma}\in\mathbb{C}^{n\times n}$, then we have
$$\Sigma_{ij}=\mathrm{Cov}[z_i,z_j]$$
Finally, if we have $m$ samples of the random variable $\mathbb{z}$, arranged as the rows of a data matrix $\boldsymbol{Z}\in\mathbb{C}^{m\times n}$, then the covariance matrix can be estimated by the sample* covariance
$$\boldsymbol{\Sigma}\approx\tfrac{1}{m}\boldsymbol{Z}^{\dagger}\boldsymbol{Z}$$
(*yes, I divided by $m$, so you can call it the "biased" sample covariance if you must.)
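As a sketch in numpy (the data matrix below is random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 10_000, 3

# m samples of a zero-mean complex random vector, stacked as the rows of Z
raw = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
mixing = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Z = raw @ mixing                            # introduces correlations between the columns

Sigma = Z.conj().T @ Z / m                  # (1/m) Z^dagger Z

# Entry-wise this is just the sample version of Cov[z_i, z_j] = E[conj(z_i) z_j]
i, j = 0, 2
print(np.isclose(Sigma[i, j], np.mean(np.conj(Z[:, i]) * Z[:, j])))   # True
```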
Best Answer
A covariance matrix with zero determinant means that the random variables are perfectly correlated, i.e. linearly dependent. If your $X$ and $X^*$ are vectors, one is an affine function of the other: $X = AX^* + B$, where $A$ is some matrix and $B$ a vector. If they are scalar random variables, $X = aX^*+b$, where $a$ and $b$ are constants. Is there any reason to suspect that this might be happening in those cases where you are getting a zero determinant for the covariance matrix?
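If you want to check for this mechanically, here is a small numpy sketch with a made-up affine relation (the matrices $A$ and $B$ are arbitrary), showing how the determinant and rank of the stacked covariance behave:

```python
import numpy as np

rng = np.random.default_rng(2)
m = 5_000

# X is an exact affine function of X*: X = A X* + B  (A and B made up for illustration)
X_star = rng.standard_normal((2, m))
A = np.array([[1.0, 2.0],
              [0.5, -1.0]])
B = np.array([[3.0], [-2.0]])
X = A @ X_star + B

# The covariance of the stacked vector [X; X*] is then singular
Sigma = np.cov(np.vstack([X, X_star]))      # rows are treated as variables
print(np.linalg.det(Sigma))                 # ~0 up to floating point
print(np.linalg.matrix_rank(Sigma))         # 2, not 4
```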