As already noted in the comments, cosine similarity and correlation are different concepts. In particular, as explained below, the cosine of the angle between two vectors can be considered equivalent to the correlation coefficient only if the variables have zero means. This explains why two orthogonal vectors, whose cosine similarity is zero, can show some correlation, and hence a nonzero covariance, as in the OP's example.
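A small numerical sketch reproduces this situation (NumPy assumed; the specific vectors are just an illustrative choice):

```python
import numpy as np

# Two orthogonal vectors: their dot product is zero,
# so their cosine similarity is zero as well.
x = np.array([1.0, 1.0, 0.0])
y = np.array([0.0, 0.0, 5.0])
print(np.dot(x, y))  # 0.0

# Yet they are correlated, because neither has zero mean.
print(np.cov(x, y)[0, 1])       # nonzero covariance
print(np.corrcoef(x, y)[0, 1])  # -1.0 (here exactly -1, since y = 5 - 5x)
```

Here the correlation is even perfect (in absolute value), since the two vectors happen to be related by an affine map, while the cosine similarity is exactly zero.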
Cosine similarity is obtained by taking the inner product of the two vectors and dividing it by the product of their $L_2$ norms. The formula is
$${\displaystyle CS(x,y) ={\frac {\sum \limits _{i=1}^{n}{x_{i}y_{i}}}{{\sqrt {\sum \limits _{i=1}^{n}{x_{i}^{2}}}}\,{\sqrt {\sum \limits _{i=1}^{n}{y_{i}^{2}}}}}}= {\langle x,y \rangle \over \| x \|\,\|{y} \|} }$$
and corresponds to the cosine of the angle between the two vectors.
Cosine similarity is bounded between $-1$ and $1$. However, in most applications where this measure is used, the vectors are non-negative, so in these cases it ranges between $0$ and $1$. Importantly, cosine similarity is invariant to scaling (i.e. multiplying either vector by a positive constant; a negative constant flips its sign) but is not invariant to shifts (i.e. adding a constant to all components).
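These two invariance properties can be checked directly (a minimal sketch using NumPy; the vectors are arbitrary examples):

```python
import numpy as np

def cosine_similarity(x, y):
    """Inner product divided by the product of the L2 norms."""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 1.0, 0.0])

# Invariant to positive scaling: the angle does not change.
print(np.isclose(cosine_similarity(3 * x, y), cosine_similarity(x, y)))  # True

# Not invariant to shifts: adding a constant changes the angle.
print(np.isclose(cosine_similarity(x + 5, y), cosine_similarity(x, y)))  # False
```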
On the other hand, correlation can be seen as the cosine similarity between the centered versions of the two vectors. In fact, denoting by $\overline{x}$ and $\overline{y}$ the means, we have
$${\displaystyle r(x,y) ={\frac {\sum \limits _{i=1}^{n}{(x_{i}-\overline{x})(y_{i}-\overline{y})}}{{\sqrt {\sum \limits _{i=1}^{n}{(x_{i}-\overline{x})^{2}}}}\,{\sqrt {\sum \limits _{i=1}^{n}{(y_{i}-\overline{y})^{2}}}}}}} = {\langle x-\overline{x}, \,y -\overline{y}\rangle \over \| x-\overline{x} \|\,\|{y}-\overline{y} \|} $$
and therefore
$$r(x,y)=CS(x-\overline{x}, \,y -\overline{y})$$
It is worth noting that correlation is also bounded between $-1$ and $1$, but unlike cosine similarity it is invariant to both shifts and (positive) scaling.
We conclude that the cosine similarity is equal to the correlation coefficient only when the vectors $x$ and $y$ are centered (i.e., they have zero means).
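The identity $r(x,y)=CS(x-\overline{x},\,y-\overline{y})$ can be verified numerically (a sketch assuming NumPy; the data are random draws, used only as an example):

```python
import numpy as np

def cosine_similarity(a, b):
    """Inner product divided by the product of the L2 norms."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = rng.normal(size=100)

# Pearson correlation equals the cosine similarity of the centered vectors.
r = np.corrcoef(x, y)[0, 1]
cs_centered = cosine_similarity(x - x.mean(), y - y.mean())
print(np.isclose(r, cs_centered))  # True
```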
Best Answer
Hint:
$X \cdot Y$ is a random variable.
$\text{Cov}(X,Y)$ is an expected value.
There is some confusion of terminology in your post.
If $\bf X , \bf Y$ are two random vectors (in $m$-space), then their dot product is a random variable $$ \begin{array}{l} {\bf X},{\bf Y} \in R^m \\ q = {\bf X} \cdot {\bf Y} = \left\| {\bf X} \right\|\,\left\| {\bf Y} \right\|\;\cos \alpha \quad \left| {\;q \in R} \right. \\ \end{array} $$ which is in fact the product of three random variables, one of which is $\cos \alpha$.
In this case the covariance is defined as a matrix of expected values, which is not what you are considering.
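The point that $q = {\bf X}\cdot{\bf Y}$ is itself a random variable can be illustrated by simulation (a sketch assuming NumPy and standard-normal components; both are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(42)
m = 3

# Each joint draw of X and Y yields a different realization of q = X . Y:
# the dot product of two random vectors is a random variable, not a number.
samples = np.array([np.dot(rng.normal(size=m), rng.normal(size=m))
                    for _ in range(5)])
print(samples)  # five different realizations of q
```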
If instead $\bf X , \bf Y$ are two vectors corresponding to a joint sample (of size $m$) of two random variables $X,Y$, and we want to estimate the correlation between them (which seems to be what you mean to do), then