First of all, let's define the dot product and the cross product of two 3-vectors $$\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \qquad \text{and} \qquad \mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} $$
dot product:
$$ \mathbf{a}\cdot\mathbf{b} = \sum_i a_i b_i = a_1 b_1 + a_2 b_2+ a_3b_3 $$
cross product: $$ \mathbf{a}\times\mathbf{b} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\a_3 b_1 - a_1 b_3 \\a_1 b_2 - a_2 b_1 \end{pmatrix} $$
Note that these definitions do not involve geometric quantities like the angle between the two vectors; indeed, it is the angle that is defined in terms of the dot product (for the record, $\cos \theta := \mathbf{a}\cdot\mathbf{b}/ \sqrt{(\mathbf{a}\cdot\mathbf{a})(\mathbf{b}\cdot\mathbf{b})}$).
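To make the point that these are purely algebraic recipes, here is a minimal Python sketch of the definitions above. The function names `dot`, `cross` and `angle` are my own choices for illustration; vectors are plain tuples of three numbers.

```python
import math

def dot(a, b):
    """Dot product: sum of the componentwise products a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors, written out component by component."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

def angle(a, b):
    """Angle *defined* from the dot product; a and b must be nonzero."""
    return math.acos(dot(a, b) / math.sqrt(dot(a, a) * dot(b, b)))

a = (1.0, 0.0, 0.0)
b = (0.0, 1.0, 0.0)
print(dot(a, b))    # 0.0
print(cross(a, b))  # (0.0, 0.0, 1.0)
print(angle(a, b))  # pi/2 for these two unit vectors
```

Nothing geometric enters the code: the angle is computed from the dot product, not the other way round.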
Then you have the definitions of divergence and curl acting on a vector-valued function $\mathbf{f}(\mathbf{x}) \equiv \big(f_1(\mathbf{x}), f_2(\mathbf{x}), f_3(\mathbf{x})\big)$ (here $\mathbf{x} = (x_1,x_2,x_3)$; you can call $x_1=x$, $x_2=y$ and $x_3=z$, but my choice allows a more compact notation):
divergence:
$$
\mathrm{div}\, \mathbf{f} := \frac{\partial }{\partial x_1} f_1+\frac{\partial }{\partial x_2} f_2+\frac{\partial }{\partial x_3} f_3 = \sum_i \frac{\partial }{\partial x_i}f_i \equiv \sum_i {\partial_i}f_i
$$
where $\partial_i \equiv \partial / \partial x_i$.
curl:
$$
\mathrm{curl} \,\mathbf{f} := \begin{pmatrix}
\partial_2 f_3 - {\partial_3 f_2} \\
\partial_3 f_1 - \partial_1 f_3 \\
\partial_1 f_2 - \partial_2 f_1\end{pmatrix}
$$
Now you can see that if you introduce the quantity $$ \nabla = \begin{pmatrix} \partial_1 \\ \partial_2 \\ \partial_3 \end{pmatrix} $$
you can write the operations of divergence and curl as if $\nabla$ were a vector! Indeed, if you apply the definitions of the dot and cross products with $\nabla$ in place of the first vector, you can easily check that
$$
\nabla \cdot \mathbf{f} = \mathrm{div}\, \mathbf{f} \qquad \text{and} \qquad
\nabla \times \mathbf{f} = \mathrm{curl}\, \mathbf{f}
$$
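As a rough numerical check of the component formulas for divergence and curl, here is a small sketch that approximates each $\partial_i f_j$ by a central difference. The helper names (`partial`, `div`, `curl`), the step size `h`, and the test field `f` are all assumptions of mine for illustration, not anything canonical.

```python
def partial(f, i, x, h=1e-5):
    """Central-difference estimate of (d f_1/dx_i, d f_2/dx_i, d f_3/dx_i) at x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return [(p - m) / (2 * h) for p, m in zip(f(xp), f(xm))]

def div(f, x):
    # d[i][j] approximates partial_i f_j (indices are 0-based in the code)
    d = [partial(f, i, x) for i in range(3)]
    return d[0][0] + d[1][1] + d[2][2]

def curl(f, x):
    d = [partial(f, i, x) for i in range(3)]
    return (d[1][2] - d[2][1],   # partial_2 f_3 - partial_3 f_2
            d[2][0] - d[0][2],   # partial_3 f_1 - partial_1 f_3
            d[0][1] - d[1][0])   # partial_1 f_2 - partial_2 f_1

def f(x):
    """Hypothetical test field f = (x1*x2, x2*x3, x3*x1)."""
    x1, x2, x3 = x
    return (x1 * x2, x2 * x3, x3 * x1)

point = (1.0, 2.0, 3.0)
print(div(f, point))   # analytically x1 + x2 + x3 = 6
print(curl(f, point))  # analytically (-x2, -x3, -x1) = (-2, -3, -1)
```

The index pattern in `curl` is exactly the one in `cross` with the first vector's components replaced by derivatives, which is all the $\nabla$ notation claims.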
You can check that many identities that hold for 3-vectors still hold if one of the vectors is $\nabla$.
But note that this "trick" of thinking of $\nabla$ as a 3-vector is purely formal, and not every identity that holds for ordinary 3-vectors survives.
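One example of an identity that fails (the field below is my own choice for illustration): for ordinary vectors, $\mathbf{b}\cdot(\mathbf{a}\times\mathbf{b}) = 0$ always, because $\mathbf{a}\times\mathbf{b}$ is orthogonal to $\mathbf{b}$. Replacing $\mathbf{a}$ by $\nabla$ and $\mathbf{b}$ by $\mathbf{f} = (-x_2,\, x_1,\, 1)$ gives
$$
\nabla\times\mathbf{f} = \begin{pmatrix} \partial_2 (1) - \partial_3 (x_1) \\ \partial_3 (-x_2) - \partial_1 (1) \\ \partial_1 (x_1) - \partial_2 (-x_2) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 2 \end{pmatrix}
\qquad \Rightarrow \qquad
\mathbf{f}\cdot(\nabla\times\mathbf{f}) = 2 \neq 0 .
$$
So the curl of a field need not be orthogonal to the field, even though a genuine cross product would be.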
The cross product of two vectors is really a bivector. It has a magnitude and a direction, but the magnitude is an area instead of a length, and the direction is a plane instead of a line.
Just as two vectors can point in opposite directions while lying on the same line, two bivectors can "point" in opposite directions while lying in the same plane. You can think of the two directions as clockwise and counterclockwise, though which is which depends on which side of the plane you're on.
Bivectors are useful for things that lie in a plane and have a clockwise/counterclockwise direction and a magnitude, like angular velocity.
In three dimensions (and only in three dimensions), you can identify a bivector with a vector perpendicular to the plane of the bivector, whose length is the bivector's area. Because of this, bivectors are usually not taught as such. Instead, you have a cross product that produces another vector, whose direction is given by the right hand rule.
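Here is a small sketch of that identification, assuming the usual component description of the bivector built from $\mathbf{a}$ and $\mathbf{b}$ as the antisymmetric array $B_{ij} = a_i b_j - a_j b_i$. The names `wedge` and `dual_vector` are mine. An antisymmetric $n\times n$ array has $n(n-1)/2$ independent entries, which equals $n$ only for $n=3$, and that is why the trick is special to three dimensions.

```python
def wedge(a, b):
    """Antisymmetric 3x3 array B[i][j] = a_i*b_j - a_j*b_i (bivector components)."""
    return [[a[i] * b[j] - a[j] * b[i] for j in range(3)] for i in range(3)]

def dual_vector(B):
    """3D-only identification: read off (B_23, B_31, B_12); code indices are 0-based."""
    return (B[1][2], B[2][0], B[0][1])

a = (1, 2, 3)
b = (4, 5, 6)
print(dual_vector(wedge(a, b)))  # (-3, 6, -3), the components of a x b
print(dual_vector(wedge(b, a)))  # (3, -6, 3): opposite orientation flips every sign
```

Swapping the two vectors reverses the sense of rotation in the plane, which shows up as the overall sign flip.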
Best Answer
The divergence of a vector field is not a genuine dot product, and the curl of a vector field is not a genuine cross product.
$\nabla \cdot \vec A$ is just a suggestive notation which is designed to help you remember how to calculate the divergence of the vector field $\vec A$. The notation is nice, because it looks like a dot product, but as you say $\nabla$ is not actually a vector.
If it helps, you can use the alternate notation $$\operatorname{div}(\vec A) = \partial_x A_x + \partial_y A_y + \partial_z A_z$$
which makes it easier to see that $\operatorname{div}(\bullet)$ is just an operator which eats a vector field and spits out a scalar field. Curl can be defined similarly, though it's a pain to write out in its entirety.
In Cartesian coordinates, the dot and cross products look like this: $$\vec A \cdot \vec B = \sum_i A_i B_i$$ $$\left[\vec A \times \vec B\right]_i = \sum_{j,k}\epsilon_{ijk}A_jB_k$$
while the divergence and curl operations look like this: $$\operatorname{div}(\vec B) = \sum_i \partial_i B_i$$ $$\big[\operatorname{curl}(\vec B)\big]_i = \sum_{j,k}\epsilon_{ijk} \partial_j B_k$$
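For readers who have not met $\epsilon_{ijk}$ before, the index formula for the cross product can be sketched in a few lines of Python. The names `epsilon` and `cross_eps` are hypothetical, and the closed-form expression for the symbol is just one convenient way to compute it for indices in $\{0,1,2\}$.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}: +1, -1, or 0."""
    # The product is +2 for even permutations, -2 for odd ones,
    # and 0 whenever two indices coincide.
    return (i - j) * (j - k) * (k - i) // 2

def cross_eps(A, B):
    """[A x B]_i = sum over j, k of epsilon_ijk * A_j * B_k."""
    return tuple(
        sum(epsilon(i, j, k) * A[j] * B[k] for j in range(3) for k in range(3))
        for i in range(3)
    )

print(epsilon(0, 1, 2), epsilon(0, 2, 1), epsilon(0, 0, 1))  # 1 -1 0
print(cross_eps((1, 2, 3), (4, 5, 6)))                       # (-3, 6, -3)
```

The curl formula has the same double sum, with the number $A_j$ replaced by the operation "differentiate with respect to $x_j$".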
The striking similarity leads one to define the vector operator $\nabla$, whose components are just the partial derivatives ($\nabla_i = \partial_i$). However, as pointed out in the comments, this similarity does not generally hold up if you switch to a new coordinate system.
For example, in cylindrical coordinates $(\rho,\phi,z)$, the dot product of two vectors becomes $$\vec A \cdot \vec B = A_\rho B_\rho + A_\phi B_\phi + A_z B_z$$ just like before, but the divergence looks like this:
$$\operatorname{div}(\vec B) = \frac{1}{\rho}\partial_\rho(\rho B_\rho) + \frac{1}{\rho}\partial_\phi B_\phi + \partial_z B_z$$
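As a quick sanity check (with a field of my own choosing), take the radial field with $B_\rho = \rho$ and $B_\phi = B_z = 0$, whose Cartesian components are $(x, y, 0)$. The cylindrical formula gives
$$\operatorname{div}(\vec B) = \frac{1}{\rho}\partial_\rho(\rho\cdot\rho) = \frac{1}{\rho}\,2\rho = 2$$
in agreement with the Cartesian result $\partial_x x + \partial_y y + \partial_z 0 = 2$, whereas naively "dotting" the partial derivatives into the cylindrical components would give
$$\partial_\rho B_\rho + \partial_\phi B_\phi + \partial_z B_z = 1$$
which is wrong.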