It's not misleading as long as you change your notion of equivalence. When a matrix represents a linear transformation $V \to V$, the correct notion of equivalence is similarity: $M \simeq B^{-1} MB$ where $B$ is invertible. When a matrix represents a bilinear form $V \times V \to \mathbb{R}$, the correct notion of equivalence is congruence: $M \simeq B^TMB$ where $B$ is invertible. As long as you keep this distinction in mind, you're fine.
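As a quick numerical illustration (a minimal numpy sketch; the matrix $M$, sizes, and seed are arbitrary choices, not from the original answer): similarity preserves eigenvalues, while congruence of a symmetric matrix generally changes the eigenvalues but preserves the signature, by Sylvester's law of inertia.

```python
import numpy as np

rng = np.random.default_rng(0)

# A symmetric matrix M, viewed two ways: as a linear map and as a bilinear form.
M = np.array([[2.0, 1.0],
              [1.0, -3.0]])

B = rng.standard_normal((2, 2))
while abs(np.linalg.det(B)) < 1e-6:   # ensure B is invertible
    B = rng.standard_normal((2, 2))

similar   = np.linalg.inv(B) @ M @ B  # change of basis for a linear map
congruent = B.T @ M @ B               # change of basis for a bilinear form

# Similarity preserves the spectrum; congruence generally does not.
print(np.sort(np.linalg.eigvals(M).real))
print(np.sort(np.linalg.eigvals(similar).real))

# Congruence preserves the signature (counts of positive/negative eigenvalues).
def sig(S):
    w = np.linalg.eigvalsh(S)
    return (int(np.sum(w > 0)), int(np.sum(w < 0)))

print(sig(M), sig(congruent))
```

Running this shows the two invariants side by side: the sorted eigenvalues of $M$ and $B^{-1}MB$ agree, while for $B^TMB$ only the signature survives.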
We first note that the vectors $c_1\otimes a_1,\ldots,c_R\otimes a_R$ are linearly independent. Otherwise, after reordering if necessary, there would exist scalars $\beta_1,\ldots,\beta_{R-1}$ such that
$$c_R\otimes a_R = \sum_{k=1}^{R-1}\beta_k(c_k\otimes a_k)\,.$$
Using the definition and bilinearity of the Kronecker product, this would lead to
\begin{align}
\operatorname{vec}(\mathcal{X})
&= \sum_{k=1}^{R-1}c_k\otimes b_k\otimes a_k + c_R\otimes b_R\otimes a_R\\
&= \sum_{k=1}^{R-1}c_k\otimes (b_k+\beta_kb_R)\otimes a_k\,,
\end{align}
which is in contradiction with $\mathcal{X}$ having CP rank $R$.
Said differently, the matrix
$$C\odot A=\begin{bmatrix}c_1\otimes a_1 & \cdots & c_R\otimes a_R\end{bmatrix},$$
where $\odot$ denotes the Khatri–Rao product, has rank $R$.
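Concretely, the Khatri–Rao product is just the column-wise Kronecker product, and for generic (e.g. random Gaussian) factor matrices it has full column rank. A minimal numpy sketch, with the helper `khatri_rao` and all sizes chosen for illustration:

```python
import numpy as np

def khatri_rao(C, A):
    """Column-wise Kronecker product: column k is np.kron(C[:, k], A[:, k])."""
    assert C.shape[1] == A.shape[1]
    return np.stack([np.kron(C[:, k], A[:, k]) for k in range(C.shape[1])],
                    axis=1)

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))  # R = 3 columns
C = rng.standard_normal((5, 3))

KR = khatri_rao(C, A)
print(KR.shape)                   # (20, 3): columns are c_k (x) a_k
print(np.linalg.matrix_rank(KR))  # full column rank R for generic factors
```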
The second matricization of $\mathcal{X}$ can be written as
$$X^{(2)}=B(C\odot A)^T\,.$$
Since $C\odot A$ is of full column rank, it follows that $$R_2=\operatorname{rank}(X^{(2)}) = \operatorname{rank}(B)\,.$$
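The identity $X^{(2)}=B(C\odot A)^T$ and the resulting rank equality can be checked numerically. The sketch below assumes the Kolda–Bader unfolding convention (remaining indices ordered with the earlier mode varying fastest, i.e. Fortran order); sizes, seed, and the `khatri_rao` helper are illustrative choices.

```python
import numpy as np

def khatri_rao(C, A):
    """Column-wise Kronecker product."""
    return np.stack([np.kron(C[:, r], A[:, r]) for r in range(A.shape[1])],
                    axis=1)

rng = np.random.default_rng(2)
I, J, K, R = 4, 5, 6, 3
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

# CP tensor: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
X = np.einsum('ir,jr,kr->ijk', A, B, C)

# Mode-2 matricization: mode-2 fibers as columns, index i varying fastest.
X2 = np.moveaxis(X, 1, 0).reshape(J, -1, order='F')

print(np.allclose(X2, B @ khatri_rao(C, A).T))  # True
print(np.linalg.matrix_rank(X2), np.linalg.matrix_rank(B))  # equal (R_2)
```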
We can repeat all of the observations above for the first and third matricizations of $\mathcal{X}$ to conclude
$$R = \operatorname{rank}(C\odot B) = \operatorname{rank}(C\odot A) = \operatorname{rank}(B\odot A)$$
and
$$R_1=\operatorname{rank}(A),\quad R_2=\operatorname{rank}(B),\quad R_3=\operatorname{rank}(C)\,.$$
Now, we have
$$R = \operatorname{rank}(C\odot B) \leq \operatorname{rank}(C\otimes B) = \operatorname{rank}(B)\operatorname{rank}(C)=R_2R_3\,,$$
where the inequality holds because each column of $C\odot B$ also appears as a column of $C\otimes B$.
Repeating this for matrices $C\odot A$ and $B\odot A$ proves the theorem.
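The column-containment step can also be verified directly: with numpy's layout of `np.kron`, column $r$ of $C\odot B$ is column $rR+r$ of $C\otimes B$. A short sketch (random factors, arbitrary sizes) locating those columns and comparing the ranks from the inequality above:

```python
import numpy as np

rng = np.random.default_rng(3)
J, K, R = 4, 5, 3
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

KR = np.stack([np.kron(C[:, r], B[:, r]) for r in range(R)], axis=1)  # C (.) B
Kron = np.kron(C, B)                                                  # C (x) B

# np.kron(C, B)[:, p*R + q] is C[:, p] (x) B[:, q], so column r of the
# Khatri-Rao product sits at column r*R + r of the Kronecker product.
for r in range(R):
    assert np.allclose(KR[:, r], Kron[:, r * R + r])

print(np.linalg.matrix_rank(KR))    # R (generically)
print(np.linalg.matrix_rank(Kron))  # rank(B) * rank(C)
```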
Let $V$ be the tensor product you defined and let $W = \mathbb{C}^{n_1} \times \cdots \times \mathbb{C}^{n_d}$ be the analogous Cartesian product.
The set $V_{\leq r}$ of all tensors of rank $\leq r$ is the image of the morphism
$\mathbb{C}^r \times W^r \to V$
taking a tuple of scalars and pure tensors to the corresponding linear combination. By Chevalley's Theorem (https://en.wikipedia.org/wiki/Constructible_set_(topology)), the image is a Zariski constructible set. Finite unions and complements of constructible sets are again constructible, so the set $V_r := V_{\leq r} \setminus V_{\leq r-1}$, of tensors of rank exactly $r$, is constructible.
(This is false over $\mathbb{R}$ as far as real points are concerned! For example, consider the morphism $\mathbb{A}^1_{\mathbb{R}} \to \mathbb{A}^1_\mathbb{R}$ given by $t \mapsto t^2$: the real points of the image form the half-line $[0,\infty)$, which is not a Zariski constructible set.)
Continuing... we can write $V$ as the union of the sets $V_r$ for all $r$. Since $V$ is irreducible, for some $r$ the closure $\overline{V_r}$ is all of $V$. Every constructible set contains a dense open subset of its closure, so this $V_r$ contains a Zariski dense open subset of $V$.
Sadly, as you pointed out, it's not the case that $V_{\leq r}$ is the closure of $V_r$, so the maximal rank need not be the generic rank. But this does show that some rank is generic.
This explains the phenomenon you described -- the other tensors are all, collectively, supported on a set of Lebesgue measure zero. So as you suspected, it doesn't really matter what random process you use to generate tensors, as long as (e.g.) it is absolutely continuous with respect to $N$-dimensional Lebesgue measure.