Ok. I'm going to put my response as an answer to my question since it involves some new information I've found and want to document here. Part of my confusion has stemmed from the fact that different authors use different notation regarding the transition from Levi-Civita symbols or tensor densities ($\tilde{\epsilon}$) and Levi-Civita tensors ($\epsilon$). Here are the two clearest references I have found on the subject and they illustrate the two possible conventional choices. Sean Carroll's lecture notes on geometry and spacetime ch. 2 and Christopher Pope's Electrodynamics lecture notes. I find Pope's notes to give a little bit more detail which made the difference for me.
The point is the Levi-Civita symbol with the lower indices*, $\tilde{\epsilon}_{ijk}$ is defined as an object which is anti-symmetric in its indices. i.e. $\tilde{\epsilon}_{123}=+1$ whereas $\tilde{\epsilon}_{132}=-1$. Furthermore, (as explained better in the Pope notes) the symbol takes the same value in all coordinate frames. That is if we transform from one set of coordinates $\{x^i\}$ to another set of coordinates $\{x^{'i}\}$ we have transform $\tilde{\epsilon}_{ijk}$ transforms to $\tilde{\epsilon}^{'}_{ijk}$ with the relation that $\tilde{\epsilon}^{'}_{ijk}=\tilde{\epsilon}_{ijk}$
It is then possible to prove, using some facts about determinants, that
$$\tilde{\epsilon}_{ijk}=\tilde{\epsilon}^{'}_{ijk} = \left|\frac{\partial x'}{\partial{x}}\right|\frac{\partial x^a}{\partial x^{'i}}\frac{\partial x^b}{\partial x^{'j}}\frac{\partial x^c}{\partial x^{'k}}\tilde{\epsilon}_{abc}$$
That is to say $\tilde{\epsilon}_{ijk}$ transforms like a tensor density of weight +1. It turns out $\sqrt{|\text{det}(g_{ij})|}=\sqrt{|\text{det }g|}$ is a tensor density of weight -1 (as proven in the Pope notes) so by multiplying these two tensor densities you get a new tensor density of weight 0, i.e. a regular tensor (indicated by the absence of the tilde).
$$\epsilon_{ijk} = \sqrt{|\text{det }g|}\tilde{\epsilon}_{ijk}$$
Both authors agree on this much. And so far none of this would have been changed by choosing the (-+++) metric as opposed to the (+---) metric**. However, the next step is finding the upper index object. Since $\epsilon_{ijk}$ is a tensor we can raise its indices:
$$\epsilon^{ijk} = g^{ii'}g^{jj'}g^{kk'}\epsilon_{i'j'k'} = g^{ii'}g^{jj'}g^{kk'}\tilde{\epsilon}_{i'j'k'}\sqrt{|\text{det }g|} \\
=\text{det}(g^{-1}) \sqrt{|\text{det }g|} \tilde{\epsilon}_{ijk} = \frac{\text{sgn}(g)}{\sqrt{|g|}}\tilde{\epsilon}_{ijk}$$
At this point we have
$$\epsilon^{ijk} = \frac{\text{sgn}(g)}{\sqrt{|\text{det }g|}}\tilde{\epsilon}_{ijk}$$
This is where people make a convention choice. Carroll (and many others that I have seen), for example, makes the chioce that $\tilde{\epsilon}^{ijk} = \tilde{\epsilon}_{ijk}$ so that we get
$$\epsilon^{ijk} = \frac{\text{sgn}(g)}{\sqrt{|\text{det }g|}}\tilde{\epsilon}^{ijk}$$
Pope takes the convention that $\tilde{\epsilon}^{ijk} = \text{sgn}(g)\tilde{\epsilon}_{ijk}$ so that
$$\epsilon^{ijk} = \frac{1}{\sqrt{|\text{det }g|}}\tilde{\epsilon}^{ijk}$$
So we've already made at least two convention chioces. The first is how $\epsilon_{ijk}$ relates to $\tilde{\epsilon}_{ijk}$ and the second is how $\tilde{\epsilon}_{ijk}$ relates to $\tilde{\epsilon}^{ijk}$. I think we have yet another choice now of how to define the cross product. I think the responsible thing to do is to keep in mind all of the machinery up to this point and make a manifestly covariant definition of the cross product. This would look like:
$$(A\times B)^i = \epsilon^{i}{}_{jk}A^jB^k=g^{ii'}\epsilon_{i'jk}A^j B^k = \pm \epsilon_{ijk}A^j B^k=\pm\tilde{\epsilon}_{ijk}A^j B^k$$
Where $\pm$ indicates the next convention choice of mostly positive or mostly negative metric signature. I think this has the disadvantage that depending on whether you take the mostly positive or mostly negative metric you get a relative minus sign in the definition of the cross product, but I think this is actually to be expected since it moves the spatial part from being a right handed to left handed coordinate system. It also has the disadvantage that you really have to remember quite a few places for negative signs. It has the advantage that is covariant so that if you do follow the rules you can do the manipulations more easily I guess.
For example @Solenodon Paradoxus gives a formula for the "correct expression" for the cross product but I'm not sure on what grounds that is the "correct expression". It follows proper Einstein convention but the symbol used is the Levi-Civita symol which is not a tensor so there's not actually a rule saying the expression SHOULD follow the Einstein convention which leaves confusion as to the motivation behind any particular definitions when there are so many choices.
*I believe this is a matter of convention. We could alternatively take the Levi-Civita symbol with upper indices to be the object which is anti-symmetric in its indices.
**but the story would maybe be different if we had chosen the upper indexed Levi-Civita symbol to be the usual anti-symmetric object instead.
edit1: for less awkward notation we could define $(A \times B)^i = \epsilon^{ijk}A_jB_k$ which would be equivalent to the definition given. For my problem I'm working on I'm thinking of contravariant components of vectors as the "physical" quantities so I'm prefering to write expression in terms of those.
Following the method in L&L vol. 2 ch. 1, the left-hand side
$$\varepsilon^{\alpha \beta \mu \nu} \varepsilon_{\alpha \beta \rho \sigma}$$
is a product of a pseudo-tensor, which is invariant under Lorentz transformations up to a factor of a determinant, and a pseudo-tensor which transforms under the inverse determinant.
This tells us that the right-hand side has to be an invariant tensor and so must be constructed from Kronecker delta's.
The right hand side must also be anti-symmetric in $\mu,\nu$ and anti-symmetric in $\rho,\sigma$ so that on general grounds
$$
\varepsilon^{\alpha \beta \mu \nu} \varepsilon_{\alpha \beta \rho \sigma} = - A (\delta^{\mu}_{\rho} \delta^{\nu}_{\sigma} - \delta^{\mu}_{\sigma} \delta^{\nu}_{\rho})$$
must hold, where the minus sign is due to the Minkowski metric, and the factor $A=2$ is fixed by the requirement that $\varepsilon^{\mu \nu \rho \sigma} \varepsilon_{\mu \nu \rho \sigma} = - 4!$ holds, so setting $ - A (\delta^{\mu}_{\mu} \delta^{\nu}_{\nu} - \delta^{\mu}_{\nu} \delta^{\nu}_{\mu}) = - 4!$ gives $A(16 - 4) = 24$ or $A = 2$.
More generally, the same thinking tells us that the product $\varepsilon^{\alpha \beta \mu \nu} \varepsilon_{\alpha \beta \rho \sigma}$ must be an anti-symmetric combination of Kronecker delta's up to an overall normalization factor, where the normalization factor must respect the fact that $\varepsilon^{\mu \nu \rho \sigma} \varepsilon_{\mu \nu \rho \sigma} = - 4!$ should hold, but $\delta^{\mu \nu \rho \sigma}_{\alpha \beta \gamma \delta}$ is an anti-symmetric combination of Kronecker delta's such that $\delta^{\mu \nu \rho \sigma}_{\mu \nu \rho \sigma} = 4!$, and so
$$
\varepsilon^{\alpha \beta \mu \nu} \varepsilon_{\alpha \beta \rho \sigma} = - \delta^{\mu \nu \rho \sigma}_{\alpha \beta \rho \sigma}$$
must hold. Thus we can work out contractions like $\varepsilon^{\alpha \beta \mu \nu} \varepsilon_{\alpha \beta \rho \sigma} = - \delta^{\alpha \beta \mu \nu}_{\alpha \beta \rho \sigma}$ directly
\begin{align*}
\varepsilon^{\alpha \beta \mu \nu} \varepsilon_{\alpha \beta \rho \sigma} &= - \delta^{\alpha \beta \mu \nu}_{\alpha \beta \rho \sigma} \\
&= - (\delta^{\alpha}_{\alpha} \delta^{\beta \mu \nu}_{\beta \rho \sigma} - \delta^{\alpha}_{\beta} \delta^{\beta \mu \nu}_{\alpha \rho \sigma} + \delta^{\alpha}_{\rho} \delta^{\beta \mu \nu}_{\alpha \beta \sigma} - \delta^{\alpha}_{\sigma}\delta^{\beta \mu \nu}_{\alpha \beta \rho}) \\
&= - (4\delta^{\beta \mu \nu}_{\beta \rho \sigma} - \delta^{\alpha \mu \nu}_{\alpha \rho \sigma} + \delta^{\beta \mu \nu}_{\rho \beta \sigma} - \delta^{\beta \mu \nu}_{\sigma \beta \rho}) \\
&= - [4(\delta^{\beta}_{\beta} \delta^{\mu \nu}_{\rho \sigma} - \delta^{\beta}_{\rho} \delta^{\mu \nu}_{\beta \sigma} + \delta^{\beta}_{\sigma}\delta^{\mu \nu}_{\beta \rho}) - ( \delta^{\alpha }_{\alpha} \delta^{\mu \nu}_{\rho \sigma} - \delta^{\alpha}_{\rho} \delta^{\mu \nu}_{\alpha \sigma} + \delta^{\alpha}_{\sigma} \delta^{\mu \nu}_{\alpha \rho}) - \delta^{\beta \mu \nu}_{\beta \rho \sigma} + \delta^{\beta \mu \nu}_{\beta \sigma \rho}] \\
&= - [4(4\delta^{\mu \nu}_{\rho \sigma} - \delta^{\mu \nu}_{\rho \sigma} + \delta^{\mu \nu}_{\sigma \rho}) - ( 4 \delta^{\mu \nu}_{\rho \sigma} - \delta^{\mu \nu}_{\rho \sigma} + \delta^{\mu \nu}_{\sigma \rho}) - \delta^{\beta \mu \nu}_{\beta \rho \sigma} + \delta^{\beta \mu \nu}_{\beta \sigma \rho}] \\
&= - [4(2\delta^{\mu \nu}_{\rho \sigma} ) - 2 \delta^{\mu \nu}_{\rho \sigma} - 2 \delta^{\mu \nu}_{\rho \sigma} + 2 \delta^{\mu \nu}_{\sigma \rho}] \\
&= - [8 \delta^{\mu \nu}_{\rho \sigma} - 6 \delta^{\mu \nu}_{\rho \sigma} ] \\
&= - 2 \delta^{\mu \nu}_{\rho \sigma} \\
&= - 2 (\delta^{\mu}_{\rho} \delta^{\nu}_{\sigma} - \delta^{\mu}_{\sigma} \delta^{\nu}_{\rho}).
\end{align*}
Similarly, in $d$ dimensions we have
$$\varepsilon^{\mu_1 .. \mu_d} \varepsilon_{\mu_1 .. \mu_d} = - d!$$
From this one immediately sees that
$$
\varepsilon^{\mu_1 .. \mu_d} \varepsilon_{\nu_1 .. \mu_d} = - \delta^{\mu_1 .. \mu_d}_{\nu_1 .. \nu_d}
$$
and so contractions obey identities like
$$\varepsilon^{\mu_1 \ldots \mu_r \mu_{r+1} .. \mu_d} \varepsilon_{\mu_1 .. \mu_r \nu_{r+1} \ldots \nu_d} = - A \delta^{\mu_{r+1} .. \mu_d}_{\nu_{r+1} .. \nu_d}$$
must hold in $d$ dimensions, where $A$ can be fixed by expanding
$$\varepsilon^{\mu_1 \ldots \mu_r \mu_{r+1} .. \mu_d} \varepsilon_{\mu_1 .. \mu_r \nu_{r+1} \ldots \nu_d} = - \delta^{\mu_1 .. \mu_r \mu_{r+1} .. \mu_d}_{\mu_1 .. \mu_r \nu_{r+1} .. \nu_d}$$
one step at a time as in the example above, and obviously $A = r!$ should hold so that $A = - d!$ when $r = d$, as can be proven by induction.
Best Answer
The expression $\epsilon_{ijk} \epsilon_{ijn}$ is only nonzero when $i \neq j \neq k$ and $i \neq j \neq n$, so it is only nonzero if $k = n$, so it is proportional to $\delta_{kn}$. To figure out the constant of proportionality, set $k = n = 3$. We want to evaluate $$\epsilon_{ij3} \epsilon_{ij3}.$$ The only nonzero terms are when $(i, j) = (1, 2)$ and $(i, j) = (2, 1)$, giving contributions of $1^2$ and $(-1)^2$ respectively. Then the constant is $2$.