One way to see this is to note that the vector space of rank-(3,3) completely antisymmetric tensors, $ \Lambda_3^3(R^3) $, is one-dimensional (a straightforward linear algebra exercise). Then define the tensor:
$$
M_{ijk}^{lmn} = \delta_i^{[l} \, \delta_j^m \, \delta_k^{n]} = \frac{1}{3!} \sum_{\sigma \in S_3} sgn(\sigma) \, \delta_i^{\sigma(l)} \, \delta_j^{\sigma(m)} \, \delta_k^{\sigma(n)}
$$
where the sum runs over all permutations $\sigma$ of the three upper indices and $sgn(\sigma)$ denotes the sign of the permutation. It is worth noting that
$$
M_{ijk}^{lmn}=\frac{1}{3!}\begin{vmatrix} \delta_i^l & \delta_i^m & \delta_i^n\\
\delta_j^l & \delta_j^m & \delta_j^n \\ \delta_k^l & \delta_k^m & \delta_k^n \end{vmatrix}
$$
by the Leibniz formula for determinants (http://en.wikipedia.org/wiki/Leibniz_formula_for_determinants). So $M \in \Lambda_3^3(R^3) $, and since $M \neq 0$, $M$ is a basis for the one-dimensional space $ \Lambda_3^3(R^3) $. Now consider the tensor
$$
\epsilon_{ijk} \, \epsilon^{lmn} = B_{ijk}^{lmn}
$$
Since $B \in \Lambda_3^3(R^3) $ and $M$ is a basis, there exists a constant $k$ such that
$$
B_{ijk}^{lmn} = k \, M_{ijk}^{lmn} \implies \epsilon_{ijk} \, \epsilon^{lmn} = k \, \delta_i^{[l} \, \delta_j^m \, \delta_k^{n]}
$$
Now, to determine $k$, contract both sides with $\epsilon_{lmn}$ and use the fact that $\epsilon_{lmn} \, \epsilon^{lmn} =3!$ (the sum contains $3!$ terms, each equal to one):
$$
\epsilon_{ijk} \, \epsilon^{lmn} \, \epsilon_{lmn} = 3! \, \epsilon_{ijk} = k \, \delta_i^{[l} \, \delta_j^m \, \delta_k^{n]} \epsilon_{lmn} = k \, \delta_i^{l} \, \delta_j^m \, \delta_k^{n} \, \epsilon_{[lmn]} = k \, \delta_i^{l} \, \delta_j^m \, \delta_k^{n} \, \epsilon_{lmn} = k \, \epsilon_{ijk}
$$
So we finally get $k=3!$ and
$$
\epsilon_{ijk} \, \epsilon^{lmn}=\begin{vmatrix} \delta_i^l & \delta_i^m & \delta_i^n\\
\delta_j^l & \delta_j^m & \delta_j^n \\ \delta_k^l & \delta_k^m & \delta_k^n \end{vmatrix}
$$
And you can get the identity you want by contracting. The same argument works in any dimension and shows that
$$
\epsilon_{i_1,\dots,i_n} \, \epsilon^{j_1,\dots,j_n} = \begin{vmatrix} \delta_{i_1}^{j_1} & \dots & \delta_{i_1}^{j_n}\\
\vdots & & \vdots \\ \delta_{i_n}^{j_1} & \dots & \delta_{i_n}^{j_n} \end{vmatrix}
$$
Ok. I'm going to post my response as an answer to my own question, since it involves some new information I've found and want to document here. Part of my confusion has stemmed from the fact that different authors use different notation for the transition from Levi-Civita symbols, or tensor densities ($\tilde{\epsilon}$), to Levi-Civita tensors ($\epsilon$). The two clearest references I have found on the subject are Sean Carroll's lecture notes on geometry and spacetime (ch. 2) and Christopher Pope's electrodynamics lecture notes; they illustrate the two possible conventional choices. I find Pope's notes give a little more detail, which made the difference for me.
The point is that the Levi-Civita symbol with lower indices*, $\tilde{\epsilon}_{ijk}$, is defined as an object which is anti-symmetric in its indices, i.e. $\tilde{\epsilon}_{123}=+1$ whereas $\tilde{\epsilon}_{132}=-1$. Furthermore (as explained in more detail in the Pope notes), the symbol takes the same values in all coordinate frames. That is, if we transform from one set of coordinates $\{x^i\}$ to another set of coordinates $\{x^{'i}\}$, then $\tilde{\epsilon}_{ijk}$ transforms to $\tilde{\epsilon}^{'}_{ijk}$ with the relation $\tilde{\epsilon}^{'}_{ijk}=\tilde{\epsilon}_{ijk}$.
It is then possible to prove, using some facts about determinants, that
$$\tilde{\epsilon}_{ijk}=\tilde{\epsilon}^{'}_{ijk} = \left|\frac{\partial x'}{\partial{x}}\right|\frac{\partial x^a}{\partial x^{'i}}\frac{\partial x^b}{\partial x^{'j}}\frac{\partial x^c}{\partial x^{'k}}\tilde{\epsilon}_{abc}$$
That is to say, $\tilde{\epsilon}_{ijk}$ transforms like a tensor density of weight +1. It turns out $\sqrt{|\text{det}(g_{ij})|}=\sqrt{|\text{det }g|}$ is a tensor density of weight -1 (as proven in the Pope notes), so multiplying these two tensor densities yields a tensor density of weight 0, i.e. a regular tensor (indicated by the absence of the tilde):
$$\epsilon_{ijk} = \sqrt{|\text{det }g|}\tilde{\epsilon}_{ijk}$$
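This can be checked numerically for a linear, orientation-preserving change of coordinates. The sketch below uses NumPy with an arbitrary positive-definite metric and an arbitrary Jacobian as test data; it verifies that $\sqrt{|\text{det }g|}\,\tilde{\epsilon}_{ijk}$ obeys the ordinary covariant rank-3 transformation law:

```python
import itertools
import numpy as np

# Levi-Civita symbol: the same numerical values in every frame
eps = np.zeros((3, 3, 3))
for p in itertools.permutations(range(3)):
    eps[p] = round(np.linalg.det(np.eye(3)[list(p)]))

rng = np.random.default_rng(0)
g = rng.normal(size=(3, 3))
g = g @ g.T + 3 * np.eye(3)          # arbitrary positive-definite metric g_{ab}
A = rng.normal(size=(3, 3))          # linear coordinate change x'^i = A^i_a x^a
if np.linalg.det(A) < 0:
    A[0] *= -1                       # keep the change orientation-preserving
Jinv = np.linalg.inv(A)              # Jinv[a, i] = dx^a / dx'^i

# Metric components in the primed frame
gp = np.einsum('ai,bj,ab->ij', Jinv, Jinv, g)

# epsilon_{ijk} = sqrt(|det g|) * eps-tilde, built independently in each frame
lc = np.sqrt(abs(np.linalg.det(g))) * eps
lcp = np.sqrt(abs(np.linalg.det(gp))) * eps

# lc transforms as an ordinary (weight-0) covariant rank-3 tensor
pullback = np.einsum('ai,bj,ck,abc->ijk', Jinv, Jinv, Jinv, lc)
assert np.allclose(lcp, pullback)
```

For an orientation-reversing change the two sides differ by a sign, which is the usual statement that $\epsilon_{ijk}$ is a pseudotensor.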
Both authors agree on this much. And so far none of this would have been changed by choosing the (-+++) metric as opposed to the (+---) metric**. However, the next step is finding the upper index object. Since $\epsilon_{ijk}$ is a tensor we can raise its indices:
$$\epsilon^{ijk} = g^{ii'}g^{jj'}g^{kk'}\epsilon_{i'j'k'} = g^{ii'}g^{jj'}g^{kk'}\tilde{\epsilon}_{i'j'k'}\sqrt{|\text{det }g|} \\
=\text{det}(g^{-1}) \sqrt{|\text{det }g|} \tilde{\epsilon}_{ijk} = \frac{\text{sgn}(g)}{\sqrt{|\text{det }g|}}\tilde{\epsilon}_{ijk}$$
At this point we have
$$\epsilon^{ijk} = \frac{\text{sgn}(g)}{\sqrt{|\text{det }g|}}\tilde{\epsilon}_{ijk}$$
This is where people make a convention choice. Carroll (and many others I have seen), for example, makes the choice $\tilde{\epsilon}^{ijk} = \tilde{\epsilon}_{ijk}$, so that we get
$$\epsilon^{ijk} = \frac{\text{sgn}(g)}{\sqrt{|\text{det }g|}}\tilde{\epsilon}^{ijk}$$
Pope takes the convention that $\tilde{\epsilon}^{ijk} = \text{sgn}(g)\tilde{\epsilon}_{ijk}$ so that
$$\epsilon^{ijk} = \frac{1}{\sqrt{|\text{det }g|}}\tilde{\epsilon}^{ijk}$$
So we've already made at least two convention choices. The first is how $\epsilon_{ijk}$ relates to $\tilde{\epsilon}_{ijk}$, and the second is how $\tilde{\epsilon}_{ijk}$ relates to $\tilde{\epsilon}^{ijk}$. I think we now have yet another choice: how to define the cross product. I think the responsible thing to do is to keep in mind all of the machinery up to this point and make a manifestly covariant definition of the cross product. This would look like:
$$(A\times B)^i = \epsilon^{i}{}_{jk}A^jB^k=g^{ii'}\epsilon_{i'jk}A^j B^k = \pm \epsilon_{ijk}A^j B^k=\pm\tilde{\epsilon}_{ijk}A^j B^k$$
where $\pm$ indicates the next convention choice: mostly positive or mostly negative metric signature. This has the disadvantage that the definition of the cross product picks up a relative minus sign depending on which signature you take, but I think this is actually to be expected, since the change moves the spatial part from a right-handed to a left-handed coordinate system. It also has the disadvantage that you have to remember quite a few places for negative signs. It has the advantage that it is covariant, so if you do follow the rules you can carry out manipulations more easily.
For example, @Solenodon Paradoxus gives a formula for the "correct expression" for the cross product, but I'm not sure on what grounds it is the "correct expression". It follows proper Einstein convention, but the symbol used is the Levi-Civita symbol, which is not a tensor, so there is no rule saying the expression SHOULD follow the Einstein convention. This leaves confusion as to the motivation behind any particular definition when there are so many choices.
*I believe this is a matter of convention. We could alternatively take the Levi-Civita symbol with upper indices to be the object which is anti-symmetric in its indices.
**but the story would maybe be different if we had chosen the upper indexed Levi-Civita symbol to be the usual anti-symmetric object instead.
edit1: for less awkward notation we could define $(A \times B)^i = \epsilon^{ijk}A_jB_k$, which is equivalent to the definition given. For the problem I'm working on, I'm thinking of contravariant components of vectors as the "physical" quantities, so I prefer to write expressions in terms of those.
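To illustrate the covariant definition, here is a minimal NumPy sketch in anisotropically rescaled coordinates $x'^i = c_i x^i$ (the vectors and scale factors below are arbitrary test data): computing $(A\times B)'^i = g^{il}\,\epsilon_{ljk}A'^jB'^k$ in the primed frame and transforming back reproduces the ordinary Euclidean cross product.

```python
import numpy as np

# Levi-Civita symbol
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

A = np.array([1.0, 2.0, -0.5])
B = np.array([0.3, -1.0, 2.0])

c = np.array([2.0, 3.0, 0.5])   # rescaled coordinates x'^i = c_i x^i (no sum)
Ap, Bp = c * A, c * B           # contravariant components in the primed frame
g = np.diag(1.0 / c**2)         # Euclidean metric expressed in primed coordinates
ginv = np.linalg.inv(g)

lc = np.sqrt(np.linalg.det(g)) * eps   # epsilon_{ijk} = sqrt(det g) * eps-tilde
# (A x B)'^i = g^{il} epsilon_{ljk} A'^j B'^k
cross_p = np.einsum('il,ljk,j,k->i', ginv, lc, Ap, Bp)

# Converting back to Cartesian components recovers np.cross
assert np.allclose(cross_p / c, np.cross(A, B))
```

Without the $\sqrt{\text{det }g}$ factor the components would come out wrong by exactly the density weight, which is the point of distinguishing $\epsilon$ from $\tilde{\epsilon}$.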
Best Answer
Don't use indices. Use a more sophisticated and efficient notation for tensor manipulations: Clifford algebra.
Let $\gamma^0, \gamma^1, \gamma^2, \gamma^3$ be basis vectors. Under the Clifford product operation, they obey the following:
$$\gamma^\mu \gamma^\nu = \begin{cases}\eta^{\mu \nu} & \mu = \nu \\ -\gamma^\nu \gamma^\mu & \mu \neq \nu\end{cases}$$
The product operation is also associative.
Write $E$ and $B$ in terms of $\gamma^1, \gamma^2, \gamma^3$ as you would write any vectors: $E = E_x \gamma^1 + \ldots$ for instance. Then $F$ can be written
$$F = \gamma^0 E - \gamma^1 \gamma^2 \gamma^3 B$$
Define $\epsilon = -\gamma^0 \gamma^1 \gamma^2 \gamma^3$. Then the tensor equation $\epsilon^{\alpha \beta \gamma \delta} F_{\alpha \beta} F_{\gamma \delta}$ can be written
$$\epsilon^{\alpha \beta \gamma \delta} F_{\alpha \beta} F_{\gamma \delta} = \langle \epsilon FF\rangle_0$$
The brackets denote that we're looking for the scalar part of this expression. This is equivalent to $\epsilon \langle FF \rangle_4 = \epsilon (F \wedge F)$ in the common shorthand.
Now we can use the associativity of the product to our advantage. For brevity, let $i = \gamma^1 \gamma^2 \gamma^3$, so that $\epsilon = -\gamma^0 i$, and see that
$$\langle \epsilon FF \rangle_0 = \langle \epsilon (\gamma^0 E - iB)(\gamma^0 E - iB) \rangle_0$$
Multiply out the terms:
$$\langle \epsilon FF \rangle_0 = \langle \epsilon (\gamma^0 E \gamma^0 E -\gamma^0 E iB - iB \gamma^0 E + iBiB )\rangle_0$$
Use the properties of the product to deduce that $E\gamma^0 = -\gamma^0 E$, $\gamma^0 B = -B \gamma^0$, $Ei = iE$, $iB = Bi$, $i\gamma^0 = -\gamma^0 i$, and $ii = 1$. The result reduces to
$$\langle \epsilon FF \rangle_0 = \langle \epsilon (-EE + \epsilon \, (EB + BE) + BB )\rangle_0$$
Finally, use $\epsilon \epsilon = -1$, $EE = -|E|^2$, $BB = -|B|^2$, and $EB + BE = -2 \, E \cdot B$ to get
$$\langle \epsilon FF \rangle_0 = \langle \epsilon (|E|^2 - |B|^2) + 2 E \cdot B \rangle_0$$
The only term that survives the angled brackets is the $2 E \cdot B$ term. Of course, staring us right in the face is the other well-known invariant, $E^2 - B^2$. That is one advantage of this method: both invariants can be derived from the same approach, at the same time. Moreover, we never have to break into components to evaluate the expression. We merely use the commutation properties of basis elements in the Clifford algebra, an algebra already widely used in quantum mechanics.
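The whole calculation can be cross-checked with an explicit matrix representation. A sketch, assuming the standard Dirac matrices in the Dirac basis (which satisfy the Clifford relations for signature (+,-,-,-)) and using the fact that in this representation the scalar (grade-0) part of an element is one quarter of its matrix trace; the vectors `E` and `B` are arbitrary test data:

```python
import numpy as np

# Pauli matrices
s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]], dtype=complex)]
I2, Z2 = np.eye(2), np.zeros((2, 2))

# Dirac matrices (Dirac basis), signature (+,-,-,-)
g0 = np.block([[I2, Z2], [Z2, -I2]])
gam = [g0] + [np.block([[Z2, sk], [-sk, Z2]]) for sk in s]

def scalar(M):
    """Grade-0 part: every non-scalar basis blade is traceless."""
    return (np.trace(M) / 4).real

rng = np.random.default_rng(0)
E, B = rng.normal(size=3), rng.normal(size=3)

Ev = sum(E[k] * gam[k + 1] for k in range(3))   # E = E_x g1 + E_y g2 + E_z g3
Bv = sum(B[k] * gam[k + 1] for k in range(3))
i3 = gam[1] @ gam[2] @ gam[3]                   # i = g1 g2 g3
F = g0 @ Ev - i3 @ Bv                           # F = g0 E - g1 g2 g3 B
epsilon = -(g0 @ i3)                            # epsilon = -g0 g1 g2 g3

# Both invariants fall out of the same product FF
assert np.isclose(scalar(epsilon @ F @ F), 2 * np.dot(E, B))
assert np.isclose(scalar(F @ F), np.dot(E, E) - np.dot(B, B))
```

The second assertion checks the other invariant mentioned above: $\langle FF \rangle_0 = |E|^2 - |B|^2$.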