This is a mathematical result. The Schmidt decomposition tells you that there are bases for two parties $A$ and $B$ such that
$$ \sum_{ij} \lambda_{ij} |i_{A}\rangle |j_{B}\rangle = \sum_k \nu_k |\tilde{k}_A\rangle |\tilde{k}_B\rangle $$
with some orthonormal bases $|i_A\rangle,|\tilde{i}_A\rangle, |i_B\rangle, |\tilde{i}_B\rangle$. If you compare the two sides and consider the fact that orthonormal bases are related by a unitary matrix, this will lead you to the singular value decomposition. This means that the Schmidt decomposition is a (rather trivial) corollary to the singular value decomposition.
Now, there is a mathematical result that tells you that this is not possible in higher dimensions (you can remedy this to some degree, as noted in the comments). Sadly, I don't know a nice and intuitive argument of why this is not the case (you could work with Lagrangian multiplies and see that it is not possible, though).
What are the physical consequences? Well, in a sense, this will have consequences almost everywhere where we use the Schmidt decomposition. One very striking example is pure state LOCC-interconvertibility. In other words: Let $|\psi\rangle$ and $|\phi\rangle$ be two bipartite pure states. Can we find a transformations with local operations and classical communication from $|\psi\rangle$ to $|\phi\rangle$? In the bipartite case, we can if and only if the Schmidt coefficients of $|\phi\rangle$ majorize those of $|\psi\rangle$. This was already proven in the last millenium (arXiv).
Having a perfunctory look at the proof, it seems to me that if we had a Schmidt decomposition for arbitrary multipartite systems, essentially the same proof should hold (feel free to confirm this suspicion). This would in particular imply that starting out from one state the "maximally entangled state", we could reach all others, which is known to be wrong for multipartite systems. In any case, if there are physical consequences, this is where I would expect them to be at the very least: State interconversion with LOCC.
Concretely, given an object/tensor with any number of indices, you can choose any bipartition of the indices and perform the corresponding SVD.
More precisely, this means to think of the object as an operator acting between two linear spaces, and perform its SVD as you would normally.
For example, your $A$ is an operator acting in a bipartite space, and therefore has $4$ indices. Write these as $A=\sum_{ijk\ell}A_{ijk\ell} |ij\rangle\!\langle k\ell|$. The "standard SVD" corresponds to thinking of this object as the linear operator
$$|k\ell\rangle\mapsto \sum_{ij} A_{ijk\ell} |ij\rangle,$$
and doing the SVD of the corresponding matrix. However, you could also consider the linear operator
$$|j\ell\rangle\mapsto \sum_{ik} A_{ijk\ell} |ik\rangle.$$
If you perform the SVD of the corresponding matrix (which is obtained from the other one transposing some layers), you get a decomposition of the form
$$A_{ijk\ell} = \sum_m s_m u^m_{ik} \,\bar v^m_{j\ell},$$
for some $s_m\ge0$ and collections of orthonormal vectors $\{u^m\}_m,\{v^m\}_m$. If you use this decomposition in the original formula for $A$, you get
$$A = \sum_{ijk\ell} \sum_m s_m u^m_{ik}\,\bar v^m_{j\ell} |ij\rangle\!\langle k\ell|
= \sum_m s_m B_m\otimes C_m,$$
where
$$B_m\equiv \sum_{ik} u^m_{ik} |i\rangle\!\langle k|,
\qquad
C_m\equiv \sum_{ik} \bar v^m_{j\ell} |j\rangle\!\langle \ell|.$$
Note that the orthonormality of the vectors $u^m,v^m$, given by the SVD, translate into the orthonormality of the operators $B_m,C_m$ with respect to the Hilbert-Schmidt inner product:
$$\operatorname{Tr}(B_m^\dagger B_n)=\operatorname{Tr}(C^\dagger_m C_n)=\delta_{mn}.$$
Now, if you want to do this operator decomposition in practice, given an operator expressed as a matrix, you need to "reshape" the matrix to obtain a four-index tensor object, then convert this into a new matrix, and compute the SVD of that. If you have access to Mathematica, here's a quick snippet to take a matrix and return the corresponding operator Schmidt decomposition:
reshapeAndComputeSVD[matrix_, dims_List] := ArrayReshape[matrix, dims] // Flatten[#,
{{1, 3}, {2, 4}}
] & // SingularValueDecomposition;
visualizeOperatorSchmidtDecomposition[matrix_, dims_List] := reshapeAndComputeSVD[matrix, dims] // Thread[
{Transpose @ #[[1]], Diagonal @ #[[2]], Transpose @ #[[3]]}
]& // Map[
#[[2]] * Inactive[CircleTimes][
ArrayReshape[#[[1]], dims[[{1, 3}]]] // MatrixForm,
ArrayReshape[#[[3]], dims[[{2, 4}]]] // MatrixForm
]
&] // Total;
and here's an example of what you get in a few simple cases:
Best Answer
Given $\lvert\psi\rangle = \sum_{ij} A_{ij} \lvert i,j\rangle$, you want to write $A$ in a singular value decomposition, $$ A_{ij} = \sum_k U_{ik}\sigma_k V^*_{jk}\ , $$ with isometries $U$ and $V$. Then, \begin{align} \lvert\psi\rangle &= \sum_{ij} A_{ij} \lvert i,j\rangle \\ & = \sum_{k} \sigma_k \left(\sum_ i U_{ik} \lvert i\rangle\right)\left(\sum_j V^*_{jk}\lvert j\rangle\right) \\ & = \sum_k \sigma_k \lvert\alpha_k\rangle\lvert \beta_k\rangle\ , \end{align} where $$\lvert \alpha_k\rangle := \sum_ i U_{ik} \lvert i\rangle$$ and $$\lvert \beta_k\rangle := \sum_ i V^*_{jk} \lvert j\rangle$$ are orthonormal bases, since $U$ and $V$ are isometries.
$\lvert\psi\rangle = \sum_k \sigma_k \lvert\alpha_k\rangle\lvert \beta_k\rangle$ is thus the Schmidt decomposition.
Of course, if $A$ is unitarily diagonalizable (i.e., it is a normal matrix), then the singular value decomposition coincides with the eigenvalue decomposition, and you can use the latter. But in general, the transformation in the eigenvalue decomposition is not unitary, and will thus not give rise to an orthonormal basis $\lvert\alpha_k\rangle$.