According to the Schmidt decomposition, any pure state of a composite system $AB$ can be written as $|\psi\rangle = \sum_i \lambda_i |i_A\rangle |i_B\rangle$, where the $\lambda_i$ are non-negative real numbers and $|i_A\rangle$ and $|i_B\rangle$ are orthonormal bases for systems $A$ and $B$ respectively. However, Exercise 2.77 of Nielsen and Chuang asks for an example of a pure state of a composite system $ABC$ that cannot be written as $|\psi\rangle = \sum_i \lambda_i |i_A\rangle |i_B\rangle |i_C\rangle$. If I am not wrong, $\frac{1}{\sqrt{2}}(|000\rangle+|011\rangle)$ is one such example. Is there any physical significance to the fact that the Schmidt decomposition holds only for two-component composite systems, or is it just a mathematical result? And is the absence of a Schmidt decomposition for composite systems with more components related to the concept of entanglement?
Quantum Mechanics – Why Schmidt Decomposition Holds Only for Two-Component Composite Systems
quantum-mechanics quantum-information
Related Solutions
Okay, let me elaborate on my comment to show how you would calculate the Schmidt decomposition in general. This might also answer your second question.
As I said in my comment, the Schmidt decomposition requires you to subdivide your system into two parts, A and B. It is then the decomposition $|\psi \rangle = \sum_\alpha \lambda_\alpha |\psi_{A,\alpha} \rangle \otimes |\psi_{B,\alpha} \rangle $, where the components define orthonormal bases in A and B. The Schmidt values $\lambda_\alpha$ are unique, and the vectors are unique up to phases as long as the values are non-degenerate. This decomposition writes the state with the fewest possible product terms, and the entanglement between the two subsystems is determined by the Schmidt values $\lambda_\alpha$. To calculate this decomposition, one rewrites the state as a matrix and then applies the singular value decomposition (SVD).
It is in fact simple to rewrite the state as a matrix: one treats the wavefunction indices belonging to subsystem A as the row indices, and the indices belonging to subsystem B as the column indices. For example, for your state $|\psi\rangle = \frac{1}{2}\left( |00 \rangle +|01\rangle + |10 \rangle + |11\rangle \right)$, we can write $|\psi\rangle = \sum_{ij} A_{ij} |ij\rangle$ with $A_{00} = A_{01} = A_{10} = A_{11} = \frac{1}{2}$. We can consider this to define a matrix $$A = \frac{1}{2}\left( \begin{array}{cc} 1 & 1 \\ 1 & 1 \end{array} \right).$$ If you look up the definition of the SVD, it is not hard to see that it gives exactly what we want for the Schmidt decomposition. In this case the SVD gives us $$A = \frac{1}{\sqrt{2}}\left( \begin{array}{cc} 1 & 1 \\ 1 & -1 \end{array} \right) \; \left( \begin{array}{cc} 1 & 0 \\ 0& 0 \end{array} \right) \; \frac{1}{\sqrt{2}}\left( \begin{array}{cc} 1 & 1 \\ 1 & -1 \end{array} \right)$$ This yields exactly the $\lambda_\alpha$ and $|\psi_{A,\alpha}\rangle$ that I wrote down in my comment to your original post.
More generally, if our state is $|\psi\rangle = \sum_{ij} A_{ij} |i_A\rangle \otimes |j_B\rangle$, then the Schmidt decomposition is given by the SVD as $$A = \left( \begin{array}{cccc} |\psi_{A,0}\rangle & \cdots & |\psi_{A,\alpha} \rangle & \cdots\end{array} \right) \; \left( \begin{array}{cccc} \lambda_0 & 0 & 0 & 0 \\ 0& \ddots & 0 & 0 \\ 0 & 0 & \lambda_\alpha & 0 \\ 0 & 0 & 0 & \ddots \end{array} \right) \; \left( \begin{array}{c} |\psi_{B,0}\rangle \\ \vdots \\ |\psi_{B,\alpha} \rangle \\ \vdots\end{array} \right) $$ Here the vectors $|\psi_{A,\alpha}\rangle$ are written as columns of the first factor and the vectors $|\psi_{B,\alpha}\rangle$ as rows of the last factor.
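The recipe above can be sketched numerically. This is a minimal illustration, assuming NumPy is available; the function name `schmidt_decomposition` and its arguments are hypothetical choices for this example, not part of any library.

```python
import numpy as np

def schmidt_decomposition(state, dim_a, dim_b):
    """Return (lambdas, a_vecs, b_vecs) such that
    state = sum_k lambdas[k] * kron(a_vecs[k], b_vecs[k])."""
    # Reshape the amplitudes so that A indices label rows and B indices label columns.
    coeffs = np.asarray(state, dtype=complex).reshape(dim_a, dim_b)
    # Reduced SVD: coeffs = u @ diag(s) @ vh
    u, s, vh = np.linalg.svd(coeffs, full_matrices=False)
    a_vecs = [u[:, k] for k in range(len(s))]   # columns of u: vectors on A
    b_vecs = [vh[k, :] for k in range(len(s))]  # rows of vh: vectors on B
    return s, a_vecs, b_vecs

# The example state (|00> + |01> + |10> + |11>) / 2
psi = np.array([1, 1, 1, 1]) / 2
lambdas, a_vecs, b_vecs = schmidt_decomposition(psi, 2, 2)

# Rebuild the state from its Schmidt form as a consistency check.
rebuilt = sum(l * np.kron(a, b) for l, a, b in zip(lambdas, a_vecs, b_vecs))
print(np.round(lambdas, 12))      # Schmidt values, one of which vanishes here
print(np.allclose(rebuilt, psi))  # the Schmidt form reproduces the state
```

Only one Schmidt value is non-zero for this state, which reflects the fact that it is a product state. The signs of the individual vectors returned by the SVD routine may differ from the matrices written above, since they are fixed only up to a phase.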
The most general measurement which includes a post-measurement state is given by $\rho\mapsto M_i\rho M_i^\dagger$, where in addition, you are allowed to "forget" part of the measurement outcome. That is, the general post-measurement states will be of the form $$ \rho_J = \sum_{i\in J} M_i\rho M_i^\dagger\ , $$ where the $J$ denote disjoint subsets of the index set, with outcome probability $p_J = \mathrm{tr}[\rho_J]$. Your measurement is of that kind.
In some sense, however, this type of measurement is strictly "weaker" than measuring all the $M_i$ individually, in that the former can be classically recovered after performing the latter measurement. Nevertheless, it should indeed be included when describing the most general formalism.
Note that the "bare" POVM measurement formalism does not have this shortcoming. It assigns probabilities $p_i=\mathrm{tr}[\rho F_i]$ to a state $\rho$ for a measurement described by a positive operator-valued measure (POVM) $F_i$, $\sum F_i=I$, but does not talk of post-measurement states (which often are, in fact, unphysical). In that formalism we can simply define $F_J = \sum_{i\in J} M_i^\dagger M_i$.
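As a quick consistency check, using only the linearity and cyclicity of the trace, the probability of the coarse-grained outcome $J$ is the same in both descriptions:

$$ p_J = \mathrm{tr}[\rho_J] = \sum_{i\in J} \mathrm{tr}\left[M_i\rho M_i^\dagger\right] = \mathrm{tr}\Big[\rho \sum_{i\in J} M_i^\dagger M_i\Big] = \mathrm{tr}[\rho F_J]. $$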
Best Answer
This is a mathematical result. The Schmidt decomposition tells you that there are bases for two parties $A$ and $B$ such that
$$ \sum_{ij} \lambda_{ij} |i_{A}\rangle |j_{B}\rangle = \sum_k \nu_k |\tilde{k}_A\rangle |\tilde{k}_B\rangle $$
with some orthonormal bases $|i_A\rangle,|\tilde{i}_A\rangle, |i_B\rangle, |\tilde{i}_B\rangle$. If you compare the two sides and consider the fact that orthonormal bases are related by a unitary matrix, this will lead you to the singular value decomposition. This means that the Schmidt decomposition is a (rather trivial) corollary to the singular value decomposition.
Now, there is a mathematical result that tells you that this is not possible for tensors of higher order, i.e. for more than two parties (you can remedy this to some degree, as noted in the comments). Sadly, I don't know a nice and intuitive argument for why it fails in general (you could work with Lagrange multipliers and see that it is not possible, though).
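For the specific state in the question, at least, the failure can be checked directly. The following is a sketch of a necessary condition, not a full characterization. Suppose a three-party state had the form $|\psi\rangle = \sum_i \lambda_i |i_A\rangle |i_B\rangle |i_C\rangle$ with orthonormal sets on each party. Tracing out any two parties then gives

$$ \rho_A = \sum_i \lambda_i^2 |i_A\rangle\langle i_A|, \qquad \rho_B = \sum_i \lambda_i^2 |i_B\rangle\langle i_B|, \qquad \rho_C = \sum_i \lambda_i^2 |i_C\rangle\langle i_C|, $$

so all three reduced states would have the same non-zero eigenvalues $\lambda_i^2$. For $|\psi\rangle = \frac{1}{\sqrt{2}}(|000\rangle+|011\rangle) = |0\rangle_A \otimes \frac{1}{\sqrt{2}}(|00\rangle+|11\rangle)_{BC}$ one finds instead

$$ \rho_A = |0\rangle\langle 0| \quad (\text{eigenvalues } 1, 0), \qquad \rho_B = \rho_C = \tfrac{1}{2} I \quad (\text{eigenvalues } \tfrac{1}{2}, \tfrac{1}{2}), $$

so the spectra differ and no such three-party decomposition can exist for this state.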
What are the physical consequences? Well, in a sense, this will have consequences almost everywhere we use the Schmidt decomposition. One very striking example is pure-state LOCC interconvertibility. In other words: let $|\psi\rangle$ and $|\phi\rangle$ be two bipartite pure states. Can we find a transformation from $|\psi\rangle$ to $|\phi\rangle$ using only local operations and classical communication? In the bipartite case, we can if and only if the squared Schmidt coefficients of $|\phi\rangle$ majorize those of $|\psi\rangle$. This was already proven in the last millennium (arXiv).
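The majorization condition is easy to test once the Schmidt coefficients are known. Below is a minimal sketch in plain Python, assuming both states are normalized; the helper names `majorizes` and `locc_convertible` are hypothetical and chosen only for this illustration.

```python
from itertools import accumulate

def majorizes(p, q, tol=1e-12):
    """Return True if probability vector p majorizes q.

    Both vectors are assumed to sum to 1; the shorter one is padded with zeros.
    """
    n = max(len(p), len(q))
    p_sorted = sorted(p, reverse=True) + [0.0] * (n - len(p))
    q_sorted = sorted(q, reverse=True) + [0.0] * (n - len(q))
    # Every partial sum of the sorted p must be at least that of the sorted q.
    return all(sp >= sq - tol
               for sp, sq in zip(accumulate(p_sorted), accumulate(q_sorted)))

def locc_convertible(schmidt_psi, schmidt_phi):
    """Criterion for |psi> -> |phi> by LOCC (bipartite pure states):
    the squared Schmidt coefficients of phi must majorize those of psi."""
    return majorizes([x ** 2 for x in schmidt_phi],
                     [x ** 2 for x in schmidt_psi])

bell = [2 ** -0.5, 2 ** -0.5]   # Schmidt coefficients of a Bell state
product = [1.0, 0.0]            # Schmidt coefficients of a product state

print(locc_convertible(bell, product))  # True
print(locc_convertible(product, bell))  # False
```

The two printed results match the expectation that a Bell state can be turned into a product state by LOCC, while entanglement cannot be created from a product state.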
Having had a perfunctory look at the proof, it seems to me that if we had a Schmidt decomposition for arbitrary multipartite systems, essentially the same proof should go through (feel free to confirm this suspicion). In particular, this would imply that, starting from a single state (the "maximally entangled state"), we could reach all others, which is known to be false for multipartite systems. In any case, if there are physical consequences, this is where I would expect them at the very least: state interconversion with LOCC.