You can prove it by explicitly calculating the conditional density by brute force, as in Procrastinator's link (+1) in the comments. But there's also a theorem that says all conditional distributions of a multivariate normal distribution are normal, so all that's left is to calculate the mean vector and covariance matrix. I remember we derived this in a time series class in college by cleverly defining a third variable and using its properties to get the result more simply than the brute-force solution in the link (as long as you're comfortable with matrix algebra). I'm going from memory, but it was something like this:
Let ${\bf x}_{1}$ be the first partition and ${\bf x}_2$ the second. Now define ${\bf z} = {\bf x}_1 + {\bf A} {\bf x}_2 $ where ${\bf A} = -\Sigma_{12} \Sigma^{-1}_{22}$. Now we can write
\begin{align*} {\rm cov}({\bf z}, {\bf x}_2) &= {\rm cov}( {\bf x}_{1}, {\bf x}_2 ) +
{\rm cov}({\bf A}{\bf x}_2, {\bf x}_2) \\
&= \Sigma_{12} + {\bf A} {\rm var}({\bf x}_2) \\
&= \Sigma_{12} - \Sigma_{12} \Sigma^{-1}_{22} \Sigma_{22} \\
&= 0
\end{align*}
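(As a quick sanity check, here is a minimal numpy sketch, with an illustrative $\Sigma$ and partition not taken from the question, showing that this choice of ${\bf A}$ does make the covariance vanish, both algebraically and in simulation.)

```python
import numpy as np

# Sanity check that A = -Sigma_12 Sigma_22^{-1} makes cov(z, x2) = 0
# (illustrative Sigma; x1 = first component, x2 = last two).
rng = np.random.default_rng(0)
mu = np.array([1.0, -0.5, 2.0])
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.5, 0.4],
                  [0.3, 0.4, 1.0]])
p = 1
S12, S22 = Sigma[:p, p:], Sigma[p:, p:]
A = -S12 @ np.linalg.inv(S22)

print(S12 + A @ S22)                         # exactly zero (up to rounding)

X = rng.multivariate_normal(mu, Sigma, size=200_000)
x1, x2 = X[:, :p], X[:, p:]
z = x1 + x2 @ A.T                            # z = x1 + A x2, applied row-wise
print(np.cov(np.hstack([z, x2]).T)[:p, p:])  # Monte Carlo estimate, ~ 0
```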
Therefore ${\bf z}$ and ${\bf x}_2$ are uncorrelated and, since they are jointly normal, they are independent. Now, clearly $E({\bf z}) = {\boldsymbol \mu}_1 + {\bf A} {\boldsymbol \mu}_2$, therefore it follows that
\begin{align*}
E({\bf x}_1 | {\bf x}_2) &= E( {\bf z} - {\bf A} {\bf x}_2 | {\bf x}_2) \\
& = E({\bf z}|{\bf x}_2) - E({\bf A}{\bf x}_2|{\bf x}_2) \\
& = E({\bf z}) - {\bf A}{\bf x}_2 \\
& = {\boldsymbol \mu}_1 + {\bf A} ({\boldsymbol \mu}_2 - {\bf x}_2) \\
& = {\boldsymbol \mu}_1 + \Sigma_{12} \Sigma^{-1}_{22} ({\bf x}_2- {\boldsymbol \mu}_2)
\end{align*}
which proves the first part. For the covariance matrix, note that
\begin{align*}
{\rm var}({\bf x}_1|{\bf x}_2) &= {\rm var}({\bf z} - {\bf A} {\bf x}_2 | {\bf x}_2) \\
&= {\rm var}({\bf z}|{\bf x}_2) + {\bf A}\,{\rm var}({\bf x}_2 | {\bf x}_2)\,{\bf A}' - {\bf A}\,{\rm cov}({\bf x}_2, {\bf z}|{\bf x}_2) - {\rm cov}({\bf z}, {\bf x}_2|{\bf x}_2)\,{\bf A}' \\
&= {\rm var}({\bf z}|{\bf x}_2) \\
&= {\rm var}({\bf z})
\end{align*}
since, conditional on ${\bf x}_2$, the term ${\bf A}{\bf x}_2$ is a constant (so its variance and covariance terms vanish), and ${\rm var}({\bf z}|{\bf x}_2) = {\rm var}({\bf z})$ because ${\bf z}$ is independent of ${\bf x}_2$.
Now we're almost done:
\begin{align*}
{\rm var}({\bf x}_1|{\bf x}_2) = {\rm var}( {\bf z} ) &= {\rm var}( {\bf x}_1 + {\bf A} {\bf x}_2 ) \\
&= {\rm var}( {\bf x}_1 ) + {\bf A} {\rm var}( {\bf x}_2 ) {\bf A}'
+ {\bf A} {\rm cov}({\bf x}_2,{\bf x}_1) + {\rm cov}({\bf x}_1,{\bf x}_2) {\bf A}' \\
&= \Sigma_{11} +\Sigma_{12} \Sigma^{-1}_{22} \Sigma_{22}\Sigma^{-1}_{22}\Sigma_{21}
- 2 \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \\
&= \Sigma_{11} +\Sigma_{12} \Sigma^{-1}_{22}\Sigma_{21}
- 2 \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \\
&= \Sigma_{11} -\Sigma_{12} \Sigma^{-1}_{22}\Sigma_{21}
\end{align*}
which proves the second part.
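Both formulas are easy to sanity-check numerically. Here is a minimal Monte Carlo sketch (the partition, ${\boldsymbol\mu}$, and $\Sigma$ below are illustrative, not from the question): regressing ${\bf x}_1$ on ${\bf x}_2$ by least squares should recover the slope $\Sigma_{12}\Sigma_{22}^{-1}$ and intercept ${\boldsymbol\mu}_1 - \Sigma_{12}\Sigma_{22}^{-1}{\boldsymbol\mu}_2$, and the covariance of the residuals should recover $\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$.

```python
import numpy as np

# Monte Carlo check of both conditional-distribution formulas
# (illustrative mu and Sigma; x1 = first component, x2 = last two).
rng = np.random.default_rng(1)
mu = np.array([1.0, -0.5, 2.0])
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.5, 0.4],
                  [0.3, 0.4, 1.0]])
p = 1
S11, S12 = Sigma[:p, :p], Sigma[:p, p:]
S21, S22 = Sigma[p:, :p], Sigma[p:, p:]
B = S12 @ np.linalg.inv(S22)                 # Sigma_12 Sigma_22^{-1}

X = rng.multivariate_normal(mu, Sigma, size=500_000)
x1, x2 = X[:, :p], X[:, p:]

# Conditional mean: least-squares regression of x1 on x2.
design = np.hstack([np.ones((len(X), 1)), x2])
coef, *_ = np.linalg.lstsq(design, x1, rcond=None)
print("intercept  theory / empirical:", mu[:p] - B @ mu[p:], coef[0])
print("slope      theory / empirical:", B.ravel(), coef[1:].ravel())

# Conditional covariance: covariance of the residuals x1 - E(x1 | x2).
cond_mean = mu[:p] + (x2 - mu[p:]) @ B.T
resid = x1 - cond_mean
print("cond. var  theory / empirical:",
      (S11 - B @ S21).ravel(), np.atleast_2d(np.cov(resid.T)).ravel())
```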
Note: For those not very familiar with the matrix algebra used here, the Matrix Cookbook is an excellent resource.
Edit: One property used here that is not in the Matrix Cookbook (good catch @FlyingPig) is property 6 on the Wikipedia page about covariance matrices: for two random vectors $\bf x, y$, $${\rm var}({\bf x}+{\bf y}) = {\rm var}({\bf x})+{\rm var}({\bf y}) + {\rm cov}({\bf x},{\bf y}) + {\rm cov}({\bf y},{\bf x})$$ For scalars, of course, ${\rm cov}(X,Y)={\rm cov}(Y,X)$, but for vectors the two cross-covariance matrices are transposes of each other, ${\rm cov}({\bf x},{\bf y})={\rm cov}({\bf y},{\bf x})'$, so they cannot simply be merged into one term.
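As a small numerical illustration of this identity (purely simulated data, not tied to anything above), sample covariance matrices satisfy it exactly, up to floating-point error:

```python
import numpy as np

# Numerical illustration of var(x + y) = var(x) + var(y) + cov(x, y) + cov(y, x)
# for random vectors (simulated data; cov(x, y) and cov(y, x) are transposes).
rng = np.random.default_rng(3)
x = rng.standard_normal((50_000, 2)) @ np.array([[1.0, 0.3],
                                                 [0.0, 0.8]])
y = 0.5 * x + rng.standard_normal((50_000, 2))

C = np.cov(np.hstack([x, y]).T)              # joint 4x4 sample covariance
var_x, var_y = C[:2, :2], C[2:, 2:]
cov_xy, cov_yx = C[:2, 2:], C[2:, :2]

lhs = np.cov((x + y).T)
rhs = var_x + var_y + cov_xy + cov_yx
print(np.max(np.abs(lhs - rhs)))             # ~ 0 up to floating-point error
```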
Best Answer
I don't think there is a positive answer to your question. The beauty of the normal distribution, univariate or multivariate, is that it is easily characterized by all cumulants of order higher than two being zero. (The cumulant of order $k$ is the normalized $k$-th derivative of the log characteristic function: $\kappa_k = i^{-k} \frac{\partial^k}{\partial t^k} \ln \phi (t) |_{t=0}$.) The CLT states, essentially, that for all $k \ge 3$, $\kappa_k = o(1)$ as $n\to\infty$ (and the rate can be established).
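To make the cumulant characterization concrete, here is a minimal sketch (simulated data; `scipy.stats.kstat` computes unbiased estimators of the first four cumulants): for a normal sample, the third and fourth cumulant estimates are near zero, while the first two recover the mean and variance.

```python
import numpy as np
from scipy import stats

# k-statistics (unbiased cumulant estimators) for a N(2, 3^2) sample:
# orders 1 and 2 recover the mean (2) and variance (9); orders 3 and 4
# come out near zero, up to Monte Carlo error.
rng = np.random.default_rng(4)
x = rng.normal(loc=2.0, scale=3.0, size=1_000_000)

for k in range(1, 5):
    print(k, stats.kstat(x, k))
```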
However, the property of zero cumulants is very fragile. Once you depart from zero third cumulant, all higher order cumulants have to be non-zero, as well: there is no distribution for which $\kappa_4=0$ if $\kappa_3\neq 0$. So for each value of skewness (which apparently is reflected in the third cumulant), there's a range of reasonable values for other cumulants, and the beauty of the distribution will be in the eyes of the beholder.
In a way, the "closest relative" of the multivariate normal distribution is the skew-normal distribution. Its density is the normal density "filtered" by a normal cdf: $f(x;\Sigma,\alpha) = 2f(x;\Sigma)\Phi(\alpha'x)$, where $f(x;\Sigma)$ is the density of a multivariate normal with zero mean vector and covariance matrix $\Sigma$, and $\Phi(z)$ is the standard normal cdf. So you don't really have a tensor here, but only a skewing vector $\alpha$. Nicely enough, for $\alpha=0$ you get back the multivariate normal distribution as a special case.
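A minimal sketch of this density (illustrative $\Sigma$ and $\alpha$; `skew_normal_pdf` is just a throwaway helper name), showing that $\alpha=0$ gives back the ordinary multivariate normal density:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

# Skew-normal density f(x) = 2 * N(x; 0, Sigma) * Phi(alpha' x),
# evaluated at a single point x (illustrative Sigma and alpha).
Sigma = np.array([[1.0, 0.4],
                  [0.4, 2.0]])

def skew_normal_pdf(x, Sigma, alpha):
    return 2.0 * multivariate_normal.pdf(x, mean=np.zeros(len(x)), cov=Sigma) \
               * norm.cdf(alpha @ x)

x = np.array([0.3, -1.2])
print(skew_normal_pdf(x, Sigma, alpha=np.zeros(2)))              # alpha = 0 ...
print(multivariate_normal.pdf(x, mean=np.zeros(2), cov=Sigma))   # ... equals this
print(skew_normal_pdf(x, Sigma, alpha=np.array([2.0, -1.0])))    # skewed case
```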
Another way to approach the issue is from the point of view of stable laws. A stable law is a distribution such that a linear combination of independent random variables having this distribution again has this distribution, possibly scaled and shifted. The normal distribution is an obvious example; the Cauchy distribution is another, although a far less obvious one. Stable laws are defined implicitly by their characteristic function: $\phi(t) = \exp\!\bigl[ i t \mu - |ct|^\alpha \bigl(1-i \beta \, {\rm sign} (t)\, \Phi\bigr)\bigr].$ Here, $\mu$ is the shift parameter, $c$ is the scale parameter, $\alpha$ is the stability parameter, $\beta$ is the asymmetry parameter, and $\Phi=\tan(\pi\alpha/2)$ for $\alpha\neq1$, while $\Phi=-\frac{2}{\pi} \ln|t|$ for $\alpha=1$. ($\alpha=2, \beta=0$ gives the normal distribution; $\alpha=1, \beta=0$ gives the Cauchy distribution; moments beyond the first exist only for $\alpha=2$.) Wikipedia provides a bunch of pictures. Multivariate extensions of stable laws exist, too (thanks to @mpiktas for pointing them out!).
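For the two familiar special cases, here is a minimal sketch using `scipy.stats.levy_stable`; the scaling conventions below are my reading of scipy's parametrization and worth double-checking.

```python
import numpy as np
from scipy.stats import levy_stable, norm, cauchy

# Under scipy's parametrization, alpha=2, beta=0 with unit scale should match
# a normal with standard deviation sqrt(2), and alpha=1, beta=0 a standard
# Cauchy (assumption about the scaling convention; compare the printed rows).
x = np.linspace(-4.0, 4.0, 9)

print(levy_stable.pdf(x, alpha=2.0, beta=0.0))
print(norm.pdf(x, scale=np.sqrt(2.0)))

print(levy_stable.pdf(x, alpha=1.0, beta=0.0))
print(cauchy.pdf(x))
```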