It is not entirely clear to me from the comments whether the OP has solved this, but since there is no answer yet I will write one.
The distribution of each $Y_i$ will be normal. Since the $X_i$ are independent, the means and variances simply add:
$Y_0$ has mean $\mu_0+\mu_1$ and variance $\sigma_0^2+\sigma^2_1$, and
$Y_1$ has mean $\mu_1+\mu_2$ and variance $\sigma_1^2+\sigma^2_2$. Finally we need to
determine the correlation between $Y_0$ and $Y_1$. To do this we can calculate
$$\mathbb{C}ov(Y_0,Y_1)=\mathbb{C}ov(X_0+X_1,X_1+X_2)
=\mathbb{C}ov(X_0,X_1)+\mathbb{C}ov(X_0,X_2)+\mathbb{C}ov(X_1,X_1)+\mathbb{C}ov(X_1,X_2)
=\mathbb{V}ar(X_1)
=\sigma_1^2,
$$
where the three cross terms vanish because the $X_i$ are independent.
Now you can turn this into a correlation by dividing by the product of the standard deviations of $Y_0$ and $Y_1$:
$$\rho = \frac{\sigma_1^2}{\sqrt{(\sigma_0^2+\sigma^2_1)(\sigma_1^2+\sigma^2_2)} }.$$
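As a quick sanity check of the covariance and correlation above, here is a minimal Monte Carlo sketch; the numerical values chosen for the $\mu_i$ and $\sigma_i$ are made up purely for illustration.

```python
# Monte Carlo check of Cov(Y0, Y1) = sigma_1^2 and of rho.
# The values of mu_i and sigma_i below are arbitrary illustration values.
import numpy as np

rng = np.random.default_rng(0)
mu = [1.0, -0.5, 2.0]   # hypothetical mu_0, mu_1, mu_2
sd = [1.0, 2.0, 0.5]    # hypothetical sigma_0, sigma_1, sigma_2

n = 1_000_000
x0, x1, x2 = (rng.normal(m, s, n) for m, s in zip(mu, sd))
y0, y1 = x0 + x1, x1 + x2

rho_theory = sd[1]**2 / np.sqrt((sd[0]**2 + sd[1]**2) * (sd[1]**2 + sd[2]**2))
print(np.cov(y0, y1)[0, 1])       # should be close to sigma_1^2 = 4
print(np.corrcoef(y0, y1)[0, 1])  # empirical correlation
print(rho_theory)                 # theoretical rho from the formula above
```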
Since $(X_0,X_1,X_2)$ is jointly normal (independent normals), any linear transformation of it is jointly normal as well. Hence $(Y_0,Y_1)$ is bivariate normal with the means and variances stated above and correlation $\rho$, and the joint density of $Y_0, Y_1$ is
$$ f(y_0,y_1) = N\left(\vec{\mu} = \begin{bmatrix}
\mu_0+\mu_1 \\
\mu_1+\mu_2 \\
\end{bmatrix}, \Sigma = \begin{bmatrix}
\sigma^2_0+\sigma^2_1 &\sigma_1^2 \\
\sigma_1^2 & \sigma^2_1+\sigma^2_2 \\
\end{bmatrix} \right).
$$
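For completeness, here is a short sketch (again with made-up parameter values) that builds this bivariate normal with `scipy.stats.multivariate_normal` and evaluates the joint density at a point:

```python
# Construct the joint density of (Y0, Y1) from the mean vector and
# covariance matrix above. The parameter values are arbitrary.
import numpy as np
from scipy.stats import multivariate_normal

mu = [1.0, -0.5, 2.0]     # mu_0, mu_1, mu_2
var = [1.0, 4.0, 0.25]    # sigma_0^2, sigma_1^2, sigma_2^2

mean = np.array([mu[0] + mu[1], mu[1] + mu[2]])
cov = np.array([[var[0] + var[1], var[1]],
                [var[1],          var[1] + var[2]]])

joint = multivariate_normal(mean=mean, cov=cov)
print(joint.pdf([0.5, 1.5]))  # f(y0, y1) at an arbitrary point
```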
Define a random variable $C\in\{1,2\}$ with prior distribution $\mu_C$ given by
$$
\mu_C(A) = P\{C\in A\} = \frac{1}{2} I_A(1) + \frac{1}{2} I_A(2) \, ,
$$
where $A$ is any subset of $\{1,2\}$.
Use the notation $X=(X_1,X_2)$ and $x=(x_1,x_2)$. Suppose that
$$X\mid C=1\sim N(\mu_1,\Sigma_1)\, ,$$
$$X\mid C=2\sim N(\mu_2,\Sigma_2)\, ,$$
where $\mu_1=(2, 2)^\top$, $\Sigma_1=\textrm{diag}(2,1)$, $\mu_2=(2,4)^\top$ and $\Sigma_2=\textrm{diag}(4,2)$.
Now, study the multivariate normal density at
http://en.wikipedia.org/wiki/Multivariate_normal_distribution
to see that
$$
f_{X\mid C}(x\mid 1) = \frac{1}{2\pi\sqrt{2}} \exp\left(-\frac{1}{2}\left(\frac{(x_1-2)^2}{2} + \frac{(x_2-2)^2}{1} \right)\right) \, ,
$$
$$
f_{X\mid C}(x\mid 2) = \frac{1}{4\pi\sqrt{2}} \exp\left(-\frac{1}{2}\left(\frac{(x_1-2)^2}{4} + \frac{(x_2-4)^2}{2} \right)\right) \, .
$$
Using Bayes' theorem, we have
$$
P\{C=1\mid X=x\} = \frac{\int_{\{1\}} f_{X\mid C}(x\mid c) \,d\mu_C(c)}{\int_{\{1,2\}} f_{X\mid C}(x\mid c)\, d\mu_C(c)} = \frac{\frac{1}{2} f_{X\mid C}(x\mid 1)}{\frac{1}{2} f_{X\mid C}(x\mid 1) + \frac{1}{2} f_{X\mid C}(x\mid 2)} \, .
$$
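If you want to evaluate this posterior numerically, here is a small sketch using `scipy.stats.multivariate_normal` for the two class-conditional densities above, with the equal priors of $1/2$ assumed here:

```python
# Posterior P(C=1 | X=x) under equal priors, using the two class densities above.
import numpy as np
from scipy.stats import multivariate_normal

f1 = multivariate_normal(mean=[2, 2], cov=np.diag([2.0, 1.0]))  # X | C=1
f2 = multivariate_normal(mean=[2, 4], cov=np.diag([4.0, 2.0]))  # X | C=2

def posterior_c1(x):
    """P(C=1 | X=x) with prior 1/2 on each class."""
    p1, p2 = 0.5 * f1.pdf(x), 0.5 * f2.pdf(x)
    return p1 / (p1 + p2)

print(posterior_c1([2, 2]))  # near the class-1 mean: well above 1/2
print(posterior_c1([2, 4]))  # near the class-2 mean: below 1/2
```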
The idea is to decide for class $1$ if
$$
P\{C=1\mid X=x\} = \frac{1}{1+\frac{f_{X\mid C}(x\mid 2)}{f_{X\mid C}(x\mid 1)}} > \frac{1}{2} \, ,
$$
which is equivalent to
$$
\frac{f_{X\mid C}(x\mid 2)}{f_{X\mid C}(x\mid 1)} < 1 \, ,
$$
or
$$
\log f_{X\mid C}(x\mid 2) - \log f_{X\mid C}(x\mid 1) < 0 \, ,
$$
which gives us
$$
\log \frac{1}{2} - \frac{(x_1-2)^2}{8} - \frac{(x_2-4)^2}{4} + \frac{(x_1-2)^2}{4} + \frac{(x_2-2)^2}{2} < 0 \, . \qquad (*)
$$
Completing the square in $x_2$ (the linear terms in $x_2$ cancel), $(*)$ is equivalent to $\frac{(x_1-2)^2}{8} + \frac{x_2^2}{4} < 2+\log 2$. Therefore, you decide that the point $x$ belongs to class $1$ if it is inside the ellipse defined by
$$
\frac{(x_1-2)^2}{8(2+\log 2)} + \frac{x_2^2}{4(2+\log 2)} = 1 \, ,
$$
otherwise, you decide for class $2$.
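As a sanity check, the following sketch compares the ellipse rule with a direct comparison of the two log densities on a few arbitrary test points; the two rules should always agree.

```python
# Check that the ellipse decision rule matches comparing log densities directly.
import numpy as np
from scipy.stats import multivariate_normal

f1 = multivariate_normal(mean=[2, 2], cov=np.diag([2.0, 1.0]))  # class 1
f2 = multivariate_normal(mean=[2, 4], cov=np.diag([4.0, 2.0]))  # class 2

def class_by_densities(x):
    return 1 if f2.logpdf(x) - f1.logpdf(x) < 0 else 2

def class_by_ellipse(x):
    x1, x2 = x
    c = 2 + np.log(2)
    return 1 if (x1 - 2)**2 / (8 * c) + x2**2 / (4 * c) < 1 else 2

for x in [(2, 2), (2, 4), (0, 0), (5, 3), (-2, 1)]:
    print(x, class_by_densities(x), class_by_ellipse(x))
```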
Best Answer
You can't compute a joint distribution from its marginals alone. Check this thread for a much simpler case of computing a joint probability from individual probabilities.
In the case of normal distributions, as in your question, imagine that you have two marginal distributions, each normal. Say you are in the lucky situation where you know in advance that their joint distribution is bivariate normal, you know the means and variances, and the only thing unknown to you is the correlation parameter $\rho$. Unfortunately, if you know only the marginals then you can't say anything about $\rho$: there can be any correlation between the two variables, so there are infinitely many bivariate normal distributions that lead to such marginals. Moreover, it can be even more extreme: the variables can be marginally normal, but jointly not normal at all.
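To make the last point concrete, here is a small simulation sketch: all three constructions below have standard normal marginals, yet their joint behaviour is completely different, and the last one is not even jointly normal.

```python
# Three pairs (x, y) with identical N(0,1) marginals but different joint distributions.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)

y_indep = rng.normal(size=n)          # independent of x    -> rho = 0
y_equal = x.copy()                    # perfectly dependent -> rho = 1
y_sign = x * rng.choice([-1, 1], n)   # marginally N(0,1), but (x, y) is not bivariate normal

for y in (y_indep, y_equal, y_sign):
    print(round(np.mean(y), 3), round(np.std(y), 3), round(np.corrcoef(x, y)[0, 1], 3))
```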