OK, what you need to do is compute the joint posterior up to a constant, i.e. $f(y_1,\ldots,y_n|\mu,\tau)p(\mu,\tau)$. Then, to compute the conditional posterior $\pi(\mu|\mathbf{y},\tau)$, you treat the $\tau$ terms as fixed and known, so that some of them can be absorbed into the constant of proportionality. Then you do the same thing with $\mu$ to get $\pi(\tau|\mathbf{y},\mu)$.
So what I mean is, the joint posterior is given by:
\begin{equation}
\pi(\mu,\tau|\mathbf{y}) \propto \tau^{\frac{n}{2}+\alpha_0-1} \exp\{-\frac{\tau}{2} \sum_{i=1}^{n}(y_i-\mu)^2-\frac{\tau_0}{2}(\mu-\mu_0)^2-\beta_0\tau\}.
\end{equation}
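To make the algebra concrete, here is a minimal sketch of this unnormalized log joint posterior in Python (the data `y` and the hyperparameters `mu0`, `tau0`, `a0`, `b0` are all invented for illustration):

```python
import numpy as np

def log_joint(mu, tau, y, mu0=0.0, tau0=1.0, a0=2.0, b0=1.0):
    """Unnormalized log of pi(mu, tau | y) for the model above."""
    n = len(y)
    return ((n / 2 + a0 - 1) * np.log(tau)
            - (tau / 2) * np.sum((y - mu) ** 2)
            - (tau0 / 2) * (mu - mu0) ** 2
            - b0 * tau)

y = np.array([1.2, 0.7, 1.9, 1.1])  # invented data
print(log_joint(1.0, 0.5, y))
```

Working on the log scale avoids underflow and makes it easy to check that, for fixed $\tau$, differences in the log joint depend only on the $\mu$ terms.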
Now, to get the conditional posterior $\pi(\tau|\mathbf{y},\mu)$ you just drop the terms involving only $\mu$, since they merely multiply the expression by a constant. To make this explicit, we can rewrite the joint posterior as:
\begin{equation}
\pi(\mu,\tau|\mathbf{y}) \propto \tau^{\frac{n}{2}+\alpha_0-1} \exp\{-\frac{\tau_0}{2}(\mu-\mu_0)^2\}\exp\{-\frac{\tau}{2} \sum_{i=1}^{n}(y_i-\mu)^2-\beta_0\tau\}.
\end{equation}
Now, since we are treating $\mu$ as fixed and known, for the conditional posterior $\pi(\tau|\mu,\mathbf{y})$ we can remove the first exponential term from the equation and the statement will still be true (owing to the $\propto$ sign rather than the equals sign). It's important that you get this: the conditional posterior asks, given that we know $\mu$, what is the distribution of $\tau$? So $\mu$ is known (and hence fixed). The kernel of the conditional posterior for $\tau$ therefore becomes (after some rearranging):
\begin{equation}
\pi(\tau|\mu,\mathbf{y}) \propto \tau^{\frac{n}{2}+\alpha_0-1}\exp\{-\tau(\frac{1}{2} \sum_{i=1}^{n}(y_i-\mu)^2+\beta_0)\},
\end{equation}
which is the kernel for the Gamma distribution with the parameters you state.
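As a quick sketch of drawing from this conditional (with invented data, an assumed known $\mu$, and made-up hyperparameters), note that NumPy parameterizes the gamma distribution by shape and *scale*, so you pass the reciprocal of the rate:

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.array([1.2, 0.7, 1.9, 1.1])  # invented data
mu = 1.0                            # treated as known for this conditional
a0, b0 = 2.0, 1.0                   # made-up Gamma(alpha_0, beta_0) prior

shape = len(y) / 2 + a0
rate = 0.5 * np.sum((y - mu) ** 2) + b0
# NumPy's gamma takes a *scale* parameter, i.e. 1/rate.
tau_draws = rng.gamma(shape, 1.0 / rate, size=100_000)
print(tau_draws.mean(), shape / rate)  # sample mean vs. theoretical mean a/b
```

The shape/scale vs. shape/rate convention is a common source of bugs when translating these kernels into code.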
Now, for $\pi(\mu|\tau,\mathbf{y})$ you do the same thing, but it's a bit trickier because you have to complete the square. After dropping the terms not involving $\mu$ you get:
\begin{equation}
\pi(\mu|\tau,\mathbf{y}) \propto \exp\{ -\frac{\tau}{2} \sum_{i=1}^{n} (y_i-\mu)^2 - \frac{\tau_0}{2}(\mu-\mu_0)^2\}.
\end{equation}
If you multiply out the squares, collect the summations, and then remove all terms not involving $\mu$ (as they just multiply the kernel by a constant), this becomes:
\begin{equation}
\pi(\mu|\tau,\mathbf{y}) \propto \exp\{ -\frac{1}{2}[\mu^2(n\tau+\tau_0) - 2\mu(\tau n\bar{y} + \tau_0\mu_0)]\}.
\end{equation}
If you complete the square here you get the kernel for a Gaussian with the mean and variance you state, namely mean $(\tau n\bar{y} + \tau_0\mu_0)/(n\tau+\tau_0)$ and variance $1/(n\tau+\tau_0)$.
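Putting the two conditionals together gives a Gibbs sampler. The following is a sketch under invented data and made-up hyperparameters, with the $\mu$ update using mean $(\tau n\bar{y}+\tau_0\mu_0)/(n\tau+\tau_0)$ and variance $1/(n\tau+\tau_0)$:

```python
import numpy as np

def gibbs(y, n_iter=5000, mu0=0.0, tau0=1.0, a0=2.0, b0=1.0, seed=0):
    """Alternately sample mu | tau, y and tau | mu, y (defaults are invented)."""
    rng = np.random.default_rng(seed)
    n, ybar = len(y), y.mean()
    mu, tau = ybar, 1.0                  # arbitrary starting values
    mus, taus = [], []
    for _ in range(n_iter):
        # mu | tau, y ~ Normal(m, 1/prec)
        prec = n * tau + tau0
        m = (tau * n * ybar + tau0 * mu0) / prec
        mu = rng.normal(m, 1.0 / np.sqrt(prec))
        # tau | mu, y ~ Gamma(n/2 + a0, rate = sum(y - mu)^2 / 2 + b0)
        rate = 0.5 * np.sum((y - mu) ** 2) + b0
        tau = rng.gamma(n / 2 + a0, 1.0 / rate)  # NumPy takes scale = 1/rate
        mus.append(mu)
        taus.append(tau)
    return np.array(mus), np.array(taus)

y = np.array([1.2, 0.7, 1.9, 1.1, 0.4, 1.5])  # invented data
mus, taus = gibbs(y)
print(mus.mean(), taus.mean())
```

With a prior mean of $\mu_0 = 0$, the chain's average for $\mu$ should land between $0$ and $\bar{y}$, shrunk toward the prior by the amount $\tau_0$ dictates.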
Hope that helps.
Those distributions you call "marginal" are not marginal. They are conditional distributions, because you wrote $x \mid y$. The marginal distribution of $X$, for example, cannot depend on the value of $Y$.
To see how the conditional distribution is gamma, all you have to do is write $$f_{X \mid Y}(x) = \frac{f_{X,Y}(x,y)}{f_Y(y)} \propto f_{X,Y}(x,y).$$ That is to say, the conditional density is proportional to the joint density, since the normalizing factor $f_Y(y)$ does not depend on $x$. So we have $$f_{X \mid Y}(x) \propto x^2 e^{-x(y^2+4)},$$ completely ignoring any factors that are not functions of $x$. Then we recognize that the gamma distribution has density $$f_S(s) \propto s^{a-1} e^{-bs},$$ so the choice of shape $a = 3$ and rate $b = y^2+4$ shows that the conditional distribution is $X \mid Y \sim \operatorname{Gamma}(a = 3, b = y^2+4)$.
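If you want to sanity-check this numerically, you can normalize the kernel $x^2 e^{-x(y^2+4)}$ on a grid and compare its mean to the gamma mean $a/b = 3/(y^2+4)$; the conditioning value of $y$ and the grid bounds below are arbitrary:

```python
import numpy as np

y = 1.5                               # arbitrary conditioning value of Y
b = y ** 2 + 4                        # claimed gamma rate
x = np.linspace(1e-6, 20.0, 400_000)  # grid covering essentially all the mass
dx = x[1] - x[0]
kernel = x ** 2 * np.exp(-b * x)      # unnormalized conditional density
Z = kernel.sum() * dx                 # numerical normalizing constant
mean = (x * kernel).sum() * dx / Z
print(mean, 3 / b)                    # should agree with the gamma mean a/b
```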
The conditional distribution of $Y \mid X$ is handled in a similar fashion. Ignoring constants of proportionality, $$f_{Y \mid X}(y) \propto e^{-(x+1)y^2+2y},$$ but this one requires us to complete the square to get it to look like a normal density: $$-(x+1)y^2 + 2y = -(x+1)\left(y - \tfrac{1}{x+1}\right)^2 + \tfrac{1}{x+1},$$ and after exponentiating and discarding the constant $e^{1/(x+1)}$ factor, comparing this against $$f_W(w) \propto e^{-(w-\mu)^2/(2\sigma^2)},$$ we see that we have a normal density with mean $\mu = 1/(x+1)$ and variance $\sigma^2 = 1/(2(x+1))$.
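The completed square is easy to verify numerically at a few arbitrary points:

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(5):
    x = rng.uniform(0.1, 5.0)   # arbitrary test points
    y = rng.uniform(-3.0, 3.0)
    lhs = -(x + 1) * y ** 2 + 2 * y
    rhs = -(x + 1) * (y - 1 / (x + 1)) ** 2 + 1 / (x + 1)
    assert abs(lhs - rhs) < 1e-12
print("completed-square identity checked")
```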
Now, if you wanted the marginal distributions, you would need to integrate: $$f_X(x) = \int_{y=-\infty}^\infty f_{X,Y}(x,y) \, dy,$$ for example. And as you can see, this expression will not be a function of $Y$. The difference is that if I simulated realizations of ordered pairs $(X_i, Y_i)$ from the joint distribution, the marginal density for $X$ would be what you would see if I only told you the values of $X_i$. The conditional distribution of $X$ given $Y = y$ would be what you would see if I only gave you the $X_i$ for which the corresponding $Y_i$ was equal to $y$.
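As a sketch of that integration (the joint density below is reconstructed, up to a constant, from the two conditional kernels above, so treat it as an assumption), we can marginalize out $y$ numerically and compare against the Gaussian-integral closed form $\int e^{-(x+1)y^2 + 2y}\,dy = \sqrt{\pi/(x+1)}\,e^{1/(x+1)}$:

```python
import numpy as np

def joint(x, y):
    """Joint density, up to a constant, consistent with both conditional kernels."""
    return x ** 2 * np.exp(-x * (y ** 2 + 4) - y ** 2 + 2 * y)

ys = np.linspace(-10.0, 10.0, 200_000)
dy = ys[1] - ys[0]
for x in (0.5, 1.0, 2.5):
    numeric = joint(x, ys).sum() * dy  # marginalize y out numerically
    closed = (x ** 2 * np.exp(-4 * x)
              * np.sqrt(np.pi / (x + 1)) * np.exp(1 / (x + 1)))
    print(x, numeric, closed)          # both depend on x alone, not y
```

Note that the result is a function of $x$ only, exactly as the marginalization argument says it must be.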