Your approach was correct, though it could be simpler; you just used the wrong formula for the correlation.
Here's my shot.
Let $H_1,H_2,H_3$ be the indicators of a head on the relevant toss. These are independent events, with $$\begin{align}\mathsf E(H_n)=&~\tfrac 1 2\\[1ex]\mathsf {Var}(H_n)=&~\tfrac 1 4 \\[1ex] \mathsf {Cov}(H_n,H_m)\vert_{m\neq n} =&~ 0 \\[1ex] \mathsf {Var}(H_n+H_m)\vert_{n\neq m}=&~\tfrac 1 2 \\[2ex]\mathsf {Cov}(X,Y) =&~ \mathsf {Cov}(H_1+H_2,H_2+H_3) \\[1ex] =&~ \mathsf {Cov}(H_1,H_2)+\mathsf {Cov}(H_2,H_2)+\mathsf {Cov}(H_1,H_3)+\mathsf {Cov}(H_2,H_3) \\[1ex]=&~ \tfrac 14 \\[2ex]\mathsf {Corr}(X,Y)=&~ \dfrac{\mathsf {Cov}(X,Y)}{\sqrt{\mathsf {Var}(X)}\sqrt{\mathsf {Var}(Y)}} \\[1ex] = & ~ \dfrac{1}{2}\end{align}$$
Also, since $Z=2-Y$, $$\mathsf {Corr}(X,Z) = \mathsf {Corr}(X,2-Y) = -\tfrac 12.$$
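As a quick sanity check (not part of the derivation above), here is a small Monte Carlo sketch in Python with NumPy; it simulates the three coin tosses and reproduces both correlations approximately:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Three independent fair-coin indicators per trial.
h = rng.integers(0, 2, size=(n, 3))

x = h[:, 0] + h[:, 1]   # heads among tosses 1 and 2
y = h[:, 1] + h[:, 2]   # heads among tosses 2 and 3
z = 2 - y               # tails among tosses 2 and 3

print(np.corrcoef(x, y)[0, 1])   # close to  0.5
print(np.corrcoef(x, z)[0, 1])   # close to -0.5
```

With 200,000 trials the sample correlations land within a couple of hundredths of the exact values $\pm\tfrac12$.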
One suggestion is to work with copulas. In a nutshell, a copula lets you separate out the dependency structure of a distribution function. Say $F_1,F_2,\ldots,F_n$ are the one-dimensional marginals of a distribution $F$; then the copula $C$ is the function defined as
$$C(u_1,u_2,\ldots,u_n)=F(F^{-1}_1(u_1),F^{-1}_2(u_2),\ldots,F^{-1}_n(u_n))$$
This makes $C$ a function from $[0,1]^n$ to $[0,1]$. For instance, if you take the bivariate normal distribution, by doing the computation above, you'll find the Gaussian copula
$$C^{\text{Gauss}}_{\rho}(u_1,u_2)=\int_{-\infty}^{\Phi^{-1}(u_1)}\int_{-\infty}^{\Phi^{-1}(u_2)}\frac{1}{2\pi\sqrt{1-\rho^2}}\exp\left(-\frac{s_1^2-2\rho s_1s_2+s_2^2}{2(1-\rho^2)}\right)ds_1\,ds_2$$
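This double integral is just the bivariate normal CDF evaluated at the normal quantiles, so it can be computed directly. A minimal sketch in Python, assuming SciPy is available:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def gaussian_copula(u1, u2, rho):
    """C^Gauss_rho(u1, u2): bivariate normal CDF at the normal quantiles."""
    cov = [[1.0, rho], [rho, 1.0]]
    return multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf(
        [norm.ppf(u1), norm.ppf(u2)]
    )

# At u1 = u2 = 1/2 with rho = 0.5 the exact value is
# 1/4 + arcsin(rho)/(2*pi) = 1/4 + 1/12 = 1/3.
print(gaussian_copula(0.5, 0.5, 0.5))
```

The closed-form check at $(\tfrac12,\tfrac12)$ uses the classical formula $\mathbb P(Z_1\le 0, Z_2\le 0)=\tfrac14+\frac{\arcsin\rho}{2\pi}$ for standard bivariate normals.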
I used the `copula` package in R to illustrate. Taking the copula by itself amounts to constructing a probability distribution with the dependency structure of a bivariate normal, but with uniform marginals. I generated 1000 random vectors from a Gaussian copula with $\rho=0.5$. Here's the code:
```r
library(copula)
norm.cop <- normalCopula(0.5)
u <- rCopula(1000, norm.cop)
plot(u, col='blue',
     main='Random variables, uniform marginals, Gaussian copula, rho=0.5',
     xlab='X1', ylab='X2')
cor(u)
```
The resulting scatter plot (not reproduced here) shows the dependence clearly. The sample correlation I computed is $0.5060224$.
I also made plots to show that the marginals are indeed uniform:
```r
dom <- (1:length(u[,1])) / length(u[,1])
par(mfrow=c(1,2))
plot(dom, sort(u[,1]), col='blue', main='marginal X1')
abline(0, 1, col='red')
plot(dom, sort(u[,2]), col='blue', main='marginal X2')
abline(0, 1, col='red')
```
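The same sampling can be done without a copula package at all: draw correlated bivariate normals and push each coordinate through $\Phi$. A sketch in Python (NumPy/SciPy assumed):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
rho = 0.5
cov = [[1.0, rho], [rho, 1.0]]

# Correlated standard normals, then the probability-integral
# transform per coordinate gives uniform marginals with the
# Gaussian-copula dependence.
z = rng.multivariate_normal([0.0, 0.0], cov, size=1000)
u = norm.cdf(z)

print(np.corrcoef(u[:, 0], u[:, 1])[0, 1])   # roughly 0.5, as in the R run
```

The Pearson correlation of the uniforms is slightly below $\rho$ itself (exactly $\frac{6}{\pi}\arcsin\frac{\rho}{2}\approx 0.483$ for $\rho=0.5$), which is consistent with the sample value above.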
This is all very nice, but there are a number of pitfalls that have to be discussed:
- Copulas for discrete distributions are a real can of worms.
- If we can use a multivariate Gaussian distribution to get a dependency structure, why not a multivariate Student-$t$? Or a multivariate Pareto? Or some far more exotic dependency? Any of these could in principle also lead to a $0.5$ correlation if you set the parameters right.
- Given marginals and a correlation, it is not always possible to construct a copula, and hence a multivariate distribution, with the desired properties. A nice example is given in Embrechts (2009), "Copulas: A Personal View", The Journal of Risk and Insurance, Vol. 76, No. 3, 639-650. It shows that if you want the marginals to be lognormally distributed, $LN(0,1)$ and $LN(0,6)$ respectively, the correlation is restricted to the range $[-0.00025,0.01372]$: the heavy tails of the lognormals essentially confine you to that range. This can be proven from the Fréchet-Hoeffding bounds; more details are in the article.
More can be said and I think the article I quoted in my last item is a nice starting point.
We have $\mathbb P(Y_k=i)=p_i$ for $i=1,2,\ldots,6$ and $k=1,2,\ldots,n$, and $X_i = \sum_{k=1}^n \mathsf 1_{\{Y_k=i\}}$. It follows that \begin{align} \operatorname{Cov}(X_1,X_2) &= \mathbb E[X_1X_2]-\mathbb E[X_1]\mathbb E[X_2]\\ &= \mathbb E\left[\left(\sum_{k=1}^n\mathsf 1_{\{Y_k=1\}}\right)\left(\sum_{k=1}^n\mathsf 1_{\{Y_k=2\}}\right)\right] - (np_1)(np_2). \end{align} Now, $\mathsf 1_{\{Y_i=1\}}\mathsf 1_{\{Y_i=2\}}=0$ (a single roll cannot show both faces), so $\mathbb E\left[\mathsf 1_{\{Y_i=1\}}\mathsf 1_{\{Y_i=2\}}\right]=0$, and for $i\ne j$ independence gives $$\mathbb E\left[\mathsf 1_{\{Y_i=1\}}\mathsf 1_{\{Y_j=2\}}\right]=\mathbb P(Y_i=1,Y_j=2)=\mathbb P(Y_i=1)\,\mathbb P(Y_j=2) = p_1p_2.$$ Summing over the $n^2-n$ ordered pairs with $i\ne j$, $$\operatorname{Cov}(X_1,X_2) = (n^2-n)p_1p_2 - n^2p_1p_2 = -np_1p_2<0, $$ so $X_1$ and $X_2$ are negatively correlated.
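The formula $\operatorname{Cov}(X_1,X_2)=-np_1p_2$ is easy to check empirically. A sketch in Python with NumPy, using a fair die so that $p_1=p_2=\tfrac16$ and the theoretical covariance is $-n/36$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 10, 100_000
p = np.full(6, 1/6)

# Each row counts how often each face appeared in n rolls of a fair die.
x = rng.multinomial(n, p, size=trials)

emp = np.cov(x[:, 0], x[:, 1])[0, 1]
print(emp, -n * p[0] * p[1])   # empirical vs. theoretical -n/36
```

The empirical covariance comes out close to $-10/36\approx-0.278$, confirming the negative correlation between the counts.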