As you've undoubtedly noticed, you can't just argue as in the case of finite products, thinning out the sequence again and again to get convergence in more and more components. After any finite number of steps, you still have an infinite subsequence of your original sequence, but if you do infinitely many steps, then every term of your original sequence might eventually get removed. Then, instead of having a subsequence at the end of the process, you have nothing.
The idea of the diagonal argument is to slightly modify the process so that your sequence doesn't entirely disappear. Very roughly, you constrain your thinning-out operations to ensure that an infinite subsequence remains at the end of the process. Here are the details:
Start with your original sequence, and, before doing any thinning, promise yourself that you will never delete the first of its terms; call that term $a_1$. Now thin out the sequence so that the first components converge, but, in accordance with your promise, keep $a_1$ in your new, thinned-out sequence. This does no harm to the convergence of the first components: keeping $a_1$ means that the sequence of first components has one unavoidable term at the beginning, namely the first component of $a_1$, but a single term at the beginning doesn't affect convergence.
So now you have your first thinned-out sequence, starting with $a_1$, and having its first components converging. Now make a second promise, namely that the second term of this thinned-out sequence, which I'll call $a_2$, will never be deleted. Then thin out the sequence again, just as in your finite-product proof, to make the sequence of second components converge, but, while thinning it out, keep your two promises. That is, $a_1$ and $a_2$ are in this second thinned-out sequence. Again, you can do this because two terms at the beginning have no effect on convergence.
Continue in this way, alternating promises with thinnings. After $n$ steps, you have a subsequence of your original sequence with two crucial properties. (1) Its first, second, $\dots$, $n$-th components are convergent sequences, and (2) its first, second, $\dots$, $n$-th terms, which I'm calling $a_1,a_2,\dots,a_n$, will be the same in all future thinned-out sequences.
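In symbols (the notation here is mine, not part of the original construction): writing $s^{(n)}$ for the sequence after $n$ thinnings and $\pi_k$ for the $k$-th coordinate projection, the two properties after $n$ steps read
$$s^{(n)}_i=a_i\text{ for }1\le i\le n,\qquad s^{(n)}\text{ is a subsequence of }s^{(n-1)},\qquad \bigl(\pi_k\bigl(s^{(n)}_i\bigr)\bigr)_{i\in\Bbb N}\text{ converges for each }k\le n.$$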
Now look at the infinite sequence $a_1,a_2,\dots$ consisting of the subjects of all your promises. For each $n$, the sequence of its $n$-th components converges, because $a_1,a_2,\dots$ is a subsequence of what you had after $n$ thinnings, and you ensured convergence of the $n$-th components at that stage.
This means that $a_1,a_2,\dots$ converges in the product topology, since convergence in the product topology is exactly componentwise convergence. Since it's clearly a subsequence of the sequence you began with, the proof is complete.
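The promise-then-thin process can be simulated for a concrete sequence. The following sketch is an illustrative example of my own choosing, not part of the argument above: term $n$ of the sequence is the point of $\{0,1\}^{\Bbb N}$ whose $k$-th component is the $k$-th binary digit of $n$, and each thinning keeps only terms whose current component is $0$ (beyond the promised prefix), so the $k$-th components of the diagonal sequence are eventually $0$.

```python
# Illustrative simulation of the diagonal argument (hypothetical example,
# not from the answer above).  Term n of the sequence is the point of
# {0,1}^N whose k-th component is the k-th binary digit of n.

def component(n, k):
    """k-th component (1-indexed) of term n: the k-th binary digit of n."""
    return (n >> (k - 1)) & 1

def diagonal(num_steps, pool_size=10**5):
    """Run num_steps promise-then-thin rounds; return the promised terms."""
    current = list(range(pool_size))   # finite stand-in for the sequence
    for k in range(1, num_steps + 1):
        # Promise: the k-th term of the current sequence is never deleted.
        promised = current[:k]
        # Thin: beyond the promised prefix, keep only terms whose k-th
        # component is 0, forcing the k-th components to converge (to 0).
        rest = [n for n in current[k:] if component(n, k) == 0]
        current = promised + rest
    return current[:num_steps]         # the promised terms a_1, ..., a_n

a = diagonal(10)
# Each a_j has its components 1, ..., j-1 equal to 0, so for every fixed k
# the k-th components of a_1, a_2, ... are eventually constant.
for j, term in enumerate(a, start=1):
    for k in range(1, j):
        assert component(term, k) == 0
```

Of course a program can only run finitely many steps on a finite pool of terms; the point of the simulation is that the promised terms survive every later thinning, which is exactly what makes the limiting diagonal sequence nonempty.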
Best Answer
Jasper’s answer contains the key idea, but it also contains a fair bit of unnecessary material. Start with your sequence $\big\langle\langle x_n,y_n\rangle:n\in\Bbb N\big\rangle$ in $X\times Y$. You know that $\langle x_n:n\in\Bbb N\rangle$ has a convergent subsequence $\langle x_{n_k}:k\in\Bbb N\rangle$ in $X$, say with limit $x$. Now consider the sequence $\langle y_{n_k}:k\in\Bbb N\rangle$ in $Y$: it has a convergent subsequence $\langle y_{n_{k_j}}:j\in\Bbb N\rangle$ in $Y$, say with limit $y$. Now show that $$\Big\langle\left\langle x_{n_{k_j}},y_{n_{k_j}}\right\rangle:j\in\Bbb N\Big\rangle$$ converges to $\langle x,y\rangle$ in $X\times Y$.
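A numerical sketch of this two-step extraction, with illustrative sequences of my own choosing (not from the answer): $x_n=(-1)^n$ and $y_n=(-1)^{\lfloor n/2\rfloor}$ are bounded but not convergent; passing to even $n$ makes the $x$'s constant, and a further subsequence of those indices makes the $y$'s constant as well.

```python
# Hypothetical example of the nested subsequence extraction in X x Y.
# x_n = (-1)^n and y_n = (-1)^(n // 2) are bounded but not convergent.
x = lambda n: (-1) ** n
y = lambda n: (-1) ** (n // 2)

N = 1000
# Step 1: a subsequence n_k along which (x_{n_k}) converges
# (here: the even n, along which x is constantly 1).
n_k = [n for n in range(N) if n % 2 == 0]

# Step 2: a further subsequence n_{k_j} of n_k along which (y_{n_{k_j}})
# also converges.  Here we simply keep the indices where y already equals
# 1; in general Bolzano-Weierstrass supplies such a subsequence.  The x's
# still converge, being a subsequence of a convergent sequence.
n_kj = [n for n in n_k if y(n) == 1]

pairs = [(x(n), y(n)) for n in n_kj]
assert all(p == (1, 1) for p in pairs)   # converges to (1, 1) in X x Y
```

The final assertion is the finite analogue of the last step of the proof: along the doubly thinned indices, both components converge, hence so do the pairs.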