The problem is here:
If any $x\in X$ had, for each of its neighborhoods $U$, infinitely many $n$ for which $x_n \in U$, then we could define a convergent subsequence of $(x_n)$, contradicting our assumption. (Presumably this is done by choosing, for each neighborhood, a term in that neighborhood with sufficiently large index.)
In general topological spaces this only implies that we are able to construct a convergent net, not a convergent sequence. (A point $x$ is an accumulation point of a subset $S$ $\Leftrightarrow$ there exists a net of points of $S\setminus\{x\}$ converging to $x$.)
If $X$ is first countable at $x$ (that is, the point $x$ has a countable local base), then a sequence can be constructed. (This is more or less standard: we first construct a decreasing local base $U_n$ at $x$ and then choose a point from each $U_n$, taking care that the chosen indices increase.) In particular, this works for metric spaces. Note that Rudin works only with compact subsets of metric spaces in that chapter.
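Here is a sketch of that construction, assuming $\{U_n\}_{n\ge 1}$ is any countable local base at $x$. The names $V_k$ and $n_k$ are introduced only for this illustration.

$$V_k = U_1\cap\dots\cap U_k,\qquad n_1=\min\{n : x_n\in V_1\},\qquad n_k=\min\{n>n_{k-1} : x_n\in V_k\}\quad(k\ge 2).$$

Each minimum exists because $V_k$ is a neighborhood of $x$ and therefore contains $x_n$ for infinitely many $n$. If $U$ is any neighborhood of $x$, pick $m$ with $U_m\subseteq U$; then $x_{n_k}\in V_k\subseteq U_m\subseteq U$ for all $k\ge m$, so $x_{n_k}\to x$. Taking the least admissible index at each step means no appeal to choice is needed.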
As you've undoubtedly noticed, you can't just argue as in the case of finite products, thinning out the sequence again and again to get convergence in more and more components. After any finite number of steps, you still have an infinite subsequence of your original sequence, but if you do infinitely many steps then every term of your original sequence might eventually get removed. Then, instead of having a subsequence at the end of the process, you've got nothing.
The idea of the diagonal argument is to slightly modify the process so that your sequence doesn't entirely disappear. Very roughly, you just restrain your thinning-out operations to ensure that an infinite subsequence remains at the end of the process. Here are the details:
Start with your original sequence, and, before doing any thinning, promise yourself that you will never delete the first of its terms; call that term $a_1$. Now thin out the sequence so that the first components converge, but, in accordance with your promise, keep $a_1$ in your new, thinned-out sequence. This does not harm the first-component-convergence. Keeping $a_1$ means that the sequence of first-components has one unavoidable term at the beginning, namely the first component of $a_1$, but one term at the beginning doesn't affect convergence.
So now you have your first thinned-out sequence, starting with $a_1$, and having its first-components converging. Now make a second promise, namely that the second term of this thinned-out sequence, which I'll call $a_2$, will never be deleted. Then thin out the sequence again, just as in your finite-product proof, to make the sequence of second-components converge, but, while thinning it out, keep your two promises. That is, $a_1$ and $a_2$ are in this second thinned-out sequence. Again, you can do this because two terms at the beginning have no effect on convergence.
Continue in this way, alternating promises with thinnings. After $n$ steps, you have a subsequence of your original sequence with two crucial properties. (1) Its first, second, $\dots$, $n$-th components are convergent sequences, and (2) its first, second, $\dots$, $n$-th terms, which I'm calling $a_1,a_2,\dots,a_n$, will be the same in all future thinned-out sequences.
Now look at the infinite sequence $a_1,a_2,\dots$ consisting of the subjects of all your promises. For each $n$, its $n$-th components converge, because you have a subsequence of what you had after $n$ thinnings, and you ensured convergence of the $n$-th components at that stage.
This means that $a_1,a_2,\dots$ converges in the product topology. Since it's clearly a subsequence of the sequence you began with, the proof is complete.
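In symbols, the bookkeeping might be recorded as follows. The notation $s^{(n)}$ for the sequence left after $n$ thinnings, and $\pi_n$ for the projection onto the $n$-th component, is introduced only for this sketch. Let $s^{(0)}$ be the original sequence, and for $n\ge 1$ let $a_n$ be the $n$-th term of $s^{(n-1)}$ and let $s^{(n)}$ be a subsequence of $s^{(n-1)}$ that begins with $a_1,\dots,a_n$ and whose $n$-th components converge. For $k>n$, the term $a_k$ is the $k$-th term of $s^{(k-1)}$, which is a subsequence of $s^{(n)}$ beginning with $a_1,\dots,a_{k-1}$, so $a_k$ occurs in $s^{(n)}$ after $a_{k-1}$. Hence $(a_k)$ is a subsequence of every $s^{(n)}$, and

$$\lim_{k\to\infty}\pi_n(a_k)=\lim_{j\to\infty}\pi_n\bigl(s^{(n)}_j\bigr)\qquad\text{for every } n\ge 1.$$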
The existence of the real number $r\in[0,1]=I$ such that its binary expansion has $k$th digit $0$ if $k$ is odd and $1$ if $k$ is even (indexing the digits after the binary point from $k=0$, so that the $k$th digit has weight $2^{-(k+1)}$) does not depend on the Axiom of Choice in any way whatsoever. This is nothing more than the number $$r = \sum_{i=0}^{\infty}\frac{1}{2^{2i+1}},$$ which is a convergent series of real numbers, being a series of positive terms that is bounded above by $$\sum_{i=1}^{\infty}\frac{1}{2^i} = 1.$$
Where exactly do you believe that this requires the Axiom of Choice? If your real numbers are defined as equivalence classes of Cauchy sequences, then $r$ is the equivalence class of the sequence of its partial sums, which can be defined by recursion (using the recursion theorem). If your real numbers are defined as Dedekind cuts, then $r$ is the cut given by the union of the cuts determined by the partial sums, which again can be defined by recursion.
(P.S: $$\begin{align*} r &= \sum_{i=0}^{\infty}\frac{1}{2^{2i+1}} = \frac{1}{2}\sum_{i=0}^{\infty}\frac{1}{2^{2i}}\\ &= \frac{1}{2}\sum_{i=0}^{\infty}\left(\frac{1}{4}\right)^i\\ &=\frac{1}{2}\left(\frac{1}{1-\frac{1}{4}}\right)\\ &= \frac{1}{2}\left(\frac{4}{3}\right)\\ &=\frac{2}{3}. \end{align*}$$ and, luckily, $\frac{2}{3}$ exists, even without the Axiom of Choice...)
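To make the "defined by recursion" point concrete, here is a minimal Python sketch that builds the partial sums of the series exactly, using rational arithmetic, and compares them with $\frac{2}{3}$. The function names `partial_sum` and `closed_form` are hypothetical, chosen only for this illustration; the closed form $\frac{2}{3}\left(1-4^{-n}\right)$ is the standard finite geometric sum.

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """Return S_n = sum_{i=0}^{n-1} 1/2^(2i+1), built one term at a time."""
    total = Fraction(0)
    for i in range(n):
        total += Fraction(1, 2 ** (2 * i + 1))
    return total


def closed_form(n: int) -> Fraction:
    """Finite geometric sum: S_n = (2/3) * (1 - 4^(-n))."""
    return Fraction(2, 3) * (1 - Fraction(1, 4 ** n))


if __name__ == "__main__":
    # Print each partial sum and its exact distance from 2/3.
    for n in range(1, 6):
        s = partial_sum(n)
        print(n, s, Fraction(2, 3) - s)
```

Every step is an explicit, deterministic computation on rationals, which is the sense in which no choices are being made. The gap $\frac{2}{3}-S_n$ equals $\frac{2}{3}\cdot 4^{-n}$, so the partial sums form a Cauchy sequence with an explicit rate.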