As you've undoubtedly noticed, you can't just argue as in the case of finite products, thinning out the sequence again and again to get convergence in more and more components. After any finite number of steps you still have an infinite subsequence of your original sequence, but if you do infinitely many steps then every term of your original sequence might eventually get removed. Then, instead of having a subsequence at the end of the process, you've got nothing.
The idea of the diagonal argument is to slightly modify the process so that your sequence doesn't entirely disappear. Very roughly, you constrain your thinning-out operations so as to ensure that an infinite subsequence remains at the end of the process. Here are the details:
Start with your original sequence, and, before doing any thinning, promise yourself that you will never delete the first of its terms; call that term $a_1$. Now thin out the sequence so that the first components converge, but, in accordance with your promise, keep $a_1$ in your new, thinned-out sequence. This does not harm the first-component-convergence. Keeping $a_1$ means that the sequence of first-components has one unavoidable term at the beginning, namely the first component of $a_1$, but one term at the beginning doesn't affect convergence.
So now you have your first thinned-out sequence, starting with $a_1$, and having its first components converging. Now make a second promise, namely that the second term of this thinned-out sequence, which I'll call $a_2$, will never be deleted. Then thin out the sequence again, just as in your finite-product proof, to make the sequence of second components converge, but, while thinning it out, keep your two promises. That is, $a_1$ and $a_2$ are in this second thinned-out sequence. Again, you can do this because two terms at the beginning have no effect on convergence.
Continue in this way, alternating promises with thinnings. After $n$ steps, you have a subsequence of your original sequence with two crucial properties. (1) Its first, second, $\dots$, $n$-th components are convergent sequences, and (2) its first, second, $\dots$, $n$-th terms, which I'm calling $a_1,a_2,\dots,a_n$, will be the same in all future thinned-out sequences.
Now look at the infinite sequence $a_1,a_2,\dots$ consisting of the subjects of all your promises. For each $n$, the sequence of its $n$-th components converges, because from the $n$-th term on it is a subsequence of what you had after $n$ thinnings, and you ensured convergence of the $n$-th components at that stage.
This means that $a_1,a_2,\dots$ converges in the product topology. Since it's clearly a subsequence of the sequence you began with, the proof is complete.
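The promise-and-thin bookkeeping above can be sketched computationally. The helper below (names are mine, not from the proof) works in $\{0,1\}^\mathbb{N}$, where convergence in coordinate $k$ just means "eventually constant in coordinate $k$", and it truncates everything to a finite horizon, so it only illustrates the mechanics of the infinite construction:

```python
def diagonal_subsequence(term, horizon=512, stages=8):
    """Sketch of the promise-and-thin diagonal argument for a sequence
    in {0,1}^N, where term(n, k) is coordinate k of the n-th term.
    Finite horizon, so this is an illustration, not a proof."""
    indices = list(range(horizon))   # the current thinned-out sequence
    promised = []                    # a_1, a_2, ... (0-based here)
    for k in range(stages):
        # promise: the k-th term of the current sequence survives forever
        promised.append(indices[k])
        head, tail = indices[: k + 1], indices[k + 1 :]
        # thin the tail so coordinate k becomes (eventually) constant:
        # in {0,1} we keep the bit value that occurs most often
        b = max((0, 1), key=lambda v: sum(term(n, k) == v for n in tail))
        indices = head + [n for n in tail if term(n, k) == b]
    return promised

# example: coordinate k of term n is bit k of n
bits = lambda n, k: (n >> k) & 1
diag = diagonal_subsequence(bits)

# for each k, coordinate k of a_j is constant for all j > k,
# mirroring "one term at the beginning doesn't affect convergence"
for k in range(len(diag)):
    assert len({bits(a, k) for a in diag[k + 1 :]}) <= 1
```

The promised terms form the diagonal subsequence; each filtering step only removes terms after the already-promised head, which is exactly the promise-keeping in the proof.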
Best Answer
An observation which makes my life easier: an infinite power like $[0,1]^{\mathbb{R}}$ (in the product topology) only "depends" on the size of the index set: if $\phi: I \rightarrow J$ is a bijection, then $[0,1]^I$ is homeomorphic to $[0,1]^J$ by "shuffling coordinates": for $f \in [0,1]^I$ define $h(f) \in [0,1]^J$ (recall that elements of the power are just functions from $I$ or $J$ into $[0,1]$) by $h(f)(j) = f(\phi^{-1}(j))$, so that $\pi_j \circ h = \pi_{\phi^{-1}(j)}$. The last identity shows that $h$ is continuous (its compositions with the projections are continuous), and $h$ has an obvious inverse $\hat{h}(f)(i) = f(\phi(i))$ for all $i \in I, f \in [0,1]^J$, which is continuous for the same reason.
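In code, the coordinate shuffle and the identity $\pi_j \circ h = \pi_{\phi^{-1}(j)}$ look like this (a small finite sketch of my own; points of the powers are modelled as plain functions):

```python
# a bijection phi: I -> J between two index sets
I, J = ["a", "b", "c"], [0, 1, 2]
phi = dict(zip(I, J))
phi_inv = {j: i for i, j in phi.items()}

# the shuffle h: [0,1]^I -> [0,1]^J and its inverse h_hat
h = lambda f: (lambda j: f(phi_inv[j]))        # h(f)(j) = f(phi^{-1}(j))
h_hat = lambda g: (lambda i: g(phi[i]))        # h_hat(g)(i) = g(phi(i))

f = {"a": 0.1, "b": 0.5, "c": 0.9}.__getitem__   # a point of [0,1]^I
assert all(h(f)(phi[i]) == f(i) for i in I)      # pi_j o h = pi_{phi^{-1}(j)}
assert all(h_hat(h(f))(i) == f(i) for i in I)    # h_hat o h = identity
```

Continuity, of course, is the part the code can't see; that is carried entirely by the identity in the first comment.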
So, as $|\mathbb{R}| = |\{0,1\}^\mathbb{N}|$, we might as well use the latter set as the index set for the power.
Define a sequence $f_n \in [0,1]^{\{0,1\}^\mathbb{N}}$ by $f_n(\omega) = \omega_n$, where $\omega \in \{0,1\}^\mathbb{N}$ (so $\omega$ is just a sequence of $0$'s and $1$'s). This sequence has no convergent subsequence, by a standard diagonal argument:
Suppose it has a convergent subsequence, so that there are $n_1 < n_2 < \ldots < n_k < \ldots$ in $\mathbb{N}$ and some $f \in [0,1]^{\{0,1\}^\mathbb{N}}$ such that $f_{n_k} \rightarrow f$ as $k \rightarrow \infty$. Because we are working in the product topology (a.k.a. the topology of pointwise convergence), this exactly means (or at least implies, by continuity of the projections) that
$$\forall \omega \in \{0,1\}^\mathbb{N}: f_{n_k}(\omega) \rightarrow f(\omega), \text{ as } k \rightarrow \infty$$
Now define a special sequence $\hat{\omega} \in \{0,1\}^\mathbb{N}$ as follows: $\hat{\omega}_{n_{2k}} = 1$ for all $k$, and $\hat{\omega}_n = 0$ for all $n$ not of the form $n_{2k}$.
For $\hat{\omega}$ we have that $f(\hat{\omega})$ equals the limit of $f_{n_{2k}}(\hat{\omega}) = \hat{\omega}_{n_{2k}} \equiv 1$ along the subsequence $f_{n_{2k}}$, $k \rightarrow \infty$, so $f(\hat{\omega}) = 1$; but also $f(\hat{\omega}) = \lim_k f_{n_{2k+1}}(\hat{\omega}) = \hat{\omega}_{n_{2k+1}} = 0$, as a constant sequence of $0$'s. So pointwise convergence fails at the coordinate $\hat{\omega}$, and there can be no convergent subsequence.
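This counterexample can be checked mechanically for any concrete candidate subsequence. In the sketch below ($n_k = k^2$ stands in for an arbitrary strictly increasing subsequence, checked over a finite range) the point $\hat{\omega}$ is $1$ exactly on the even-indexed entries $n_{2k}$, and the values $f_{n_k}(\hat{\omega})$ oscillate between $1$ and $0$:

```python
# f_n(omega) = omega_n : the n-th coordinate of the 0-1 sequence omega
f = lambda n, omega: omega(n)

# a candidate subsequence n_k, here n_k = k^2 (strictly increasing)
n = lambda k: k * k

# the diagonal counterexample: omega_hat is 1 exactly on {n_{2k}}
even_positions = {n(2 * k) for k in range(100)}
omega_hat = lambda i: 1 if i in even_positions else 0

# along the even-indexed part of the subsequence the values are all 1,
# along the odd-indexed part all 0 -- so f_{n_k}(omega_hat) cannot converge
assert all(f(n(2 * k), omega_hat) == 1 for k in range(100))
assert all(f(n(2 * k + 1), omega_hat) == 0 for k in range(100))
```

The same two assertions go through for any strictly increasing `n`, which is the whole point: no subsequence escapes the construction.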
As an aside: we could have played the same trick using the reals themselves as the index set, via decimal expansions, but then you'd have to be careful about ambiguous expansions (do you use $0.999\ldots$ or $1.000\ldots$?). As a diagonal argument I find using the $0$-$1$-sequences as index set "cleaner"; we could also have used the power set $\mathscr{P}(\mathbb{N})$ as indexing set, with the "equivalent" sequence $f_n(A) = 1$ iff $n \in A$, which makes the analogy to Cantor's argument a bit more direct.
It is quite a lot harder to show that this size of the index set (the continuum, $\mathfrak{c}$) is the smallest for which this always happens: e.g. if the continuum hypothesis fails and $\aleph_1 < \mathfrak{c}$, it is consistent that $[0,1]^{\aleph_1}$ is sequentially compact, even though the index set is uncountable, while $[0,1]^{\mathfrak{c}}$ is always compact and never sequentially compact (the argument above remains valid).
Large products also give (IMHO) natural examples of the reverse phenomenon: sequentially compact spaces that are not compact:
Define the following subspace of $[0,1]^I$ (where $I$ is uncountable):
$$\Sigma [0,1]^I = \{f \in [0,1]^I: \operatorname{supp}(f) \text{ is at most countable}\}$$
where $\operatorname{supp}(f) = \{i \in I: f(i) \neq 0\}$, so this is the set of functions that are $0$ almost everywhere (i.e. except on the countable support set $\operatorname{supp}(f)$).
This set is easily shown to be dense in $[0,1]^I$ (even finite supports would do, using basic open sets), so it cannot be compact: a compact subset would be closed in $[0,1]^I$, but a dense proper subset is not closed. It is, however, sequentially compact: suppose $(f_n)$ is a sequence in $\Sigma[0,1]^I$, and define $J = \bigcup_n \operatorname{supp}(f_n) \subseteq I$, which is countable as a countable union of countable sets. Note that by definition $f_n(i) = 0$ for all $n$ and all $i \notin J$, so outside $J$ we have a constant $0$ sequence in every coordinate. Then observe that $[0,1]^J$ is homeomorphic to $[0,1]^\mathbb{N}$, the Hilbert cube, which is metrisable in the infinite product metric (remember the first remark of this post). Being compact and metrisable, $[0,1]^J$ is sequentially compact, so there is a convergent subsequence $f_{n_k} \rightarrow f$ in $[0,1]^J$. So we have pointwise convergence to $f$ on $J$, and setting $f(i) = 0$ for all $i \notin J$, we have it in all coordinates.
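The key step, collecting all the supports into one countable set $J$, can be sketched as follows (my own toy representation: each $f_n$ is a dict over its nonzero coordinates, finite here though countable would do, and every coordinate outside the dict is $0$):

```python
# a sequence (f_n) in the Sigma-product, each term given by its support:
# a dict over the nonzero coordinates; coordinates outside the dict are 0
fs = [{("x", n): 1.0 / (n + 1), ("y", n % 3): 0.5} for n in range(50)]

# J = union of the supports: countable as a countable union of countable sets
J = set().union(*(f.keys() for f in fs))

# outside J every term of the sequence is identically 0, so the whole
# sequence effectively lives in the cube [0,1]^J, which is metrisable
assert all(set(f) <= J for f in fs)
```

From there the metric-space Bolzano-Weierstrass argument on $[0,1]^J$ takes over, exactly as in the proof above.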
So $\Sigma[0,1]^I$ is sequentially compact but not compact. Hence neither property implies the other in general (they do coincide for metric spaces, as is well known).