Every argument that I can see right now to show that (5) implies (4) either essentially goes through one of the other equivalent forms or uses a much more sophisticated result about metric spaces, namely, that every metric space is paracompact. This means that every open cover $\mathscr{U}$ of $X$ has a locally finite open refinement $\mathscr{V}$ covering $X$. That is,
- $\mathscr{V}$ is an open cover of $X$;
- for each $V\in\mathscr{V}$ there is a $U\in\mathscr{U}$ such that $V\subseteq U$; and
- each $x\in X$ has an open nbhd $N_x$ such that $\{V\in\mathscr{V}:N_x\cap V\ne\varnothing\}$ is finite.
Note that the third condition implies that each point of $X$ is in only finitely many members of $\mathscr{V}$, i.e., that $\mathscr{V}$ is point-finite. This is actually all that I need. (A space in which every open cover has a point-finite open refinement is said to be metacompact, so I’m actually using only the weaker result that every metric space is metacompact.)
Theorem: Every point-finite open cover of $X$ has an irreducible subcover, meaning one with no proper subcover.
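Proof: Let $\mathscr{V}$ be a point-finite open cover of $X$, and let $\mathfrak{R}=\{\mathscr{R}\subseteq\mathscr{V}:\mathscr{R}\text{ covers }X\}$; $\mathfrak{R}$ is non-empty, since $\mathscr{V}\in\mathfrak{R}$, and it is partially ordered by $\supseteq$. Let $\mathfrak{C}$ be a non-empty chain in $\mathfrak{R}$, and let $\mathscr{C}=\bigcap\mathfrak{C}$; I claim that $\mathscr{C}\in\mathfrak{R}$, i.e., that $\mathscr{C}$ still covers $X$, so that $\mathscr{C}$ is an upper bound for $\mathfrak{C}$ with respect to $\supseteq$.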
Proof of Claim: Suppose that some $x\in X$ is not covered by $\mathscr{C}$. Let $V_1,\dots,V_n$ be the finitely many members of $\mathscr{V}$ containing $x$. Then none of these $V_k$ can belong to $\mathscr{C}$ (or else $x$ would be covered by $\mathscr{C}$). But $\mathscr{C}$ is the intersection of the collections in the chain $\mathfrak{C}$, so for each $k=1,\dots,n$ there is some $\mathscr{C}_k\in\mathfrak{C}$ such that $V_k\notin\mathscr{C}_k$. Because $\mathfrak{C}$ is a chain, the collections $\mathscr{C}_1,\dots,\mathscr{C}_n$ are nested, and without loss of generality we may assume that the indexing has been chosen so that $\mathscr{C}_1\supseteq\dots\supseteq\mathscr{C}_n$. But then $\mathscr{C}_n$ contains none of the sets $V_1,\dots,V_n$, so $\mathscr{C}_n$ does not cover $x$, and hence $\mathscr{C}_n\notin\mathfrak{R}$, a contradiction.
We can now apply Zorn’s lemma to the partial order $\langle\mathfrak{R},\supseteq\rangle$ to conclude that $\mathfrak{R}$ has a maximal element $\mathscr{M}$ with respect to $\supseteq$: that is, $\mathscr{M}$ is in $\mathfrak{R}$, but no proper subcollection of $\mathscr{M}$ belongs to $\mathfrak{R}$. But then $\mathscr{M}$ is an open cover of $X$ with no proper subcover, i.e., an irreducible cover of $X$.$\dashv$
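For a finite family, the Zorn's lemma step can be replaced by discarding redundant sets one at a time. Here is a minimal sketch of that finite analogue, assuming the space is a finite set of points and the cover is a list of Python sets. The names `covers` and `irreducible_subcover` are hypothetical helpers introduced only for illustration, and the sketch says nothing about the infinite case that the theorem actually handles.

```python
def covers(universe, family):
    """Return True if the union of `family` contains every point of `universe`."""
    covered = set()
    for member in family:
        covered |= member
    return set(universe) <= covered


def irreducible_subcover(universe, cover):
    """Prune a finite cover until no member can be removed (illustrative only)."""
    current = [frozenset(s) for s in cover]
    if not covers(universe, current):
        raise ValueError("the given family does not cover the universe")
    i = 0
    while i < len(current):
        trial = current[:i] + current[i + 1:]
        if covers(universe, trial):
            current = trial  # member i was redundant, so drop it
        else:
            i += 1  # member i is needed now and stays needed as the family shrinks
    return current


# Usage: a cover of {0, ..., 5} with some redundant members.
universe = range(6)
cover = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {1, 2, 3}, {0, 5}]
result = irreducible_subcover(universe, cover)
print(result)
```

A single pass suffices because a member that cannot be removed from a family also cannot be removed from any smaller subfamily containing it. This is the finite counterpart of descending along the order $\supseteq$ until a maximal element is reached.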
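Now it’s easy to show that (5) implies (4). Suppose that every infinite open cover of $X$ has a proper subcover; this amounts to saying that every irreducible open cover of $X$ is finite. Let $\mathscr{U}$ be an open cover of $X$. Since $X$ is metacompact, $\mathscr{U}$ has a point-finite open refinement $\mathscr{V}$. By what we just showed, $\mathscr{V}$ has an irreducible subcover $\mathscr{R}$, and being irreducible, $\mathscr{R}$ must be finite. Each member of $\mathscr{R}$ is contained in some member of $\mathscr{U}$, so choosing one such member of $\mathscr{U}$ for each $V\in\mathscr{R}$ yields a finite subcover of $\mathscr{U}$. Thus, $X$ is compact.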
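To see why point-finiteness cannot simply be dropped from the theorem, here is a small worked example (my own illustration, not part of the argument above). Take $X=\mathbb{R}$ with the open cover

$$\mathscr{U}=\{(-n,n):n=1,2,3,\dots\}.$$

Every point of $\mathbb{R}$ lies in infinitely many members of $\mathscr{U}$, so $\mathscr{U}$ is not point-finite. A subfamily $\{(-n,n):n\in A\}$ covers $\mathbb{R}$ exactly when $A$ is unbounded, and in that case

$$\bigcup_{n\in A\setminus\{n_0\}}(-n,n)=\mathbb{R}\quad\text{for every }n_0\in A,$$

because $A\setminus\{n_0\}$ is still unbounded. So every subcover of $\mathscr{U}$ has a proper subcover, and $\mathscr{U}$ has no irreducible subcover at all. The theorem is applied to a point-finite open cover, which metacompactness supplies as a refinement of an arbitrary open cover.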
You seem to have one of the right core ideas in using the pigeonhole principle, but there are serious issues with the details.
You started with the family of all $\varepsilon$-balls, and you know from compactness that you can find a finite subfamily of them that covers the space. You decided to denote the centres of this finite family by $x_1, \ldots, x_{n_\varepsilon}$, which is not permissible, since $x_1, \ldots, x_{n_{\varepsilon}}$ already refers to the first $n_\varepsilon$ terms of the sequence $(x_n)$, which you fixed at the beginning of the proof. You have no reason to assume that the centres of your balls are terms of the sequence.
As I said, the pigeonhole principle idea is the right idea. If $y_1, \ldots, y_n$ are centres of open $\varepsilon$-balls that cover the space, then infinitely many terms of the sequence (as opposed to $X$, as your proof states) must enter one of the balls. This will lead to the right proof.
Then your proof goes straight to the conclusion. There is an issue with this too. You've established an infinite set of sequence points that are all within distance $2\varepsilon$ of each other. But this doesn't necessarily imply the existence of a convergent subsequence. As it stands, you're defining a different subsequence for each $\varepsilon$, when you really need a single subsequence that works for every $\varepsilon$.
Instead of using an arbitrary $\varepsilon$, consider using $\varepsilon = \frac{1}{m}$, where $m \in \Bbb{N}$. Define your subsequence recursively. Start with $\varepsilon = \frac{1}{1}$; by the pigeonhole principle, there must be a ball $B_1$ of radius $1$ containing infinitely many terms of the sequence $(x_n)$. Let
$$S_1 = \{k \ge 1 : x_k \in B_1\}.$$
Note that $S_1$ is infinite, hence non-empty, and has a minimum element. We can make this minimum element $n_1$, so that $x_{n_1}$ is the first term in the subsequence.
Then, since $S_1$ is infinite, another application of the pigeonhole principle, this time to the terms indexed by $S_1$ (as opposed to our original sequence), will yield another ball $B_2$, now of radius $\frac{1}{2}$, such that $x_k \in B_2$ for infinitely many $k \in S_1$. Let
$$S_2 = \{k \in S_1 : k > n_1 \text{ and } x_k \in B_2\}$$
and let $n_2$ be the minimum of $S_2$.
Continue this process, and you'll obtain a sequence of nested sets
$$S_1 \supseteq S_2 \supseteq \ldots,$$
and their minimum elements form a subsequence $x_{n_1}, x_{n_2}, \ldots$ (if you've done it right, you should be able to justify why $n_1 < n_2 < n_3 < \ldots$, which is necessary to call it a subsequence).
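The recursion can only be imitated approximately on a computer, since no program can test whether a ball contains infinitely many terms. The sketch below is a finite proxy under loudly stated assumptions: the space is $[0,1]$, only finitely many terms are available, the centres at stage $m$ are taken to be $j/m$, and "contains infinitely many terms" is replaced by "contains the most remaining terms". The name `extract_subsequence` is hypothetical.

```python
def extract_subsequence(xs, stages):
    """Finite proxy for the recursive pigeonhole construction on [0, 1].

    xs     : list of floats in [0, 1], the first terms of the sequence
    stages : how many indices n_1 < n_2 < ... to try to produce
    Returns (chosen, balls): the chosen indices and the (centre, radius) pairs.
    """
    candidates = list(range(len(xs)))  # sorted indices still available
    chosen = []                        # n_1 < n_2 < ...
    balls = []                         # (centre, radius) of B_1, B_2, ...
    for m in range(1, stages + 1):
        radius = 1.0 / m
        centres = [j / m for j in range(m + 1)]  # these balls cover [0, 1]
        # Pigeonhole step (proxy): keep the ball holding the most candidates.
        best_centre, best_members = None, []
        for c in centres:
            members = [k for k in candidates if abs(xs[k] - c) < radius]
            if len(members) > len(best_members):
                best_centre, best_members = c, members
        if not best_members:
            break  # the finite truncation ran out of terms
        n_m = best_members[0]  # minimum of S_m, since the list stays sorted
        chosen.append(n_m)
        balls.append((best_centre, radius))
        # Later stages only see indices of S_m that exceed n_m.
        candidates = [k for k in best_members if k > n_m]
    return chosen, balls


# Usage: an assumed test sequence, the fractional parts of k times the golden ratio conjugate.
xs = [(k * 0.6180339887498949) % 1.0 for k in range(5000)]
chosen, balls = extract_subsequence(xs, 8)
print(chosen)
```

Because each stage filters the previous stage's index set and then discards everything up to the chosen minimum, the chosen indices are strictly increasing and every later term stays inside all the earlier balls.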
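Why is this subsequence convergent? And what does it converge to? Well, answering the latter question first, the closures $\overline{B}_m$ of the open balls are closed subsets of a compact space, hence compact. They need not be nested themselves, but the finite intersections $K_m = \overline{B}_1\cap\dots\cap\overline{B}_m$ are, and each $K_m$ is non-empty: by the nesting $S_1 \supseteq S_2 \supseteq \ldots$, it contains $x_{n_m}$. So the $K_m$ form a nested sequence of non-empty compact sets, which must have a non-empty intersection. Since their diameters (which are less than or equal to $\frac{2}{m}$) converge to $0$, this intersection must be a singleton. Let $x$ be the unique point in this intersection.
Why does the subsequence converge to $x$? Well, for every $\varepsilon > 0$, find some $m$ such that $\frac{2}{m} < \varepsilon$. Then,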
$$k \ge m \implies x_{n_k} \in \overline{B}_m \implies d(x, x_{n_k}) \le \frac{2}{m} < \varepsilon,$$
since $x \in \overline{B}_m$ (it's in all the $\overline{B}_i$s).
As you can see, there are some serious steps missing. The core ideas are the pigeonhole principle and Cantor's intersection theorem (which guarantees the point of convergence).
Best Answer
We have a metric space $X$ with an infinite number of points and I assume you are taking as a definition of compactness that every sequence has a convergent subsequence.
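Now, suppose there is a sequence $(a_n)$ of points in $X$ with no convergent subsequence. This means that for each $x\in X$, we can find an open set $U_x$ containing $x$ such that $a_n\in U_x$ for only finitely many $n$ (otherwise every ball $B(x,\frac{1}{k})$ would contain $a_n$ for infinitely many $n$, and we could extract a subsequence converging to $x$). Then, $\left \{ U_x \right \}_{x\in X}$ is an open cover of $X$ with no finite subcover, since the union of any finite subfamily $\left \{ U_{x_{i}} \right \}_{1\leq i\leq N}$ contains $a_n$ for only finitely many indices $n$, and so cannot be all of $X$.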
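As a sanity check on this construction, here is a worked instance in a space that is not compact (an illustrative choice of mine, not taken from the question). Let $X=\mathbb{R}$ and $a_n=n$, which has no convergent subsequence. One admissible choice of neighbourhoods is

$$U_x=\left(x-\tfrac12,\,x+\tfrac12\right),$$

an open interval of length $1$, which contains at most one integer. A finite subfamily $U_{x_1},\dots,U_{x_N}$ then contains at most $N$ of the points $a_n$, so

$$a_m\notin\bigcup_{i=1}^{N}U_{x_i}\quad\text{for some }m\le N+1,$$

and the subfamily fails to cover $\mathbb{R}$.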