$\sum_n \Pr(|X_n - 1| \geq \epsilon) = \sum_n \int_{0}^{1 - \epsilon} nx^{n-1}\,dx = \sum_n (1 - \epsilon)^n < \infty$ for any $\epsilon > 0$.
Hence, by the Borel–Cantelli lemma, $\Pr(|X_n - 1| \geq \epsilon \mbox{ infinitely often}) = 0$, so $X_n$ converges to $1$ almost surely.
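Since a variable with density $nx^{n-1}$ on $[0,1]$ is distributed as the maximum of $n$ i.i.d. uniforms, both the tail probability $(1-\epsilon)^n$ and the summability of the series can be sanity-checked numerically. A minimal sketch (the function name `sample_X` and the parameters are my own, not from the post):

```python
import random

# A variable with density n*x^(n-1) on [0, 1] equals the maximum of
# n i.i.d. Uniform(0, 1) draws, so we can estimate
# P(|X_n - 1| >= eps) = P(X_n <= 1 - eps) = (1 - eps)^n by simulation.
def sample_X(n, trials=100_000, eps=0.1, seed=0):
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if max(rng.random() for _ in range(n)) <= 1 - eps
    )
    return hits / trials

for n in (1, 5, 20):
    empirical = sample_X(n)
    exact = (1 - 0.1) ** n
    print(n, round(empirical, 4), round(exact, 4))

# The series sum_n (1 - eps)^n is geometric with ratio 1 - eps < 1,
# hence finite -- exactly what Borel-Cantelli requires.
print(sum((1 - 0.1) ** n for n in range(1, 1000)))  # close to (1-eps)/eps = 9
```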
Okay, so $\Omega=[0,1]$ and our probability measure is the Lebesgue measure on the interval. Consider $X_k=\mathbb{1}_{[0,\frac{1}{k}]}$. These converge to $0$ both almost surely and in probability: the sequence converges pointwise to $0$ everywhere except at the point $0$ itself, and $P(\{0\})=0$. So let's make those a bit more clever.
Now, writing down a formula for this is fiddly, but what happens if you move those intervals over the $[0,1]$ interval before making them smaller? You could define this properly by recursion, but it is probably easier to see from an example:
$$1_{[0,1]},\,1_{[0,\frac12]},\,1_{[\frac12,1]},\,1_{[0,\frac13]},\,1_{[\frac13,\frac23]},\,1_{[\frac23,1]},\dots$$
Now the interval gets smaller or stays the same size with every step, so $P(X_k>\varepsilon)\to 0$ for every $\varepsilon>0$, which is convergence in probability. But you cannot find a null set outside of which the sequence converges pointwise: for every $k$ the intervals still have positive length, and they get shifted over every part of $[0,1]$. So in fact every $x$ in the entire interval $[0,1]$ satisfies $\exists m>k:X_m(x)>\varepsilon$, and thus $P(\exists m>k:X_m>\varepsilon)=P([0,1])=1$ for every $\varepsilon\in(0,1)$ and every $k$.
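Both phenomena can be seen at once by enumerating the intervals row by row; a minimal sketch (the name `typewriter` and the exact indexing scheme are one natural choice, not prescribed by the post):

```python
from fractions import Fraction

# Enumerate the "typewriter" intervals: for each row n, the n intervals
# [(j-1)/n, j/n] for j = 1..n, flattened into one sequence.
def typewriter(limit):
    k = 0
    n = 1
    while True:
        for j in range(1, n + 1):
            if k == limit:
                return
            yield (Fraction(j - 1, n), Fraction(j, n))
            k += 1
        n += 1

intervals = list(typewriter(10))

# Interval lengths shrink: P(X_k > eps) = 1/n -> 0, i.e. convergence
# in probability.
print([b - a for a, b in intervals])

# ...but any fixed x in [0, 1] lands in some interval of every row,
# so X_k(x) = 1 infinitely often: no almost-sure convergence.
x = Fraction(1, 3)
print([int(a <= x <= b) for a, b in intervals])
```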
Note the difference between these two sets:
$$\{x\in[0,1]:X_k(x)>\varepsilon\}$$
which has smaller and smaller probability as $k$ increases. And:
$$\{x\in[0,1]:\exists m>k:X_m(x)>\varepsilon\} =[0,1]$$
Now there is actually a theorem which states that every sequence of random variables converging in probability has a subsequence which converges almost surely. And you can easily construct such a subsequence in our example: it is the sequence we started with, $\mathbb{1}_{[0,\frac1k]}$.
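To make the subsequence concrete: from each row $n$ of the typewriter sequence, pick the first interval $[0,\frac1n]$. In the flat enumeration these sit at the triangular-number indices $0, 1, 3, 6, \dots$ (a sketch under that indexing assumption; the helper names are mine):

```python
from fractions import Fraction

# Row n of the typewriter sequence starts at flat index n(n-1)/2
# (0-based), and its first interval is [0, 1/n].
def subsequence_indices(rows):
    return [n * (n - 1) // 2 for n in range(1, rows + 1)]

print(subsequence_indices(5))  # indices of [0,1], [0,1/2], [0,1/3], ...

# Along this subsequence the variables are 1_{[0,1/n]}, which converge
# to 0 for every x > 0, hence almost surely.
def X_sub(n, x):
    return 1 if 0 <= x <= Fraction(1, n) else 0

print([X_sub(n, Fraction(1, 10)) for n in (1, 2, 5, 20, 100)])
```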
So the difference is that in one case you just say that the probability of being far from the limit goes to zero, while in the other case you can actually find a set of probability zero outside of which the sequence converges pointwise. This is a lot stronger, as the bad stuff is actually contained, in a sense.
Best Answer
Note the definition of lim sup for a sequence $(A_n)$ of sets: $$ \limsup_{n\to\infty} A_n=\bigcap_{k=1}^\infty \bigcup_{n=k}^\infty A_n.$$ This translates to: $$ \omega\in\limsup_{n\to\infty} A_n \Leftrightarrow \forall k\, \exists n\ge k\colon \omega\in A_n.$$ Or, in everyday language: There are arbitrarily large $n$ for which $\omega\in A_n$. If you collect all such $n$ in a set and number them consecutively, you get an increasing sequence $(n_k)$ with $\omega\in A_{n_k}$ for all $k$.
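For intuition, the set-theoretic lim sup can be checked on a finite, eventually periodic family of sets, where truncating the intersection at a moderate depth already gives the exact answer (the sets below are made up purely for illustration):

```python
# limsup A_n = intersection over k of the union of A_n for n >= k:
# the elements that belong to A_n for arbitrarily large n.
def limsup_sets(A, K):
    # Truncated version: intersect, for k = 0..K-1, the union of the
    # tail A[k:].  Exact here because the families are eventually periodic.
    result = None
    for k in range(K):
        tail_union = set().union(*A[k:])
        result = tail_union if result is None else result & tail_union
    return result

# A_n = {0} for even n, {1} for odd n: both 0 and 1 occur for
# arbitrarily large n, so the limsup is {0, 1}.
A = [{n % 2} for n in range(50)]
print(limsup_sets(A, 25))  # {0, 1}

# B_n = {7} only for n < 5: 7 occurs finitely often, so it drops out.
B = [{7} if n < 5 else {n % 2} for n in range(50)]
print(limsup_sets(B, 25))  # {0, 1}
```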
That is the sequence mentioned in the proof, where $A_n=\{|X_n-X|\ge\epsilon\}$.
It matters whether the lim sup is inside or outside the sets, because the two kinds of lim sup are different beasts, though clearly related.
The difference between sharp and non-sharp inequalities arises for $\omega$ such that $\limsup|X_n-X|=\epsilon$ exactly. Such an $\omega$ belongs to the set on the right end of $(*)$, but not to the left set. And it may or may not be in the middle set, depending on whether $|X_n-X|\ge\epsilon$ infinitely often or not; both are possible under the current assumption.