The fact that the strong law of large numbers implies the weak law of large numbers is contained in the following property:
If $(\Omega,\mathcal F,\Bbb P)$ is a probability space and $\{X_n\}$ is a sequence of real-valued random variables which converges almost everywhere to $X$, then it converges in probability to $X$.
To see this, we may assume that $X_n\to 0$ almost everywhere and $X_n\geq 0$, replacing $X_n$ by $|X_n-X|$ if necessary. The set
$$C:=\bigcap_{p\geq 1}\bigcup_{n\geq 1}\bigcap_{k\geq n}\{\omega\in\Omega,X_k\leq \frac 1p\},$$
has, by hypothesis, measure $1$. Fix $\varepsilon>0$ and choose $p_0$ with $p_0^{-1}\leq \varepsilon$. We have
\begin{align}
\limsup_k\Bbb P\{X_k\geq \varepsilon\}&\leq \limsup_k\Bbb P\{X_k\geq p_0^{-1}\}\\
&\leq\inf_{n\in\Bbb N}\Bbb P\left(\bigcup_{k\geq n}\{X_k\geq p_0^{-1}\}\right)\\
&\leq \Bbb P\left(\bigcap_{n\in\Bbb N}\bigcup_{k\geq n}\{X_k\geq p_0^{-1}\}\right)\\
&\leq \Bbb P(\Omega\setminus C)=0,
\end{align}
which gives the desired result.
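Two of the steps above deserve a word. The passage from the infimum to the probability of the intersection is continuity from above: the events $A_n:=\bigcup_{k\geq n}\{X_k\geq p_0^{-1}\}$ decrease in $n$, so
$$\inf_{n\in\Bbb N}\Bbb P(A_n)=\lim_{n\to\infty}\Bbb P(A_n)=\Bbb P\left(\bigcap_{n\in\Bbb N}A_n\right).$$
For the final step, any $\omega\in\bigcap_n A_n$ satisfies $X_k(\omega)\geq p_0^{-1}$ for infinitely many $k$, which is incompatible with the defining condition of $C$ at $p=2p_0$ (namely $X_k(\omega)\leq (2p_0)^{-1}$ for all large $k$); hence $\bigcap_n A_n\subseteq\Omega\setminus C$.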
Before I start, I have to say that @Nate Eldredge is correct that you are trying to grasp some ideas that are several steps beyond your
mathematical background. But there are some things to say that will get you started thinking in useful directions.
Events of probability zero. In a sense, I can and will tell you it is 'impossible' to get an infinite run of Heads when tossing a coin.
The probability of $n$ Heads in a row is $(1/2)^n,$ so your chances
of seeing 1000 Heads in a row are about $9.332636 \times 10^{-302}.$ So for
all practical purposes, even a run of 1000 Heads is not going to happen. The only sensible probability to assign to an infinite
run of Heads is 0. By 'sensible probability' I mean it is the only way to have a consistent mathematical system in which the total
probability of all conceivable events adds to 1. (As in Kolmogorov's axioms of probability.)
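(If you want to check such numbers, a short Python computation of $(1/2)^n$ suffices; this is just an illustration.)

```python
# Probability of n Heads in a row with a fair coin: (1/2)**n.
for n in [10, 100, 1000]:
    print(n, 0.5 ** n)
# n = 1000 gives about 9.33e-302: astronomically small, yet still positive.
# An infinite run gets the limit of these probabilities, namely 0.
```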
I don't claim you can't conceive of the occurrence of an infinite run of Heads. However, there are very many other conceivable events
of probability zero when tossing coins: an infinite sequence of Tails,
an infinite repetition of HTHTHT..., an infinite repetition of TTHHTTHH..., and so on. If you assign even a tiny positive probability to each of these, I hope you can see how the total probability of very many of them would soon exceed 1.
Convergence 'in probability.' Ordinary mathematical convergence
deals with deterministic sequences. For example, in the expression $\lim_{n \rightarrow \infty} (1 + 2n)/n = 2,$ I know that element
$n = 20$ in the sequence is $(1 + 2n)/n = 41/20 = 2.05.$ So it is
easy to establish that, with increasing $n,$ the sequence $a_n = (1 + 2n)/n$ gets, and remains, as close to $2$ as we like.
Now let's consider tossing a fair coin. Let $p_n$ be the proportion
of Heads among the first $n$ tosses. There is a sense in which
the 'limit' of $p_n$ as $n \rightarrow \infty$ is $1/2.$ But it cannot be ordinary mathematical convergence because $p_n$ is a
random quantity, not a deterministic one.
The 'cure' is to look at the sequence $B_n = P\{|p_n - 1/2| \le \epsilon\}.$ The sequence $B_n$ is a deterministic numerical
sequence. For any positive $\epsilon,$ no matter how small,
one can compute the probability. And one can show
$\lim_{n \rightarrow \infty} B_n = 1.$ This is not the place
for a formal proof. (You can look up the Weak Law of Large Numbers.)
However, for $\epsilon = 0.01,$ one can compute $B_n$ for
increasing values of $n.$ The table below shows a few (rounded to
4 places). For $n = 100,000,$ we get $B_{100,000} \approx 1,$ to about 9 decimal places.
     n      B_n
   100   0.2356
  1000   0.4933
 10000   0.9556
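These values are easy to recompute; here is a short sketch, assuming SciPy is available, with $B_n$ read as $P\{|p_n - 1/2| \le 0.01\}$ for a Binomial$(n, 1/2)$ count of Heads:

```python
# Recompute B_n = P(|p_n - 1/2| <= 0.01), where the number of Heads in
# n tosses is Binomial(n, 1/2) and p_n is the proportion of Heads.
import math
from scipy.stats import binom

eps = 0.01
for n in [100, 1000, 10000, 100000]:
    lo = math.ceil(n * (0.5 - eps))   # smallest Head count within eps of 1/2
    hi = math.floor(n * (0.5 + eps))  # largest Head count within eps of 1/2
    b_n = binom.cdf(hi, n, 0.5) - binom.cdf(lo - 1, n, 0.5)
    print(n, round(b_n, 4))
```

For $n = 100{,}000$ this prints $1.0$, matching the statement above that $B_{100{,}000} \approx 1$ to about 9 decimal places.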
This kind of convergence is called convergence in probability.
One writes $\operatorname{plim}_{n \rightarrow \infty} p_n = 1/2$ or
$p_n \stackrel{p}{\rightarrow} 1/2.$ From a mathematical
point of view it is a relatively easy kind of convergence to
discuss because, for each individual value of $n,$ the computation of $B_n$
involves looking only at a binomial distribution with
parameters $n$ and $1/2.$
'Almost sure' convergence. The Strong Law of Large Numbers
requires a more sophisticated probability structure.
It states that $P\{\lim_{n \rightarrow \infty} p_n = 1/2\} = 1,$
also written
$p_n \stackrel{a.s.}{\rightarrow} 1/2.$ This formulation of
the convergence of a sequence of random variables requires
a probability structure that can deal with all sequences $p_n$
simultaneously. In essence, it has to make sense not only of the probabilities of realistic events observable in the real world, but also, at the same time, of infinitely many 'conceivable' events of probability zero, including the one you mentioned in your question.
("Almost sure" means "with probability 1." The terminology is
intended to exclude 'conceivable' events of probability 0.
One can argue whether "almost sure" is the best terminology
for the general public to grasp, but it is widely used in advanced probability theory.)
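If it helps intuition, here is a minimal simulation sketch: it follows a single simulated path of fair-coin tosses and prints the running proportion of Heads at a few checkpoints. The Strong Law says that, on almost every such path, this proportion settles down to $1/2$.

```python
# Follow one simulated path of fair-coin tosses and print the running
# proportion of Heads at a few checkpoints.
import random

random.seed(1)                     # fixed seed, so the run is reproducible
heads = 0
for n in range(1, 100001):
    heads += random.randint(0, 1)  # 1 = Heads, 0 = Tails
    if n in (100, 1000, 10000, 100000):
        print(n, heads / n)
```

Different seeds give different paths, but with probability $1$ the path eventually stays as close to $1/2$ as you like.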
Note: In this discussion I have tried to compromise between
precise technical terminology and words that make sense in ordinary
English for people who have not studied advanced probability and measure
theory. This is not a treatise in measure theory. Nevertheless,
I anticipate comments from people who believe they can be
simultaneously intelligible to an elementary audience and more
technically precise. I would never claim my compromises are optimal.
Best Answer
Sometimes a weak law of large numbers gives you better quantitative information.
As a very simple example, suppose that the $X_i$ are independent with common mean $\mu$ and finite variance $\sigma^2$, and let $S_n = X_1 + \cdots + X_n$. Then it is easy to see that $\operatorname{Var}(\frac{S_n}{n}) = \frac{\sigma^2}{n}$, and Chebyshev's inequality gives that for any $\epsilon > 0$, we have $$P\left(\left|\frac{S_n}{n}-\mu\right| > \epsilon\right) \le \frac{\sigma^2}{n \epsilon^2}.$$
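For completeness, the variance computation uses only the independence of the $X_i$:
$$\operatorname{Var}\left(\frac{S_n}{n}\right)=\frac{1}{n^2}\operatorname{Var}(S_n)=\frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)=\frac{n\sigma^2}{n^2}=\frac{\sigma^2}{n}.$$
In the coin-tossing example above ($\sigma^2 = 1/4$, $\epsilon = 0.01$) the bound reads $P\{|p_n - 1/2| > 0.01\} \le 2500/n$: much cruder than the exact binomial computation, but explicit for every finite $n$, which is exactly the kind of quantitative information an almost-sure statement by itself does not provide.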