You can deduce this from the classification of vector bundles on $\mathbf{P}^1$. Say $f:C \to \mathbf{P}^1$ is a connected finite etale Galois cover of degree $n$. We must show $n=1$.
The sheaf $E := f_* \mathcal{O}_C$ is a rank $n$ vector bundle on $\mathbf{P}^1$, so we can write it as $E \simeq \oplus_{i=1}^n \mathcal{O}(a_i)$ for some integers $a_i$. As $C$ is connected, we have $h^0(\mathbf{P}^1,E) = 1$, so we must have $a_1 = 0$ and $a_i < 0$ for $i > 1$ (after rearrangement). It then follows that after pullback along any finite cover $g:D \to \mathbf{P}^1$ of smooth connected curves, we still have $h^0(D, g^* E) = 1$ as negative line bundles remain negative after pullback along finite maps, and thus cannot acquire sections.
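To spell out the section count used here (a sketch only; $m := \deg g$ is notation introduced for this display, and it assumes, as the answer implicitly does, that $D$ is projective over an algebraically closed field): pullback along $g$ multiplies degrees by $m$, and a line bundle of negative degree on such a curve has no nonzero sections, so
\begin{align}
\deg g^*\mathcal{O}(a_i) &= m \, a_i < 0 \quad (i > 1), \\
h^0(D, g^*E) &= h^0(D, \mathcal{O}_D) + \sum_{i=2}^n h^0(D, g^*\mathcal{O}(a_i)) = 1 + 0 = 1.
\end{align}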
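But $f$ was a finite etale cover, so there is some $g:D \to \mathbf{P}^1$ as above such that the base change $g^* C \to D$ is just a disjoint union of $n$ copies of $D$ mapping down (in fact, one may take $g = f$, as $f$ was Galois). Since pushforward along the finite map $f$ commutes with base change, $g^*E$ is the pushforward of the structure sheaf of $g^* C$, and so $h^0(D, g^*E) = h^0(D, \oplus_{i=1}^n \mathcal{O}_D) = n$. Comparing this with $h^0(D, g^*E) = 1$ from the previous paragraph, the only way this can happen is if $n=1$.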
Let me answer both of your explicitly asked questions in detail; if you have any further questions on the proof (which is indeed fast-going and slightly handwavy), please add them to your post.
Question 1. Why are the $s_1, s_2, \ldots, s_n$ algebraically independent over $\mathbb{Z}$? (I am using the notations from the book.)
Answer: Let $\mathbf{R}$ be the polynomial ring $\mathbb{Z}\left[c_0, c_1, \ldots, c_{n-1}\right]$ in $n$ new indeterminates $c_0, c_1, \ldots, c_{n-1}$ over $\mathbb{Z}$. Let $C$ be the $n\times n$-matrix
\begin{align}
\begin{pmatrix}
0 & 0 & \dots & 0 & -c_0 \\
1 & 0 & \dots & 0 & -c_1 \\
0 & 1 & \dots & 0 & -c_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \dots & 1 & -c_{n-1}
\end{pmatrix}
\end{align}
over $\mathbf{R}$. This matrix $C$ is the companion matrix of the polynomial $c_0 + c_1 T + c_2 T^2 + \cdots + c_{n-1} T^{n-1} + T^n \in \mathbf{R}\left[T\right]$ (in fact, I even stole the LaTeX code from the Wikipedia article). Now, let $\phi$ be the $\mathbb{Z}$-algebra homomorphism from $\mathbf{A}$ to $\mathbf{R}$ that sends each entry of the matrix $\left(a_{i,j}\right)$ to the corresponding entry of $C$ (that is, $a_{i,j}$ goes to $-c_{i-1}$ if $j = n$, goes to $1$ if $i = j+1$, and goes to $0$ otherwise). This $\phi$ is uniquely determined, due to the universal property of the polynomial ring $\mathbf{A}$. This homomorphism $\phi$ canonically induces a $\mathbb{Z}\left[T\right]$-algebra homomorphism $\phi\left[T\right] : \mathbf{A}\left[T\right] \to \mathbf{R}\left[T\right]$ (which simply applies $\phi$ to each coefficient independently). This latter homomorphism $\phi\left[T\right]$ sends the characteristic polynomial of the matrix $\left(a_{i,j}\right)$ to the characteristic polynomial of the matrix $C$ (since $\phi$ sends the matrix $\left(a_{i,j}\right)$ to the matrix $C$, and since each coefficient of the characteristic polynomial of a matrix is a universal polynomial in the entries of the matrix). In other words, $\phi\left[T\right]$ sends the polynomial $f\left(T\right)$ to the polynomial $c_0 + c_1 T + c_2 T^2 + \cdots + c_{n-1} T^{n-1} + T^n$. Thus, $\phi$ sends each coefficient $\left(-1\right)^{n-i}s_{n-i}$ of the former polynomial to the corresponding coefficient $c_i$ of the latter. 
Since the $c_0, c_1, \ldots, c_{n-1}$ are algebraically independent over $\mathbb{Z}$ (by their definition as distinct indeterminates!), we thus conclude that the $\left(-1\right)^n s_n, \left(-1\right)^{n-1} s_{n-1}, \ldots, -s_1$ are algebraically independent over $\mathbb{Z}$ as well (because any polynomial relation between them would be mapped by $\phi$ to a polynomial relation between $c_0, c_1, \ldots, c_{n-1}$, which would contradict the algebraic independence of the latter). Hence, the $s_1, s_2, \ldots, s_n$ are algebraically independent over $\mathbb{Z}$ (because changing the order of a bunch of elements and flipping some of their signs clearly cannot damage their algebraic independence).
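As a sanity check, here is the case $n = 2$ worked out by hand. This is only an illustration; it assumes the book's convention $f\left(T\right) = \det\left(T \cdot I - \left(a_{i,j}\right)\right)$, so that $s_1$ and $s_2$ are the usual trace and determinant:
\begin{align}
f\left(T\right) &= T^2 - s_1 T + s_2, \qquad s_1 = a_{1,1} + a_{2,2}, \quad s_2 = a_{1,1} a_{2,2} - a_{1,2} a_{2,1}, \\
C &= \begin{pmatrix} 0 & -c_0 \\ 1 & -c_1 \end{pmatrix}, \qquad \det\left(T \cdot I - C\right) = \det \begin{pmatrix} T & c_0 \\ -1 & T + c_1 \end{pmatrix} = T^2 + c_1 T + c_0, \\
\phi\left(-s_1\right) &= -\left(0 - c_1\right) = c_1, \qquad \phi\left(s_2\right) = 0 \cdot \left(-c_1\right) - \left(-c_0\right) \cdot 1 = c_0.
\end{align}
So a nonzero polynomial $P\left(x, y\right)$ over $\mathbb{Z}$ with $P\left(-s_1, s_2\right) = 0$ would give $P\left(c_1, c_0\right) = 0$ in $\mathbf{R}$, which is impossible for distinct indeterminates.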
Question 2. Why is the discriminant of $f$ nonzero?
Answer: We know that the coefficients $\pm s_1, \pm s_2, \ldots, \pm s_n$ of $f$ are algebraically independent. Thus, $f$ is "as good as" a generic monic polynomial of degree $n$ (in the sense that there is a ring isomorphism from the above-mentioned polynomial ring $\mathbf{R} = \mathbb{Z}\left[c_0, c_1, \ldots, c_{n-1}\right]$ to a subring of $\mathbf{A}$ that sends the generic monic polynomial $c_0 + c_1 T + c_2 T^2 + \cdots + c_{n-1} T^{n-1} + T^n$ to $f$). Thus, proving that the discriminant of $f$ is nonzero is equivalent to proving that the discriminant of the generic monic polynomial $c_0 + c_1 T + c_2 T^2 + \cdots + c_{n-1} T^{n-1} + T^n$ is nonzero. But the latter is easy: If the discriminant of the generic monic polynomial $c_0 + c_1 T + c_2 T^2 + \cdots + c_{n-1} T^{n-1} + T^n$ were zero, then the discriminant of every monic polynomial of degree $n$ over $\mathbb{Z}$ would be $0$ (since it could be obtained by specializing the $c_0, c_1, \ldots, c_{n-1}$ in the discriminant of the generic monic polynomial); but this would contradict the fact that (for example) the monic polynomial $T^n - 1$ has nonzero discriminant (namely $\pm n^n$).
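The claim about $T^n - 1$ can be checked numerically for small $n$. The following is a minimal sketch, not anything from the book: it computes the discriminant of a monic integer polynomial as $\left(-1\right)^{n\left(n-1\right)/2}$ times the resultant of the polynomial and its derivative, with the resultant taken as the determinant of the Sylvester matrix. The function names `sylvester_matrix`, `det` and `discriminant` are hypothetical helpers chosen for this illustration.

```python
from fractions import Fraction


def sylvester_matrix(p, q):
    """Sylvester matrix of p and q (coefficient lists, highest degree first)."""
    m, n = len(p) - 1, len(q) - 1  # degrees of p and q
    size = m + n
    rows = []
    for i in range(n):  # n shifted copies of p
        rows.append([0] * i + p + [0] * (size - m - 1 - i))
    for i in range(m):  # m shifted copies of q
        rows.append([0] * i + q + [0] * (size - n - 1 - i))
    return rows


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def discriminant(p):
    """Discriminant of a polynomial of degree n >= 1 (highest degree first)."""
    n = len(p) - 1
    # Coefficients of the derivative: the entry at index i belongs to T^(n-i).
    dp = [c * (n - i) for i, c in enumerate(p[:-1])]
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return sign * det(sylvester_matrix(p, dp)) / p[0]


# Check |disc(T^n - 1)| = n^n for a few small n.
for n in range(1, 7):
    t_n_minus_1 = [1] + [0] * (n - 1) + [-1]
    assert abs(discriminant(t_n_minus_1)) == n ** n
```

Exact rational arithmetic (`Fraction`) is used so that "nonzero" is a genuine statement rather than a floating-point approximation. The same helper returns $0$ for a polynomial with a repeated root such as $T^2$.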
This is impossible by the Mason-Stothers theorem (which holds over any algebraically closed field of characteristic zero).
We want to find $f, g, h$ such that $f + g = h$ where $g$ is a constant and $f, h$ have all of their roots repeated. If $g$ is nonzero, $f, h$ must be relatively prime. Letting $d = \deg f$ (which is also $\deg h$, as $g$ is constant), it follows that $fgh$ has at most $d$ distinct roots, since every root of $f$ or $h$ is repeated; but by Mason-Stothers $fgh$ must have at least $d+1$ distinct roots; contradiction.
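For reference, the count can be written out as follows. The form of Mason-Stothers used is the standard one: for relatively prime polynomials $a + b = c$, not all constant, one has $\max(\deg a, \deg b, \deg c) \le N_0(abc) - 1$, where $N_0$ (notation introduced only for this display) denotes the number of distinct roots. With $d = \deg f = \deg h \ge 1$:
\begin{align}
N_0(fgh) &= N_0(f) + N_0(h) \le \frac{d}{2} + \frac{d}{2} = d, \\
d = \max(\deg f, \deg g, \deg h) &\le N_0(fgh) - 1 \le d - 1.
\end{align}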