I am currently studying Galois theory from the NPTEL online lecture series on Finite Fields and Galois Theory. While watching the $48$-th lecture, on the discriminant of a polynomial, I came across a proposition whose proof I failed to understand properly.

Before going to the main proposition let us first define formally the discriminant of a polynomial.

Let $K$ be a field. Let $f_n$ denote the general monic polynomial of degree $n,$ i.e. it is of the form $$f_n = (X-X_1)(X-X_2) \cdots (X-X_n),$$ where $X_1,X_2, \cdots , X_n$ are indeterminates over $K.$

Let $V(X_1,X_2, \cdots, X_n)$ denote the Vandermonde determinant in $X_1,X_2, \cdots , X_n.$ So $$V(X_1,X_2, \cdots , X_n) = \prod\limits_{1 \leq i < j \leq n} (X_j - X_i).$$
Now the discriminant of $f_n$ is denoted by $D(f_n)$ and it is defined as $$D(f_n):= {V(X_1,X_2, \cdots , X_n)}^2 = \prod\limits_{1 \leq i < j \leq n} {(X_j - X_i)}^2.$$
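
For instance, writing the definition out in the two smallest cases (just an illustration of the formula above, not something taken from the lecture) gives $$\begin{align*} D(f_2) & = (X_2 - X_1)^2, \\ D(f_3) & = (X_2 - X_1)^2 (X_3 - X_1)^2 (X_3 - X_2)^2. \end{align*}$$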

Now let us take any monic polynomial $f \in K[X]$ of degree $n.$ Let $f=X^n + a_1 X^{n-1} + \cdots + a_n.$ Then by Kronecker's theorem there exists a finite field extension $L|K$ such that $f$ splits completely into linear factors in $L[X].$ Let $x_1,x_2, \cdots , x_n$ be the zeros of $f$ lying in $L.$ Then it is clear that $(-1)^r a_r = S_r (x_1,x_2,\cdots , x_n)$ for $r=1,2, \cdots , n,$ where $S_r$ is the $r$-th elementary symmetric polynomial in the $n$ variables $X_1,X_2, \cdots , X_n,$ i.e. $$S_r = \sum\limits_{1 \leq i_1 < i_2 < \cdots < i_r \leq n} X_{i_1} X_{i_2} \cdots X_{i_r}$$ for $r=1,2, \cdots , n.$
Now the discriminant of $f$ is denoted by $D(f)$ and is defined as $$\begin{align*} D(f) & = D(f_n) (-a_1, \cdots , (-1)^r a_r , \cdots , (-1)^n a_n ) \\ & = D(f_n) (S_1(x_1,x_2, \cdots , x_n), S_2(x_1,x_2, \cdots , x_n), \cdots , S_n (x_1,x_2, \cdots , x_n)). \end{align*}$$

By the Fundamental Theorem of Symmetric Polynomials it is easy to show that $D(f) \in K.$ Now let us come back to the main proposition.

$\textbf {Proposition} :$ Let $f(X) \in K[X]$ be a monic polynomial of degree $n$ and let $x_1,x_2, \cdots , x_n \in L$ be all the zeros of $f$ in a finite field extension $L|K.$ Then $$D(f)= {V(x_1,x_2, \cdots , x_n)}^2 = \prod\limits_{1 \leq i < j \leq n} (x_j - x_i)^2.$$

In the proof of the above proposition the instructor wrote down an equality without giving any proper reasoning behind it. He said that $$D(f_n) (-a_1, \cdots , (-1)^r a_r , \cdots ,(-1)^n a_n ) = D(f_n) (x_1,x_2, \cdots , x_n).$$

But why is this always the case? What he wrote implies $$D(f_n)(x_1,x_2, \cdots , x_n) = D(f_n) (S_1(x_1,x_2, \cdots ,x_n), S_2(x_1,x_2, \cdots , x_n), \cdots , S_n (x_1,x_2, \cdots ,x_n)).$$

But I don't understand why it necessarily holds. For instance let $K= \Bbb Q$ and $L=\Bbb Q (\sqrt 2).$ Let $f=X^2-2 \in \Bbb Q[X].$ Then $f$ splits completely into linear factors in $L[X].$ The zeros of $f$ are $\pm \sqrt 2 \in L.$ Let $x_1 = \sqrt 2$ and $x_2 = -\sqrt 2.$ Then $S_1(x_1,x_2) = x_1 + x_2 = \sqrt 2 - \sqrt 2 = 0$ and $S_2(x_1,x_2) = x_1x_2 = \sqrt 2 (- \sqrt 2) = -2.$ If the equality holds then we must have $D(f_2)(\sqrt 2 , - \sqrt 2) = D(f_2) (0,-2).$ But $D(f_2) (\sqrt 2, - \sqrt 2) = 8 \neq 4 = D(f_2) (0,-2).$ So the equality is false in general, and the proof of the above proposition as given in the lecture is not valid.
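
To spell out the arithmetic in this example (my own computation, using only $D(f_2) = (X_2 - X_1)^2$ from the definition above): $$\begin{align*} D(f_2)(\sqrt 2, -\sqrt 2) & = (-\sqrt 2 - \sqrt 2)^2 = (-2 \sqrt 2)^2 = 8, \\ D(f_2)(0,-2) & = (-2 - 0)^2 = 4. \end{align*}$$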

How can I get around the mistake in the lecture and prove the above proposition? Any suggestion regarding this will be highly appreciated.

What I observed is that the actual problem lies in the definition of the discriminant of a monic polynomial. Below is a way to prove the desired proposition after redefining the discriminant of a monic polynomial properly $:$

Let us first state the following theorem due to Jacobi without proof (the proof is very simple though!).

Theorem $:$ Let $V = V(X_1,X_2, \cdots , X_n) = \prod\limits_{1 \leq i < j \leq n} (X_j - X_i) \in K[X_1,X_2, \cdots , X_n],$ the Vandermonde determinant in $n$ unknowns $X_1,X_2, \cdots , X_n.$ Then for any $\sigma \in S_n$ $$\sigma (V) = \text{sgn} (\sigma)\ V$$ where $\text {sgn} (\sigma)$ is defined as follows $:$

$$ \text {sgn} (\sigma) = \left\{ \begin{array}{ll} 1 & \quad \text {if}\ \sigma\ \text {is even} \\ -1 & \quad \text {if}\ \sigma\ \text{is odd} \end{array} \right. $$
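
As a small check of the theorem (an illustration only, for $n=3$ and the transposition $\sigma = (1\ 2),$ with $\sigma$ acting by $X_i \mapsto X_{\sigma(i)}$): $$\begin{align*} V & = (X_2 - X_1)(X_3 - X_1)(X_3 - X_2), \\ \sigma(V) & = (X_1 - X_2)(X_3 - X_2)(X_3 - X_1) = -V, \end{align*}$$ which agrees with $\text{sgn}(\sigma) = -1,$ since a transposition is odd.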

With the help of the above theorem it is easy to see that $D(f_n),$ the discriminant of the general monic polynomial of degree $n,$ is fixed by every permutation $\sigma \in S_n.$ Indeed, $D(f_n) = V^2 = \prod\limits_{1 \leq i < j \leq n} (X_j - X_i)^2 \in K[X_1,X_2, \cdots , X_n].$ Every $\sigma \in S_n$ extends to an automorphism of $K(X_1,X_2, \cdots ,X_n)$ defined by $X_i \mapsto X_{\sigma(i)}$ for all $i=1,2,\cdots , n$ and leaving all elements of $K$ fixed, and then we have $$\sigma (D(f_n)) = \sigma (V^2) = {\sigma (V)}^2 = {\text {sgn}(\sigma)}^2\ V^2 = V^2,$$ because for any permutation $\sigma \in S_n$ we have ${\text {sgn}(\sigma)}^2 = 1.$ This shows that $D(f_n)$ is a symmetric polynomial in $X_1,X_2, \cdots , X_n.$

So by the Fundamental Theorem of Symmetric Polynomials (also known as Newton's theorem) it follows that there exists a unique polynomial $D \in K[X_1,X_2, \cdots , X_n]$ such that $D(f_n) = D(S_1,S_2, \cdots , S_n),$ where $S_i$ is the $i$-th elementary symmetric polynomial in $X_1,X_2, \cdots , X_n.$

Now let $f = X^n + a_1 X^{n-1} + \cdots + a_n \in K[X]$ be a monic polynomial. Let us denote the discriminant of $f$ by $\text {Disc} (f)$ (to avoid confusion with the $D$ I have already defined). Then $\text {Disc} (f)$ is defined as follows $:$ $$\text {Disc} (f) : = D(-a_1, \cdots , (-1)^i a_i, \cdots , (-1)^na_n).$$
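
To make this definition concrete, here is the case $n=2$ worked out (a sketch of my own, not from the lecture; $Y_1, Y_2$ are just names I use for the arguments of $D$). We have $$D(f_2) = (X_2 - X_1)^2 = (X_1 + X_2)^2 - 4X_1X_2 = S_1^2 - 4S_2,$$ so $D(Y_1,Y_2) = Y_1^2 - 4Y_2,$ and for $f = X^2 + a_1X + a_2$ we get $$\text {Disc} (f) = D(-a_1, a_2) = (-a_1)^2 - 4a_2 = a_1^2 - 4a_2,$$ which is the familiar discriminant of a monic quadratic.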

With the help of the revised definition of Discriminant of a Monic Polynomial it is now very easy to prove the desired proposition.

Let $x_1,x_2, \cdots , x_n$ be the zeros of $f$ lying in some finite field extension $L|K.$ Then we first note that $$S_r (x_1,x_2, \cdots , x_n) = (-1)^r a_r$$ for $r=1,2, \cdots , n.$ Then we have $$\begin{align*} \prod\limits_{1 \leq i < j \leq n} (x_j - x_i)^2 & = D(f_n) (x_1,x_2, \cdots , x_n)\\ & = D(S_1(x_1,x_2, \cdots , x_n), S_2(x_1,x_2, \cdots , x_n), \cdots , S_n(x_1,x_2, \cdots , x_n))\\ & = D(-a_1, \cdots , (-1)^i a_i , \cdots , (-1)^na_n)\\ & = \text {Disc} (f). \end{align*}$$

So we have $\text {Disc} (f) = \prod\limits_{1 \leq i < j \leq n} (x_j - x_i)^2 = {V(x_1,x_2, \cdots , x_n)}^2,$ as required.

This completes the proof of the proposition.
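
As a sanity check on the example $f = X^2 - 2$ over $\Bbb Q$ from before (again my own computation, using the $n=2$ identity $(X_2 - X_1)^2 = S_1^2 - 4S_2,$ so that $D(Y_1, Y_2) = Y_1^2 - 4Y_2$ with $Y_1, Y_2$ merely names for the arguments of $D$): here $a_1 = 0,$ $a_2 = -2$ and the zeros are $\pm \sqrt 2,$ so $$\text {Disc} (f) = D(-a_1, a_2) = D(0,-2) = 0^2 - 4(-2) = 8 = (-\sqrt 2 - \sqrt 2)^2,$$ in agreement with the proposition.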


