EDIT. Here is the part of the answer that has been rewritten:
We give below a short proof of the Fundamental Theorem of Galois Theory (FTGT) for finite degree extensions. We derive the FTGT from two statements, denoted (a) and (b). These two statements, and the way they are proved here, go back at least to Emil Artin (precise references are given below).
The derivation of the FTGT from (a) and (b) takes about four lines, but I haven't been able to find these four lines in the literature, and all the proofs of the FTGT I have seen so far are much more complicated. So, if you find either a mistake in these four lines, or a trace of them the literature, please let me know.
The argument is essentially taken from Chapter II (link) of Emil Artin's Notre Dame Lectures [A]. More precisely, statement (a) below is implicitly contained in the proof Theorem 10 page 31 of [A], in which the uniqueness up to isomorphism of the splitting field of a polynomial is verified. Artin's proof shows in fact that, when the roots of the polynomial are distinct, the number of automorphisms of the splitting extension coincides with the degree of the extension. Statement (b) below is proved as Theorem 14 page 42 of [A]. The proof given here (using Artin's argument) was written with Keith Conrad's help.
Theorem. Let $E/F$ be an extension of fields, let $a_1,\dots,a_n$ be distinct generators of $E/F$ such that the product of the $X-a_i$ is in $F[X]$. Then
the group $G$ of automorphisms of $E/F$ is finite,
there is a bijective correspondence between the sub-extensions $S/F$ of $E/F$ and the subgroups $H$ of $G$, and we have
$$
S\leftrightarrow H\iff H=\text{Aut}(E/S)\iff S=E^H
$$
$$
\implies[E:S]=|H|,
$$
where $E^H$ is the fixed subfield of $H$, where $[E:S]$ is the degree (that is the dimension) of $E$ over $S$, and where $|H|$ is the order of $H$.
PROOF
We claim:
(a) If $S/F$ is a sub-extension of $E/F$, then $[E:S]=|\text{Aut}(E/S)|$.
(b) If $H$ is a subgroup of $G$, then $|H|=[E:E^H]$.
Proof that (a) and (b) imply the theorem. Let $S/F$ be a sub-extension of $E/F$ and put $H:=\text{Aut}(E/S)$. Then we have trivially $S\subset E^H$, and (a) and (b) imply
$$
[E:S]=[E:E^H].
$$
Conversely let $H$ be a subgroup of $G$ and set $\overline H:=\text{Aut}(E/E^H)$. Then we have trivially $H\subset\overline H$, and (a) and (b) imply $|H|=|\overline H|$.
Proof of (a). Let $1\le i\le n$. Put $K:=S(a_1,\dots,a_{i-1})$ and $L:=K(a_i)$. It suffices to check that any $F$-embedding $\phi$ of $K$ in $E$ has exactly $[L:K]$ extensions to an $F$-embedding $\Phi$ of $L$ in $E$; or, equivalently, that the polynomial $p\in\phi(K)[X]$ which is the image under $\phi$ of the minimal polynomial of $a_i$ over $K$ has $[L:K]$ distinct roots in $E$. But this is clear since $p$ divides the product of the $X-a_j$.
Proof of (b). In view of (a) it is enough to check $|H|\ge[E:E^H]$. Let $k$ be an integer larger than $|H|$, and pick a
$$
b=(b_1,\dots,b_k)\in E^k.
$$
We must show that the $b_i$ are linearly dependent over $E^H$, or equivalently that $b^\perp\cap(E^H)^k$ is nonzero, where $\bullet^\perp$ denotes the vectors orthogonal to $\bullet$ in $E^k$ with respect to the dot product on $E^k$. Any element of $b^\perp\cap (E^H)^k$ is necessarily orthogonal to $hb$ for any $h\in H$, so
$$
b^\perp\cap(E^H)^k=(Hb)^\perp\cap(E^H)^k,
$$
where $Hb$ is the $H$-orbit of $b$. We will show $(Hb)^\perp\cap(E^H)^k$ is nonzero. Since the span of $Hb$ in $E^k$ has $E$-dimension at most $|H| < k$, $(Hb)^\perp$ is nonzero. Choose a nonzero vector $x$ in $(Hb)^\perp$ such that $x_i=0$ for the largest number of $i$ as possible among all nonzero vectors in $(Hb)^\perp$. Some coordinate $x_j$ is nonzero in $E$, so by scaling we can assume $x_j=1$ for some $j$. Since the subspace $(Hb)^\perp$ in $E^k$ is stable under the action of $H$, for any $h$ in $H$ we have $hx\in(Hb)^\perp$, so $hx-x\in(Hb)^\perp$. Since $x_j=1$, the $j$-th coordinate of $hx-x$ is $0$, so $hx-x=0$ by the choice of $x$. Since this holds for all $h$ in $H$, $x$ is in $(E^H)^k$.
[A] Emil Artin, Galois Theory, Lectures Delivered at the University of Notre Dame, Chapter II, available here.
PDF version: http://www.iecl.univ-lorraine.fr/~Pierre-Yves.Gaillard/DIVERS/Selected_Texts/st.pdf
Here is the part of the answer that has not been rewritten:
Although I'm very interested in the history of Galois Theory, I know almost nothing about it. Here are a few things I believe. Thank you for correcting me if I'm wrong. My main source is
http://www-history.mcs.st-and.ac.uk/history/Projects/Brunk/Chapters/Ch3.html
Artin was the first mathematician to formulate Galois Theory in terms of a lattice anti-isomorphism.
The first publication of this formulation was van der Waerden's "Moderne Algebra", in 1930.
The first publications of this formulation by Artin himself were "Foundations of Galois Theory" (1938) and "Galois Theory" (1942).
Artin himself doesn't seem to have ever explicitly claimed this discovery.
Assuming all this is true, my (probably naive) question is:
Why does somebody who makes such a revolutionary discovery wait so many years before publishing it?
I also hope this is not completely unrelated to the question.
Best Answer
Your usage of "elementary" is correct; your definition is the one that most number theorists would use. You don't have to take my word for it however; just consider the first sentence of Selberg's Elementary Proof of the Prime Number Theorem:
In this paper will be given a new proof of the prime-number theorem, which is elementary in the sense that it uses practically no analysis, except the simplest properties of the logarithm.
Ironically, of the many known proofs of the prime-number theorem, this elementary proof ranks as one of the most complicated.