I'll speak about their grammatical differences, leaving their proof- and model-theoretic differences for someone more qualified to discuss. Each of these logics has a vocabulary $V$, which is the set of symbols out of which its well-formed formulas (e.g. terms, sentences) are generated. One usually singles out a subset of $V$ as the set of logical vocabulary $V_L$. It is these $V_L$s that distinguish logics at the ground level, making it very transparent which is an extension of which. Let's see:
$V_L$(PL) = { '$\lnot$' , '$\land$' }
$V_L$(FOL) = $V_L$(PL) $\cup$ { '=' , ' $\forall_1$ ' } where $\forall_1$ quantifies over individuals
$V_L$(SOL) = $V_L$(FOL) $\cup$ { ' $\forall_2$' } where $\forall_2$ quantifies over properties (of individuals)
$V_L$(HOL) = $V_L$(FOL) $\cup$ { ' $\forall_n$' } where $\forall_n$ quantifies over yet higher-order properties
$V_L$(TT) = $V_L$(_OL) $\cup$ { ' $\lambda$' } where _OL is a _-order logic (usually _ > 0)
Of course, each of these systems could be defined in different ways, choosing different sets of logical vocabulary. This is just one way of going about it. Now, as you already said, each of these logics extends the ones coming before it. With this vocabulary talk we can give precise meaning to that:
Def. Logic A is an extension of logic B iff $V_L$(B) $\subset$ $V_L$(A).
In the event that the converse doesn't hold, A is said to be a proper extension of B.
Lastly, for specific examples of differences, consider these formulas:
PL: '$\phi \lor \lnot \phi$'
FOL: '$\forall x (x = x)$'
SOL: '$(a = b) \equiv \forall P (P(a) \leftrightarrow P(b))$'
TT: $\forall x ([\lambda x. x](x) = x)$
Each of these sentences is also valid for logics following it (the other direction doesn't hold, of course). Notice that higher-order logic is left out, because there is no sentence $\phi$ s.t. HOL $\models \phi$ but SOL $\not\models \phi$, due to the fact that the power-set operation is SOL-expressible (Hintikka 1995).
For corrections/suggestions, please leave a comment or simply edit this post.
First-order logic is the rules that determine which propositions follow from other propositions.
More rigorously we can say that "(classical) first-order logic" consists of (1) a set of rules for what well-formed formulas are, given a particular non-logical vocabulary, and (2) a transitive(ish) relation $\vdash$ between finite sets of wffs and wffs, intuitively denoting "this formula can be derived from these other formulas".
There are several different ways of defining the $\vdash$ relation of classical first-order logic, such as by a Hilbert system, or by natural deduction or sequent calculus (or for that matter we can define it to be identical to semantic entailment after developing some rudimentary model theory). These all define one and the same logic, at least according to one way of looking at it.
A first order theory is a particular set of axioms whose consequences (according to first-order logic) you're interested in.
In other words, a "first-order theory" consists of (a) a first-order vocabulary, from which the generic first-order logic rules derive a concept of "well-formed formula", and (b) a particular set of well-formed formulas over that vocabulary, which we we call the "axioms" of the theory.
Classical first-order logic shares its rules for how well-formed formulas look with some other logics, such as intuitionistic first-order logic. So if we have a first-order theory, we can -- at least in principle -- also consider which formulas can be derived from its axioms under the intuitionistic entailment rule. But in practice that rarely work well because when people design first-order theories to be used with the classical rules, they don't generally care to distinguish between formulations of the axioms that are classically equivalent but intuitionistically distinct. So the set of intuitionistic consequences of a theory that's made to be interpreted by classical logic can be somewhat random.
Best Answer
See page 56 : Definition I.5.9 (Soundness). A theory $\Gamma$ is sound iff, for all $\mathcal A ∈ \text {Wff}$ $\Gamma \vdash \mathcal A$ implies $\Gamma \vDash \mathcal A$.
To say that a first order theory $\Gamma$ is sound (page 58) means that the theory $\Gamma$ does not prove false formulae.
Thus (by contraposition): if $\Gamma \nvDash \varphi$, then $\Gamma \nvdash \varphi$.
But $\nvDash \lnot x=x$: the formula $x \ne x$ is clearly not valid. And thus, by soundness, the calculus cannot prove it, i.e. $\nvdash \lnot x=x$ (because every theorem of the calculus is also a provable in a theory $\Gamma$ whatever).