Here are some further possibilites:
Propositional logic: can be interpreted in any Boolean algebra (assuming you are talking about classical logic here), but perhaps your notion of truth assignment allows for truth values as elements of Boolean algebras. Alternatively, via the Curry-Howard correspondence propositions may be interpreted as types and proofs as terms of those types.
Predicate logic (first-order and higher-order): can in general be interpreted in suitable categories or fibrations. For example, higher-order logic can be interpreted in a topos (which must be Boolean if your logic is classical).
In general, if you are looking for interpretations of various kinds of logic in ways other than first-order model theory, you should look at categorical logic. There various possibilities (multi-sorted vs. single-sorted, intuitionistic vs. classical, etc) are systematically considered. Also, the ad-hoc distinction between "full" vs. "Henkin-style" semantics (which depends on the idea that there is a "true" set theory in the background, as far as I can tell) is replaced with a much more advanced notions (such as properties of fibrations that are used to interpret logic). A place to start looking would be Bart Jacob's "Categorical logic and type theory", although it's not an easy reading. There should be some lecture notes on the internet that are more accessible, perhaps someone can suggest them?
I am not sure what you mean by "valid interpretation". If you just mean soundness (provable things are valid) then all of the possibilities mentiond are "valid". But I suspect you mean something else.
Your error is simply assuming that there is only one kind of higher-order logic (HOL). Contrary to your last paragraph, there are at least 2 major kinds, corresponding to 2 different semantics:
In full semantics, once you fix the domain (of individuals), you can no longer choose the interpretation of the higher-order sorts (starting with the subsets of individuals), because that is determined by the iterated power-sets of the domain (more precisely what the meta-system thinks are the power-sets). By the incompleteness theorem for PA, even second-order logic (SOL) with full semantics cannot be captured by a computable formal system.
In general semantics, you not only must choose the domain but also must choose the interpretation of each higher-order sort, and typically the only restriction is that it must respect the comprehension axioms that you want to have. For example if you want to have full higher-order comprehension, then you have the axiom ( $∃x∈S_{k+1}\ ∀t∈S_k\ ( t∈x ⇔ φ(t) )$ ) for each (higher-order) property $φ$ with one parameter from $S_k$, where $S_k$ denotes the level-$k$ sort. In SOL we usually require just predicative comprehension, namely where $φ$ only quantifies over $S_0$ (i.e. the individuals), which correspond to the first-order definable subsets of the domain.
Note that HOL with general semantics can be captured by multi-sorted FOL, and hence by one-sorted FOL (simply by using predicate-symbols to represent the sorts). It is in this precise technical sense that we "do not lose much by confining our attention to first order languages only". But it is wrong to infer that "it would yield a translation of equal length". Even merely extending an FOL system to the SOL system with predicative comprehension is equivalent to having internal definitorial expansion, which allows one to avoid repetition of long formulae.
Note that multi-sorted FOL (and the one-sorted translation) is semantically complete for HOL with general semantics, but of course not semantically complete for HOL with full semantics. Moreover, the Henkin construction shows that if a countable HOL system $T$ (i.e. with countable language) has a general model, then it also has a countable general model.
Furthermore, the speedup theorem you linked to does not actually show a speedup attributable to the use of HOL over FOL! Note that the proof in that SEP article uses additional assumptions, namely the higher-order induction schema instead of just the first-order induction schema. That speedup is due to the ω-incompleteness of PA rather than anything to do with HOL, as shown clearly by the proof. This is also obvious from the fact that if PA proves Q then PA+Q is still an FOL system and proves Q with a very short proof...
Best Answer
This is a long list of questions! These are all related to a certain extent, but you might consider breaking it up into separate questions next time.
Proof theorists tend to prefer systems with many rules and few axioms such as natural deduction systems and Gentzen systems. The reason is that these are much easier to manipulate and visualize. (Look at the statement and proof of the Cut Elimination Theorem for an illustrative example.) Model theorists tend to prefer lots of axioms and just modus ponens, like Hilbert systems. The reason is that the semantics of such systems are easy to handle, and semantics is what model theorists really care about.
The real difference between propositional and first-order logic is quantification. Quantifiers are naturally more expressive than logical connectives.
There are many, many flavors of higher-order logic. The two main classifications are set-based systems and function-based systems. There are variants which don't really fit this division and there are variants which mix both. With strong enough internal combinatorics, these are all equivalent.
The set-based system have a set A of type 0 objects, which are considered atomic. Type 1 objects are elements of the powersets ℘(An). When the base theory has an internal pairing function (like arithmetic and set theory), the exponent n can be dropped. Then, type 2 objects are elements of the second powerset ℘(℘(A)), and similarly for higher types. Set-based higher types are somewhat inconvenient without an internal pairing function.
Function-based systems are similar, with functions An→A as type 1. Again, with an internal pairing function, the higher types streamline to A → A, (A → A) → A, ((A → A) → A) → A, etc. However, it is common to use more complex types in these systems.
Be careful how you read Gödel's Incompleteness Theorem. For first-order logic, the hypotheses state that the theory in question must be recusively enumerable and that it must be powerful enough to interpret a reasonable amount of arithmetic. Those are important hypotheses. (And they explain why no such theorem exists for propositional logic.) There are many variants for higher-order systems, but you need to be even more careful when stating them.
You can appreciate the difference between the different levels of higher-order logic by reading about Gödel's Speed-Up Theorems.
Most of mathematics today is based on set theory, at least in a theoretical sense. In set theory, higher types are interpreted internally using powersets and sets of functions. They also extend transfinitely and this transfinite hierarchy of higher types was shown necessary to prove theorems very low in the hierarchy, such as Martin's Borel Determinacy Theorem.
You may want to read on categorical logic.