What I'd like to say is, to some degree, commentary and expansion on what some people have said in the comments above.
I feel that the answer to (Q1), about whether there is a "high-concept" explanation for the dual Steenrod algebra, is "no" at the current point in time. Even further, I feel that attempting to give one right now would be misleading. We can assign some high-concept descriptions to phenomena in terms of formal group laws, but trying to give a high-concept reason why the Steenrod algebra takes the form it does would be misinterpreting the role of the Steenrod algebra.
We construct the stable homotopy category from the category of topological spaces. Based on its construction, we can say a lot of things in older and more modern language. It's a tensor triangulated category; it's the homotopy category of a symmetric monoidal stable model category; it's the homotopy category of a stable $\infty$-category; it's some kind of "universal" way to construct a stable category out of the category of topological spaces. Based on this, we can describe a lot of the foundational properties of the stable homotopy category and the category of spectra. However, there are a ton of categories that satisfy basically the same properties. Without doing more work, we don't understand anything about specifics that distinguish the stable homotopy category from any other example.
These kinds of computations originate with the Hurewicz theorem, the Freudenthal suspension theorem, Serre's computation of the cohomology of Eilenberg-Mac Lane spaces and his method for computing homotopy groups iteratively, and Adams' method of stepping away from Postnikov towers and upgrading Serre's method into the Adams spectral sequence. Before these, you would have no reason to necessarily believe that the stable homotopy category isn't equivalent to, say, the derived category of chain complexes over $\mathbb{Z}$ or some other weird differential graded algebra. Before you have these, you don't have Milnor's computation of the homotopy groups of $MU$ or $MU \wedge MU$, and you don't have Quillen's interpretation in terms of the Lazard ring.
In short, understanding the Steenrod algebra and its dual is a prerequisite for all the qualitative things we understand about the stable homotopy category that distinguish it from other examples. Right now there is not a door to the stable homotopy category that comes from formal-group data, as much as we would like one. As a result, I feel that trying to assign a high-concept description to the Steenrod algebra is backwards right now.
The fact that often computations come first and the conceptual interpretations second is something that gives the subject some of its flavor. Are there high-concept explanations why vector bundles should have Stiefel-Whitney classes? Why complex vector bundles should have Chern classes? Why the "quadratic" power operations in mod-$2$ cohomology should generate all cohomology operations, and why all the relations are determined by those coming from composing two?
This is definitely not to say that such a description wouldn't be desirable. It would be very desirable to have a more direct route from concepts to the stable homotopy category, because constructing objects that realize conceptual descriptions can be very difficult.
Yes, and no. There is a "categorically comprehensive" reason for this trace map to exist, but not necessarily to construct it. And to prove that this reason is valid, one does require categorical machinery. The elevator speech answer is:
THH is an algebra in some symmetric monoidal category. K is the unit in this symmetric monoidal category, so THH receives a unique algebra map from K-theory.
(If you are looking for a less theoretical reason, and want some explicit constructions, you might take a look at Kantorovitz-Miller, "An explicit description of the Dennis trace map.")
Everything I write below, I learned from Blumberg-Gepner-Tabuada, "Uniqueness of the multiplicative cyclotomic trace."
First, we note that both THH and K define functors
$\infty Cat^{perf} \to Spectra$.
On the right-hand side is the category of spectra. (Take any model you like, so long as it's not the homotopy category of the model. You can take Lurie's $\infty$-categorical model, or symmetric spectra if you like.)
On the left-hand side is the category of perfect stable $\infty$-categories. Roughly, these are the categories that look like modules over some ring spectrum. A different way you might describe this category is as follows: the category of spectrally enriched categories, localized with respect to Morita equivalence.
Note that both categories--$\infty Cat^{perf}$ and $Spectra$--have a symmetric monoidal structure. The latter has the usual smash product, while the former has the tensor product of stable $\infty$-categories. This is given by the cocompletion of the following naive tensor product: Given two categories $A$ and $B$, the objects of $A \otimes^{naive} B$ are pairs of objects $(a,b)$, and the hom spectrum between $(a,b)$ and $(a',b')$ is given by $hom(a,a') \wedge hom(b,b')$.
Moreover, we note that both THH and K satisfy the following properties:
(1) They are lax monoidal. (In fact, THH is symmetric monoidal.) This means that we have specified natural maps $K(A) \otimes K(B) \to K(A \otimes B)$, but these need not be equivalences.
(2) They are localizing: If we have a short exact sequence of categories $A \to B \to C$, we have a cofibration sequence of spectra $K(A) \to K(B) \to K(C)$, and likewise for THH. The proof of this for THH can be found in Blumberg-Mandell ("Localization theorems in topological Hochschild homology and topological cyclic homology").
Now, consider the category of all functors $\infty Cat^{perf} \to Spectra$ satisfying (2). One can construct a symmetric monoidal structure on this category. It turns out that any functor further satisfying (1) can be made into an $E_\infty$ algebra in this category, and that K-theory is in fact the unit! Since THH satisfies (1) and (2), the corresponding algebra for THH receives a unique algebra map from K-theory.
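The uniqueness statement can be phrased as one line. This is only a sketch: the notation $\mathrm{Fun}^{loc}$ for the category of localizing functors is introduced here for illustration and is not taken from the source, and the precise hypotheses are those of Blumberg-Gepner-Tabuada. The general fact being used is that in any symmetric monoidal $\infty$-category the unit is the initial $E_\infty$ algebra, so if $K$ is the unit then

$$\mathrm{Map}_{\mathrm{Alg}_{E_\infty}(\mathrm{Fun}^{loc})}(K, F) \simeq \ast \quad \text{for every } E_\infty \text{ algebra } F, \text{ in particular for } F = \mathrm{THH}.$$

Here "unique" means unique up to a contractible space of choices, which is the sense in which the algebra map $K \to \mathrm{THH}$ above is unique.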
When $A$ is an $E_\infty$ ring, then $THH(A)$ is an $E_\infty$ ring as well; by the algebra map from $K$ to $THH$, one obtains an $E_\infty$ ring map $K(A) \to THH(A)$.
More generally, if $A$ is an $E_n$-algebra, then $K(A) = K(AMod)$ is an $E_{n-1}$ ring. This is because the category of $A$-modules can be given an $E_{n-1}$-structure, and K-theory is lax monoidal. Moreover, you can prove that $THH(A)$ has an $E_{n-1}$ structure as well (you can see this using factorization homology, for instance). The fact that there is an algebra map $K \to THH$ implies that one also obtains an $E_{n-1}$-algebra map $K(A) \to THH(A)$.
Alright, here is the promised answer. First, the Hopkins-Mahowald theorem states that $\mathbb{F}_p$ is the free $\mathbb{E}_2$-ring with $p=0$, i.e. it is the homotopy pushout in $\mathbb{E}_2$-rings of $S^0 \leftarrow \mathrm{Free}_{\mathbb{E}_2}(x) \rightarrow S^0$, where $x \mapsto 0$ under the first arrow and $x \mapsto p$ under the second. Equivalently (after $p$-localization), if we let $S^1 \to \mathrm{BGL}_1(S^0_{(p)})$ be adjoint to $1-p \in \pi_0(S^0)_{(p)}^{\times}$, extend to a double loop map $\Omega^2S^3 \to \mathrm{BGL}_1(S^0_{(p)})$, and take the Thom spectrum, we get $\mathrm{H}\mathbb{F}_p$. At the prime 2 there are many references, and the original proof is due to Mahowald; it appears, for example, in his papers "A new infinite family in $\pi_*S^0$" and "Ring spectra which are Thom complexes". At odd primes this is due to Hopkins-Mahowald, but it's hard to track down the "original" reference; it appears, for example, in Mahowald-Ravenel-Shick, "The Thomified Eilenberg-Moore Spectral Sequence", as Lemma 3.3. In both cases the result is stated in the Thom spectrum language. And, again, let me reiterate: this is a non-formal result and essentially equivalent to Bökstedt's original computation. One always needs to know Steinberger's result (proved independently in Bökstedt's manuscript) that $Q_1\tau_i = \tau_{i+1}$ mod decomposables in $\mathcal{A}_*$, where $Q_1$ is the top $\mathbb{E}_2$-operation. (I suppose one could get away with slightly less: one needs to know that $\tau_0$ generates $\mathcal{A}_*$ under the $\mathbb{E}_2$-Dyer-Lashof operations.)
Anyway, given this result, let's see how to compute THH. In Blumberg-Cohen-Schlichtkrull (https://arxiv.org/pdf/0811.0553.pdf) they explain how to compute THH of any $\mathbb{E}_1$-Thom spectrum, with increasingly nice answers as we add multiplicative structure. For an $\mathbb{E}_2$-Thom spectrum (i.e. one arising from a double loop map $\xi$ out of $X$) we have that $\mathrm{THH}(X^{\xi}) \simeq X^{\xi} \wedge BX^{\eta B\xi}$, where $\eta B\xi$ is the composite $BX \to B(\mathrm{BGL}_1(S^0)) \stackrel{\eta}{\to} \mathrm{BGL}_1(S^0)$. In our case $X = \Omega^2 S^3$ and $BX \simeq \Omega S^3$, so $\mathrm{THH}(\mathbb{F}_p) \simeq \mathbb{F}_p \wedge (\Omega S^3)^{\eta B\xi}$. Thom spectra are colimits, and smash products commute with those, so we can compute this smash product as the Thom spectrum of $BX \to B(\mathrm{BGL}_1(S^0)) \to \mathrm{BGL}_1(S^0) \to \mathrm{BGL}_1(\mathbb{F}_p)$. But $\eta \mapsto 0$ in $\pi_*\mathrm{H}\mathbb{F}_p$, so this is the trivial bundle, and we deduce the result we're after:
$\mathrm{THH}(\mathbb{F}_p) \simeq \mathbb{F}_p \wedge \Omega S^3_+ \simeq \mathbb{F}_p[x_2]$.
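To unpack the last equivalence on homotopy groups, here is a short worked check; it uses only the standard computation of the homology of a loop space on a suspension, and the generator name $x_2$ follows the formula above. Since $\Omega S^3 = \Omega \Sigma S^2$, its homology is the tensor algebra on $\tilde{H}_*(S^2;\mathbb{F}_p)$, which has a single generator in degree 2, so

$$\pi_*\mathrm{THH}(\mathbb{F}_p) \cong \pi_*\left(\mathbb{F}_p \wedge \Omega S^3_+\right) \cong H_*(\Omega S^3;\mathbb{F}_p) \cong \mathbb{F}_p[x_2], \qquad |x_2| = 2.$$

In other words, the homotopy of $\mathrm{THH}(\mathbb{F}_p)$ is a copy of $\mathbb{F}_p$ in each non-negative even degree and zero otherwise. This check only tracks the additive identification; matching the ring structures takes the multiplicative form of the argument.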
It's worth giving the same argument again but in a more algebraic way, and, as a bonus, you can basically see why the Blumberg-Cohen-Schlichtkrull result holds.
The point (which is very well explained in a more general setting in Theorem 5.7 of Klang's paper here: https://arxiv.org/pdf/1606.03805.pdf) is that $\mathbb{F}_p$, as a module over $\mathbb{F}_p \wedge \mathbb{F}_p^{op}$, is obtained by extending scalars from $S^0$ as an $S^0[\Omega S^3_+]$-module.
Indeed, start with the pushout of $\mathbb{E}_2$-algebras $S^0 \leftarrow \mathrm{Free}_{\mathbb{E}_2}(x) \to S^0$ and smash with $\mathbb{F}_p$ to get a pushout of $\mathbb{E}_2$-$\mathbb{F}_p$-algebras. But now both maps in the pushout are the augmentation, and we see that the map $\mathbb{F}_p \wedge \mathbb{F}_p \simeq \mathbb{F}_p \wedge \mathbb{F}_p^{op} \to \mathbb{F}_p$ is equivalent to the map $\mathrm{Free}_{\mathbb{E}_2-\mathbb{F}_p}(\Sigma x) \to \mathbb{F}_p$ which is just the augmentation. This is tensored up from the map $\mathrm{Free}_{\mathbb{E}_2}(\Sigma x) \to S^0$, so we get that $\mathrm{THH}(\mathbb{F}_p) \simeq \mathbb{F}_p \otimes_{\mathbb{F}_p \wedge \mathbb{F}_p} \mathbb{F}_p \simeq S^0 \otimes_{\mathrm{Free}_{\mathbb{E}_2}(\Sigma x)} \mathbb{F}_p \simeq S^0 \otimes_{\mathrm{Free}_{\mathbb{E}_2}(\Sigma x)} S^0 \otimes_{S^0} \mathbb{F}_p$.
The left-hand factor is given by $\mathrm{Free}_{\mathbb{E}_1}(\Sigma^2x)$, which you can see in various ways. For example, this bar construction is the suspension spectrum of the relative tensor product in spaces $* \otimes_{\Omega^2 S^3} *$, which is the classifying space construction and yields $B(\Omega^2S^3) \simeq \Omega S^3$.
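As a sanity check that the two descriptions of this factor agree, one can compare them via the James splitting. The display below is a sketch using only standard facts: $\Omega S^3 = \Omega\Sigma S^2$ is the free $\mathbb{E}_1$-space on the pointed space $S^2$, and $\Sigma^\infty_+$ carries it to the free $\mathbb{E}_1$-ring on the spectrum $S^2$.

$$\mathrm{Free}_{\mathbb{E}_1}(\Sigma^2 x) \simeq \bigvee_{n \ge 0} (S^2)^{\wedge n} \simeq \bigvee_{n \ge 0} S^{2n} \simeq \Sigma^\infty_+ \Omega \Sigma S^2 = \Sigma^\infty_+ \Omega S^3.$$

Smashing with $\mathbb{F}_p$ then gives $\mathrm{THH}(\mathbb{F}_p) \simeq \mathbb{F}_p \wedge \Omega S^3_+ \simeq \bigvee_{n \ge 0} \Sigma^{2n}\mathbb{F}_p$, in agreement with the Thom spectrum computation.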