I actually haven't read Kunen. But I have taught axiomatic set theory before. The basics you should know are probably the completeness and compactness theorems, the knowledge of Łoś theorem is probably also there somewhere. These already encompass a lot of basic model theory and logic.
From the recursion theory parts, you should know what does it mean that $\Gamma\vdash\varphi$ is a decidable statement. This means that you need to understand coding of first-order logic into the natural numbers, and all the mechanism. Knowledge of the incompleteness theorem usually encompasses all of this, and more which is necessary from a philosophical point of view to understand certain more advance parts of set theory (e.g. large cardinal axioms).
From the little I've glanced through Kunen, he is sometimes overly formalistic to my taste. This means that he would bring up things from the second paragraph above more than I'd care for as a reader, and from a pedagogical point of view I find them to be a burden on the student. I know, however, several people who would argue the opposite point of view. So this is really up to you decide what you prefer.
To that end, if you are like me, and you still insist on reading Kunen (rather than one of the alternatives), you might as well work backwards. When you run into a statement you don't understand, look it up, and read "backwards" until you fill in the blanks. Not everyone can do it, but you can give it a go.
Just know that all in all, there are no shortcuts. In the end, the things you don't do properly, you end up forgetting really fast and it ends up a waste of your time in general.
Yes, because the standard formalism for forcing that Kunen uses starts with a countable transitive model of $\sf ZFC$, rather than just any countable model, more care is needed. Whether there is a transitive model of $\sf ZFC$ is independent from $\sf ZFC + Con(ZFC),$ though once you have one, you can get a countable transitive model by Lowenheim-Skolem + Mostowski collapse.
On the other hand, the reflection schema allows you to construct, in $\sf ZFC$ alone, a countable transitive model of any finite subtheory of $\sf ZFC$. Analyzing the forcing formalism, if we have an arbitrary finite fragment $T$ of $\sf ZFC + \lnot CH,$ it's possible to find a finite fragment $T'$ of $\sf ZFC$ such that we can, working in $\sf ZFC,$ construct a countable transitive model $M$ of $T'$ and show the generic extension $M[G]$ satisfies $T.$
If $\sf ZFC + \lnot CH,$ were inconsistent, there would be some finite fragment witnessing this, and then the above procedure would produce a contradiction in $\sf ZFC$ alone. So, as a bonus, this is no mere infinitary proof of $\sf Con(ZFC)\to Con(ZFC+\lnot CH)$ using model theory within $\sf ZFC$, but rather a finitary syntactic proof of $\sf Con(ZFC)\to Con(ZFC+\lnot CH)$.
On a side note, it is possible to carry out an independence proof that starts out like you suggested. Note that in the transitive approach, the construction of $M[G]$ uses the (external) well-foundedness of $M$ in an essential way. An analogous construction in the not-necessarily-well-founded case is possible, but more complicated$^*$. Between that complication and the fact that transitive models are nicer to work with for all sorts of reasons, the transitive models approach is generally favored as an introduction, despite the mild metamathematical hiccups it causes.
$^*$ One approach that works is to construct the relevant Boolean-valued model inside $M$ and then take a quotient of it by any ultrafilter. But then again, Boolean-valued models already produce a nice relative consistency proof on their own. See also this answer for a brief description, and JDH's answer to the same question for more on the BVM approach.
Best Answer
The key issue is the choice of pairing function. Contra our usual experiences, in this case the way we implement pairing is quite crucial, and we need to use an approach whose existence is quite non-obvious: a "constructibly flat" pairing function.
A flat pairing function is a pairing function which does not increase rank unnecessarily: that is, such that $$rank(\langle a,b\rangle)=\max\{rank(a),rank(b),\omega\}$$ (we need to throw $\omega$ in their for cardinality reasons: there just aren't enough sets of rank $n$ for $n<\omega$ to have flatness at finite ranks). The usual example of a flat pairing function is the Quine-Rosser pairing function (but there are others).
Here we need a twist on flatness, which luckily Quine-Rosser also satisfies: namely, that $L_\alpha$ is closed under it whenever $\alpha$ is infinite. That is, we want a pairing function which doesn't increase $L$-rank (instead of $V$-rank).
This lets us construct "finite type objects over $L_\alpha$" in $L_{\alpha+1}$; for example, using the Quine-Rosser approach we have that each definable function $L_\alpha\times L_\alpha\rightarrow L_\alpha$ literally is an element of $L_{\alpha+1}$. The objects we want in this case are witnesses to satisfaction. There are various ways to approach the definition of "$F$ witnesses that $\varphi$ is true in $L_\omega$," including:
Skolem functions.
Appropriate subtrees of the "big syntax tree" of $\varphi$ ("big" in the sense that we unfold all quantifiers as infinite conjunctions/disjunctions).
Winning strategies in the "truth game."
The point is that each of the above approaches can be implemented in $L_{\omega+1}$: if $L_\omega\models\varphi$ then a witness to this exists in $L_{\omega+1}$. This then lets us built the set of all $\varphi$ for which witnesses to $L_\omega\models\varphi$ exist at $L_{\omega+2}$.
This leaves aside the issue of actually defining the relation "$F$ witnesses that $\varphi$ is true in $L_\omega$" (whichever of the approaches above we take). However, while technically the hard bit this isn't particularly counterintuitive - the naive approach (brute force) actually works. The counterintuitive feature was the implementation of ordered pairs in such a way that the $L$-rank could be kept low, which ironically was the technically trivial-ish part.