Please correct me if I am wrong, but as far as I can tell, you are a beginning abstract algebra student. To me, many of your prognostication questions amount to "why wasn't it immediately obvious to me to notice things that took hundreds of years and work by really smart mathematicians to build up?" I'm just an undergrad studying math, but one thing that has been consistently true throughout my mathematical education is that patience is essential.
Yes, a lot of the proof methods you will see in algebra are very clever. That's because most of the proofs you are reading have been around for a very long time, and a lot of very accomplished mathematicians have had a chance to comb over them and tighten them into very neat and efficient proofs. This has benefits and drawbacks from a pedagogical standpoint; in a sense, while the proof is likely much more readable, it has the look and feel of being pulled out of thin air. It seems to me that this is what you are struggling with.
To that end, I think two things will help you. One is, as I mentioned above, patience. In all but rare circumstances, you will start to understand and anticipate these kinds of proofs much more rapidly the more problems you do and the more mathematics you read. There is simply no rushing this. It's a matter of familiarity. If you keep doing lots of examples, I can assure you, the key insights in a lot of these problems will become less mysterious - in part because you've seen them before, and in part because your intuition and ingenuity will have begun to develop.
The other is that when you are reading a proof, take a moment to read the statement first. Try proving the proposition in question yourself. One big thing I (and many others) advocate for is to do sample computations to verify that the statement you're trying to prove is true. This is great because in doing sample computations, you often get a really good sense of what you might need to do to prove something in general, while still having your hands on something concrete. If you can't prove something, read a few lines of the proof. Once you see where it is going, try to finish the proof yourself. Repeat as necessary. I think this will certainly help in making proofs seem a bit more motivated.
Regarding your specific questions:
1&3. How did someone know that the union of two subgroups is a subgroup under precisely these conditions? Well, it's not so simple. At some point, mathematicians sat down and probed the question of when a union of two subgroups is a subgroup. They likely looked at many examples, thought about the definitions, and as mathematicians are wont to do, noticed patterns, and made a conjecture thereof. Perhaps the first conjectures they made were wrong, and in trying to prove them, counterexamples arose. Perhaps after refining their conjectures, the "only if" statement was added. I am guessing here; I don't know exactly what led to this proof. This will be true of many of your prognostication questions. Like I said though, I think it will help to try to prove things yourself first. If you ask yourself the questions that lead to these proofs, like when the union of two subgroups is a subgroup, perhaps the statements themselves will seem less mysterious.
2&4. Proof by contradiction has nothing to do with negativity. It is a general method of proof, and there is no real pattern (as far as I can tell) for when one should use it. Things like proof by contradiction, proof by contrapositive, etc. are ways of looking at logically equivalent formulations of statements that may be easier to prove. Sometimes knowing the best way to prove something is trial and error. If proving something directly seems really hard, proving something by contradiction may make things substantially easier, and therefore be the way to go. It's a matter of saying, "Hmm, suppose this weren't true. Does anything weird happen from that?" In this case, we can see that if our statement isn't true, we can use multiplication by $hk$ or $kh$ to show something weird happens.
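As an illustration of the "sample computations" advice applied to the union question, here is a minimal sketch. It assumes we work in the cyclic groups $\mathbb{Z}_n$ written additively, so the product $hk$ from the proof becomes $h + k$; the helper names (`cyclic_subgroup`, `is_subgroup`, `check_union_claim`) are made up for this example. It checks that a union of two subgroups is a subgroup exactly when one contains the other, and that for $h \in H \setminus K$ and $k \in K \setminus H$ the element $h + k$ falls outside $H \cup K$.

```python
from itertools import combinations

def cyclic_subgroup(n, g):
    """Subgroup of Z_n (addition mod n) generated by g."""
    return frozenset((g * i) % n for i in range(n))

def is_subgroup(n, s):
    """A nonempty finite subset closed under addition mod n is a subgroup."""
    return bool(s) and all((a + b) % n in s for a in s for b in s)

def check_union_claim(n):
    """Check 'H ∪ K is a subgroup iff H ⊆ K or K ⊆ H' for all subgroups of Z_n."""
    # Every subgroup of Z_n is cyclic, so this set contains all of them.
    subgroups = {cyclic_subgroup(n, g) for g in range(n)}
    for H, K in combinations(subgroups, 2):
        union = H | K
        nested = H <= K or K <= H
        if is_subgroup(n, union) != nested:
            return False
        if not nested:
            # The "something weird": h + k escapes the union.
            for h in H - K:
                for k in K - H:
                    if (h + k) % n in union:
                        return False
    return True

# Concrete instance: H = {0, 2, 4} and K = {0, 3} inside Z_6.
H = cyclic_subgroup(6, 2)
K = cyclic_subgroup(6, 3)
print(sorted(H | K), is_subgroup(6, H | K))
print(check_union_claim(12))
```

A check like this proves nothing in general, but seeing $2 + 3 = 5$ land outside $\{0, 2, 3, 4\}$ suggests which element to look at when writing the contradiction argument.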
I really hope this post doesn't come off as condescending, because it isn't meant to in the least. Hopefully, however, it helps in answering the general question of "prognostication" that has appeared in many of your posts.
(1.) I don't understand your argument; you need to be clearer.
(2., 3.) A general way to produce a counterexample is to consider any group with a non-normal subgroup $H$, then try to obtain a normal subgroup $N$ by "adding" elements to $H$. Thus by construction $H$ is not normal, $N$ is normal, and $H \cap N = H$ is not normal.
In particular, since $G \trianglelefteq G$, we may simply let $N = G$. There is nothing special about $S_3$; it was probably chosen because it is the smallest group with non-normal subgroups.
Counterexample 2 uses a proper normal subgroup (only for educative purposes I guess, again letting $N = D_4$ would have been easier); I think the only thing you can envision here is why $N$ is normal.
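The $S_3$ counterexample with $N = G$ can be checked by brute force. This is only a sketch under assumptions: permutations of $\{0, 1, 2\}$ are stored as tuples, and the helper names (`compose`, `inverse`, `is_normal`) are invented for the illustration.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(G, S):
    """Check that g s g^{-1} stays in S for every g in G and s in S."""
    return all(compose(compose(g, s), inverse(g)) in S for g in G for s in S)

G = frozenset(permutations(range(3)))    # S_3 as permutations of {0, 1, 2}
H = frozenset({(0, 1, 2), (1, 0, 2)})    # identity and the transposition (0 1)
N = G                                    # G is always normal in itself

print(is_normal(G, N))       # N is normal
print(is_normal(G, H))       # H is not normal
print(H & N == H)            # the intersection is H itself
print(is_normal(G, H & N))   # so the intersection is not normal
```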
Best Answer
The intersection of two groups $(U,\odot_U)$ and $(V,\odot_V)$ is, first and foremost, the set $$ G = U \cap V \text{.} $$ To turn this into a group, one would need to define a suitable operation $\odot_G$ on $G$. But where is that operation supposed to come from? Since $U$ and $V$ can be completely different groups, which just happen to be constructed over two non-disjoint sets $U$ and $V$, it's not at all obvious how $\odot_G$ is supposed to be defined. Thus, the intersection of two groups is merely a set, not a group. The same holds of course for the union of two groups - again, where would the operation come from that turns the union into a group?
In fact, since we generally consider groups only up to isomorphism, i.e. we treat two groups $G_1,G_2$ as the same group $G$ if the only difference between the two is the names of the elements, the union or intersection of two groups isn't even well-defined. For any pair of groups $U,V$ we can find some set-theoretic representation of $U$ and $V$ such that $U \cap V = \emptyset$, and another such that $U \cap V \neq \emptyset$.
Now contrast this with the situation of two subgroups $U,V$ of some group $(H,\odot_H)$. In this case, we know that $\odot_U$ and $\odot_V$ are simply the restrictions of $\odot_H$ to $U$ and $V$ respectively, and the two operations will therefore agree on the intersection of $U$ and $V$. So we can very naturally endow the set $$ G = U \cap V $$ with the operation $$ \odot_G = \odot_H\big|_{U \cap V} = \odot_U\big|_{U \cap V} = \odot_V\big|_{U \cap V} \text{.} $$
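To make the contrast concrete, here is a small sketch with examples chosen purely for illustration (they are not from the discussion above). First, two different groups on the same underlying set $\{0,1,2,3\}$, namely $\mathbb{Z}_4$ under addition mod 4 and the Klein four-group realized as bitwise XOR, whose operations disagree on the shared elements. Second, two subgroups of $\mathbb{Z}_{12}$, which inherit the same operation, so their intersection is closed under it.

```python
def add_mod4(a, b):
    """Operation of Z_4 on the set {0, 1, 2, 3}."""
    return (a + b) % 4

def xor(a, b):
    """Operation of the Klein four-group on the same set {0, 1, 2, 3}."""
    return a ^ b

# Two unrelated groups that happen to share their underlying set.
U_set = {0, 1, 2, 3}
V_set = {0, 1, 2, 3}
common = sorted(U_set & V_set)

# Pairs on which the two operations give different answers.
disagreements = [(a, b) for a in common for b in common
                 if add_mod4(a, b) != xor(a, b)]
print(disagreements)  # non-empty: no canonical operation on the intersection

# Two subgroups of one ambient group, Z_12 under addition mod 12.
def add_mod12(a, b):
    return (a + b) % 12

U = {x for x in range(12) if x % 2 == 0}   # multiples of 2
V = {x for x in range(12) if x % 3 == 0}   # multiples of 3
G = U & V                                  # the intersection

# The restricted operation keeps G closed, and G has identity and inverses.
closed = all(add_mod12(a, b) in G for a in G for b in G)
has_identity = 0 in G
has_inverses = all((-a) % 12 in G for a in G)
print(sorted(G), closed, has_identity, has_inverses)
```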