All condescension aside, my first thought was that, in fact, category theory is an incredibly useful tool and language. As such, many of us want to read CWM so that we can understand various constructions in other fields (for instance the connection between monadicity and descent, or the phrasing of various homotopy theory ideas as coends, not to mention just basic pullbacks, pushforwards, colimits, and so on). So it is in fact relevant WHY you want to read it.
As an undergraduate, I started reading CWM, with minimal success. The problem was primarily that, as you say, I had very few examples. I thought the notion of a group as a category with one object (in which every morphism is invertible) was rather neat, but I couldn't really understand adjunctions, over(under)-categories, colimits or some of the other real meat of category theory in any deep, meaningful fashion until I began to have some examples to apply.
In my opinion, it is not fruitful to read CWM straight up. It's like drinking straight liquor. You might get really plastered (or in this analogy, excited about all the esoteric-looking notation and words like monad, dinatural transformation, 2-category), but the next day you'll realize you didn't really accomplish anything.
What is the rush? Don't read CWM. Read Hatcher's Algebraic Topology, read Dummit and Foote, read whatever the standard texts are in differential geometry, or Lie groups, or something like that. Then, you will see that category theory is a lovely generalization of all the nice examples you've come to know and love, and you can build on that.
$Cat(A,B)$ is the set of functors from the category $A$ to the category $B$. In general, $\mathcal{C}(A,B)$ denotes the hom-set between objects $A,B$ in the category $\mathcal{C}$. Since there is a "category of categories", something like $Cat(A \times B, C) \cong Cat(A, C^B)$ is saying that there is a bijection between sets of functors, sending $F:A\times B \rightarrow C$ to a functor $G: A \rightarrow C^B$. (This is exactly like saying $$Maps(A \times B, C) \cong Maps(A, C^B)$$ in $Sets$, except the functions are now functors!) What is $G$? Well, $C^B$ is a functor category. The objects are functors, and so given an object $a$ of $A$, $G(a)$ had better be a functor from $B$ to $C$. What functor do we take? The only option is $F(a,-)$. But $G$ also needs to do something to morphisms. So a morphism $k:a \rightarrow a'$ needs to go to a natural transformation $F(a,-) \rightarrow F(a',-)$. But this comes built in to the definition of $F$, since $F$ is functorial in both slots: the component at an object $b$ of $B$ is $F(k,1_b)$. We can also go back. Given such a $G$, we can build an $F$, and do it in a way that the two constructions are mutually inverse.
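If it helps to see the $Sets$ version concretely, here is a minimal Python sketch of the bijection $Maps(A \times B, C) \cong Maps(A, C^B)$. The sets `A`, `B` and the map `f` are made up purely for illustration; only the two constructions `curry` and `uncurry` mirror the argument above.

```python
from itertools import product

def curry(f):
    """Send f : A x B -> C to g : A -> C^B, where g(a) = f(a, -)."""
    return lambda a: (lambda b: f(a, b))

def uncurry(g):
    """Send g : A -> C^B back to f : A x B -> C, where f(a, b) = g(a)(b)."""
    return lambda a, b: g(a)(b)

# Hypothetical finite sets and a sample map, chosen only for illustration.
A = [0, 1, 2]
B = ["x", "y"]

def f(a, b):
    return (a, b.upper())

g = curry(f)

# Going to g and back recovers f, checked pointwise on A x B.
round_trip_ok = all(
    uncurry(curry(f))(a, b) == f(a, b) for a, b in product(A, B)
)
```

Here `g(1)` plays the role of $F(a,-)$: it is itself a function of the second slot only.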
Now, you can then view $Cat(A,B)$ as a category (a functor category), with the functors as objects and the natural transformations between them as morphisms. The above construction doesn't "come with" a way to send morphisms (natural transformations) to morphisms. But given a natural transformation between bifunctors $S \Rightarrow T:A \times B \rightarrow C$, I would believe that one could construct a natural transformation between the adjoints, which I'll denote $\hat{S}$ and $\hat{T}$. If $\eta$ is the natural transformation from $S$ to $T$, then define $\hat{\eta}$ at $a$ to be... what? Well, it needs to go from $\hat{S}(a)$ to $\hat{T}(a)$. These are objects in the functor category $Cat(B,C)$, so $\hat{S}(a)$ is a functor from $B$ to $C$, as is $\hat{T}(a)$. So $\hat{\eta}_a$ should be a natural transformation! What should the natural transformation be? Well, I would use $\eta_{(a,-)}$, whose component at an object $b$ of $B$ is $\eta_{(a,b)}: S(a,b) \rightarrow T(a,b)$.
I don't have any reason to think that this $\hat{\cdot}$ operation won't induce a bijection on natural transformations, and so those two functor categories are indeed isomorphic, which you could write as $$C^{A\times B} \cong (C^B)^A.$$ But you can probably imagine that writing out the details is a little tedious! The trick is to at all times carefully keep track of what type of objects you're working with, what the morphisms should be, etc.
Note that naturality of the bijection $Cat(A\times B,C) \cong Cat(A,C^B)$ didn't enter into this argument, since we never had to think about any categories other than $A,B,$ and $C$. So what does that mean? Go back to the thing that took a functor $F:A \times B \rightarrow C$ and spat out a functor $G:A \rightarrow C^B$. Suppose you had a functor $H:A' \rightarrow A$. You can use this to build a functor $H\times 1:A' \times B \rightarrow A \times B$. Suppose you wanted to know what happens when you apply all this stuff to the composite $F(H\times 1):A' \times B \rightarrow C$. Without naturality, you would have to go through all that computation over again!
But that's where naturality saves the day. Naturality says that if you've computed one thing, you can compute pre- and post- compositions by just... composing! So the result from applying all this to $F(H \times 1)$ is just $GH: A' \rightarrow A \rightarrow C^B$. As you read the book, you'll notice that naturality is a very powerful condition, since it imposes so many relations on the transformation!
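The same statement can be checked by hand in the $Sets$ analogue. This is only a sketch with made-up finite sets and maps (`A_prime`, `B`, `H`, `F` are all hypothetical); it compares the transpose of $F(H \times 1)$ with $GH$ pointwise.

```python
def curry(f):
    """Send f : A x B -> C to g : A -> C^B, where g(a) = f(a, -)."""
    return lambda a: (lambda b: f(a, b))

# Hypothetical data, chosen only for illustration.
A_prime = [0, 1, 2, 3]
B = [10, 20]

def H(a_prime):
    """A map H : A' -> A."""
    return a_prime % 2

def F(a, b):
    """A map F : A x B -> C."""
    return a * b

def F_after_H_times_1(a_prime, b):
    """The composite F(H x 1) : A' x B -> C."""
    return F(H(a_prime), b)

# Left side: transpose the composite directly.
lhs = curry(F_after_H_times_1)

# Right side: transpose F once to get G, then just precompose with H.
G = curry(F)
rhs = lambda a_prime: G(H(a_prime))

naturality_ok = all(
    lhs(a_prime)(b) == rhs(a_prime)(b) for a_prime in A_prime for b in B
)
```

The point is that `rhs` reuses `G` without recomputing anything, which is what naturality in the $A$ variable buys you.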
Best Answer
What is meant by: "the universal property of the canonical projection $p:G \to G/N$", is that if we have $f \in \mathrm{Hom}(G,G')$ such that $f(N) = \{e_{G'}\}$, then there exists a unique morphism $f': G/N \to G'$ with $f = f'\circ p$. This is just the First Isomorphism Theorem in disguise; the $f'$ in question is $f'(gN) = f(g)$, which is well-defined since $f(n) = e_{G'}$ for all $n \in N$ (so if $gN = g'N$ then $g' = gn$ for some $n \in N$, thus $f(g') = f(gn) = f(g)f(n) = f(g)e_{G'} = f(g)$). This is often paraphrased as "$f$ factors through $p$".
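For a concrete instance of this factorization, here is a small Python sketch. The choice of $G = \mathbb{Z}/12$, $N = \{0,4,8\}$ and $f(g) = g \bmod 4$ is my own assumption for illustration; cosets are modelled as frozensets.

```python
n = 12
G = range(n)                 # Z/12 under addition mod 12 (illustrative choice)
N = frozenset({0, 4, 8})     # the subgroup generated by 4

def p(g):
    """Canonical projection G -> G/N, sending g to its coset g + N."""
    return frozenset((g + x) % n for x in N)

def f(g):
    """A homomorphism Z/12 -> Z/4 that kills N (identity element is 0)."""
    return g % 4

def f_prime(coset):
    """Induced map G/N -> Z/4, defined by f'(g + N) = f(g)."""
    values = {f(g) for g in coset}
    # Well-definedness: every representative of the coset gives the same value.
    assert len(values) == 1
    return values.pop()

kills_N = all(f(x) == 0 for x in N)
factors = all(f(g) == f_prime(p(g)) for g in G)
```

The `assert` inside `f_prime` is exactly the well-definedness check from the paragraph above: it would fail if `f` did not kill `N`.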
So the "normal way" of showing $G/N \cong (G/M)/(N/M)$ is to show that $f(gN) = (gM)(N/M)$ is a well-defined isomorphism. But let's look at this another way:
The map $p_N: G \to G/N$ is a group morphism that kills $M$ (since $M \subset N$), so $p_N$ factors through the map $p_M: G \to G/M$ so that $p_N = f \circ p_M$, for some morphism $f$. Note that this morphism $f$ goes from $G/M$ to $G/N$ and kills $N/M$ (for $n \in N$ we have $f(p_M(n)) = p_N(n)$, which is the identity of $G/N$), so it in turn factors through $p_{N/M}:G/M \to (G/M)/(N/M)$, that is $f = f' \circ p_{N/M}$ for some $f': (G/M)/(N/M) \to G/N$. But we also have the map $p_{N/M} \circ p_M$, which kills $N$, so there is a morphism $k: G/N \to (G/M)/(N/M)$ such that $p_{N/M} \circ p_M = k \circ p_N$.
So $k \circ f' \circ p_{N/M} \circ p_M = k \circ f \circ p_M = k \circ p_N = p_{N/M} \circ p_M$, that is: $k \circ f' = \mathrm{id}_{(G/M)/(N/M)}$.
Similarly, $f' \circ k \circ p_N = f' \circ p_{N/M} \circ p_M = f \circ p_M = p_N$, so that $f' \circ k = \mathrm{id}_{G/N}$ (using implicitly the fact that the projections are epimorphisms to justify the cancellation).
These two facts together imply that $k$ and $f'$ are inverses, and thus isomorphisms. The important thing about all of this is that we never mentioned any of the elements of $G$, or even any cosets, just homomorphisms between various groups.
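As a sanity check, the three induced maps $f$, $f'$ and $k$ can be built mechanically in a small case. The sketch below assumes $G = \mathbb{Z}/12$, $N = \{0,2,\dots,10\}$ and $M = \{0,4,8\}$ (my choices, not part of the argument), and the helper `induced` is a hypothetical name for "the unique map that a given map factors through". A computer has to enumerate elements, of course, but all three maps come from the same factoring step, just as in the proof.

```python
n = 12
G = list(range(n))                # Z/12 (illustrative choice)
N = frozenset(range(0, n, 2))     # subgroup {0, 2, ..., 10}
M = frozenset({0, 4, 8})          # subgroup contained in N

def p_N(g):
    """Projection G -> G/N."""
    return frozenset((g + x) % n for x in N)

def p_M(g):
    """Projection G -> G/M."""
    return frozenset((g + x) % n for x in M)

G_mod_M = {p_M(g) for g in G}
N_mod_M = frozenset(p_M(x) for x in N)   # the subgroup N/M of G/M

def p_NM(c):
    """Projection G/M -> (G/M)/(N/M); cosets are added via representatives."""
    return frozenset(p_M(min(c) + min(d)) for d in N_mod_M)

def induced(p, h, domain):
    """Return (as a dict) the unique h' with h = h' . p.

    Raises ValueError if h is not constant on the fibres of p,
    i.e. if h does not factor through p.
    """
    table = {}
    for x in domain:
        q = p(x)
        if q in table and table[q] != h(x):
            raise ValueError("h does not factor through p")
        table[q] = h(x)
    return table

f = induced(p_M, p_N, G)                            # p_N = f . p_M
f_prime = induced(p_NM, lambda c: f[c], G_mod_M)    # f = f' . p_NM
k = induced(p_N, lambda g: p_NM(p_M(g)), G)         # p_NM . p_M = k . p_N

k_after_f_prime_is_id = all(k[f_prime[q]] == q for q in f_prime)
f_prime_after_k_is_id = all(f_prime[k[c]] == c for c in k)
```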