I myself find pure category theory for its own sake rather difficult to swallow, and prefer to think of it with actual examples of its use. So, let me give a few examples (historical) of how the abstraction of category theory led to significant mathematical advances.
There is the theory of etale cohomology. Etale cohomology is a variant of the standard sheaf cohomology one encounters in algebraic geometry courses.* The starting point is a categorical observation: the sheaf axioms are, fundamentally, functorial; a sheaf on a topological space $X$ is a contravariant functor from the category of open sets of $X$ (with the morphisms the inclusions) that satisfies a certain exactness property. When interpreted in this way, it is possible to talk about sheaves on a general category with a suitable notion of covering (i.e., a Grothendieck topology). If one uses different categories (for instance, the Zariski site gives regular sheaf cohomology, but the etale site gives etale cohomology) one can get different cohomology theories. (Incidentally, as another example, in the theory of the etale fundamental group, Grothendieck developed an abstract approach to Galois theory that not only clarifies the analogy between Galois theory and the classification of covering spaces, but allows one to construct purely categorically an algebraic $\pi_1$.)
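The "exactness property" in the description of sheaves above can be written out explicitly. As a sketch (the symbols $\mathcal{F}$, $U$ and $U_i$ are my notation, not the answer's): a presheaf $\mathcal{F}$ on $X$ is a sheaf if, for every open cover $\{U_i\}$ of an open set $U$, the diagram

$$\mathcal{F}(U) \longrightarrow \prod_{i} \mathcal{F}(U_i) \rightrightarrows \prod_{i,j} \mathcal{F}(U_i \cap U_j)$$

is an equalizer, where the two parallel maps restrict a section on $U_i$, respectively on $U_j$, to $U_i \cap U_j$. On a general site one replaces $U_i \cap U_j$ by the fibre product $U_i \times_U U_j$, so the condition only refers to the category and its coverings, not to points of a space.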
In homotopy theory, Quillen's language of model categories unified the ideas behind the homotopy theory of simplicial sets and the homotopy theory of topological spaces. In other words, to do homotopy theory in this language, one simply needs a category with suitable structure on it (maps designated as cofibrations, fibrations, and weak equivalences; these are supposed to abstract the notions of Serre cofibration, fibration, and weak homotopy equivalence and satisfy lifting properties), and from this alone one can construct the homotopy category. Doing so allowed Quillen to efficiently find new examples of model categories, which one might not immediately associate with "homotopy theory," such as the model category of simplicial commutative rings; with this, and an abstract definition of homology, he was able to construct the so-called cotangent complex and thus the Andre-Quillen cohomology of a ring (which had been conjectured by Grothendieck).
Simplicial sets themselves can be viewed purely combinatorially: they are a sequence of sets $X_n$ with suitable boundary and degeneracy maps, and this is all one needs. But for a human, this mass of notation is somewhat formidable and unintuitive; it is much cleaner to use the language of categories and say that they are (contravariant) functors from the category of nonempty finite totally ordered sets (with order-preserving maps) to the category of sets. This allows one to easily construct things like the standard $n$-simplex $\Delta[n]$ and see its universal property (because it is just a consequence of general categorical nonsense, namely Yoneda's lemma). Another benefit of thinking categorically, although I know very little about this, is that there is a general theory (apparently developed by Cisinski) for constructing model structures on presheaf categories.
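To make the universal property of the standard simplex concrete, here is a sketch in notation that the answer itself does not use: write $\Delta$ for the category of the ordered sets $[n] = \{0 < 1 < \dots < n\}$ with order-preserving maps. The standard $n$-simplex is then the representable functor, and Yoneda's lemma gives its universal property for any simplicial set $X$:

$$\Delta[n] = \mathrm{Hom}_{\Delta}(-, [n]), \qquad \mathrm{Hom}(\Delta[n], X) \cong X_n \quad \text{naturally in } X.$$

In words, a map out of $\Delta[n]$ is the same thing as an $n$-simplex of $X$, which is what one would want from a "standard simplex."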
In mathematics, it frequently happens that an object will parametrize a family of things in some way. For instance, the Hilbert scheme parametrizes closed subschemes of a projective scheme, while projective space itself parametrizes line bundles together with a set of generators; there are many more examples. In each case, it is a little tricky to state exactly what "parametrizes" really means: the elegant approach is to say that some given functor is representable. In other words, it is to say that some functor $F$ can be realized as maps into some object $X$, which is the "universal" parametrizing object. It is often of interest to give specific criteria for a general functor to be representable (and herein is the essence of the categorical approach; proving representability individually for one concrete functor is a task that could, a priori, be formulated without appeal to category theory).
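In symbols, representability of a contravariant functor $F$ by an object $X$ means there is a bijection, natural in the test object $T$ (the letter $T$ is my notation):

$$F(T) \cong \mathrm{Hom}(T, X).$$

For the projective space example, this reads roughly as follows, with $L$ a line bundle on the scheme $T$ and $s_0, \dots, s_n$ global sections generating it:

$$\mathrm{Hom}(T, \mathbb{P}^n) \cong \{ (L;\, s_0, \dots, s_n) \} / \cong.$$

This is only a sketch of the statement; the precise version fixes a base scheme and specifies what an isomorphism of such data is.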
In algebraic topology, a rather spectacular result (the Brown representability theorem) states that anything that looks kind of like cohomology (in particular, any extraordinary cohomology theory) is representable on the homotopy category, at least if you stick to CW complexes. This is really a sweeping result because it applies to a very large class of functors.**
(In algebraic geometry, I am not aware of any such strong sufficiency conditions. On the other hand, there are fairly stringent necessary conditions that any representable functor on the category of schemes must satisfy: such functors must be sheaves in suitable Grothendieck topologies (cf. the discussion of etale cohomology above). In practice, this is a type of descent condition.)
*I think one can reasonably argue that even the introduction of sheaf cohomology was a revolution of the categorical approach: sheaf cohomology is (most generally) defined as a derived functor on the category of sheaves, which is an abelian category but not obviously a category of modules. (The notion of deriving functors in an abelian category was, if I am not mistaken, introduced in Grothendieck's Tohoku paper.)
**One interesting application of this is to the case of singular cohomology itself. The implication is that if $X$ is a CW complex, then there is a fixed space $K(G, n)$ (for each abelian group $G$ and integer $n \geq 0$) such that homotopy classes of maps $X \to K(G, n)$ are naturally in bijection with cohomology classes in $H^n(X, G)$. From this it follows that $K(G, n)$ can have only one nonvanishing homotopy group, and one gets, as a consequence of this categorical nonsense, the Eilenberg-MacLane spaces. (In fairness, I should probably point out that, for instance, Hatcher's construction of the Eilenberg-MacLane spaces is basically a toy analog of the proof of Brown representability.)
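To spell out the last step of this footnote, here is a sketch that assumes $n \geq 1$ and glosses over the difference between based and unbased homotopy classes. Apply the bijection to the sphere $X = S^k$ with $k \geq 1$:

$$\pi_k(K(G, n)) \cong [S^k, K(G, n)] \cong H^n(S^k, G) \cong \begin{cases} G & \text{if } k = n, \\ 0 & \text{if } k \neq n. \end{cases}$$

So the representing space has $G$ as its only nonvanishing homotopy group, in degree $n$.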
Finally, one major advantage of the categorical philosophy (which I have already hinted at) is that it allows one to reuse ideas. Some ideas, like Yoneda's lemma or the idea of a universal property, take a little while to digest, but they show up so amazingly often, across diverse mathematical disciplines, that it's just more efficient to prove them once in maximal generality than to redo special cases over and over. Perhaps one reason for this is that so many of the constructions one encounters in mathematics (the tangent bundle to a smooth manifold, the singular (co)homology or homotopy groups of a topological space, the tensor product of modules (or rings), the operation of base-change in algebraic geometry) are ultimately functors.
All condescension aside, my first thought was that, in fact, category theory is an incredibly useful tool and language. As such, many of us want to read CWM so that we can understand various constructions in other fields (for instance the connection between monadicity and descent, or the phrasing of various homotopy theory ideas as coends, not to mention just basic pullbacks, pushforwards, colimits, and so on). So it is in fact relevant WHY you want to read it.
As an undergraduate, I started reading CWM, with minimal success, primarily because, as you say, I had very few examples. I thought the notion of a group as a category with one object was rather neat, but I couldn't really understand adjunctions, over(under)-categories, colimits or some of the other real meat of category theory in any deep, meaningful fashion until I began to have some examples to apply.
In my opinion, it is not fruitful to read CWM straight up. It's like drinking straight liquor. You might get really plastered (or in this analogy, excited about all the esoteric looking notation and words like monad, dinatural transformation, 2-category) but the next day you'll realize you didn't really accomplish anything.
What is the rush? Don't read CWM. Read Hatcher's Algebraic Topology, read Dummit and Foote, read whatever the standard texts are in differential geometry, or Lie groups, or something like that. Then, you will see that category theory is a lovely generalization of all the nice examples you've come to know and love, and you can build on that.
Eilenberg and MacLane published "General theory of natural equivalences" in 1945. The initial idea of the two mathematicians was to provide an autonomous framework for the concept of natural transformation, which they came across while working on analogies between group extensions and homology groups, and whose generality, pervasiveness and usefulness soon became clear to both of them. For this reason, their initial project turned into that of devising an axiomatic system in which natural transformations would arise naturally from a stable and self-consistent theoretical grounding (the subject later called category theory, of course). After they realized that a natural transformation is nothing but a family of maps providing a sort of "deformation" between two "collections of interrelated entities" within a given structure, they introduced what are now called functors, which played the role of the "deformation" of a natural transformation. The "collection of interrelated entities" was soon formalized through the definition of category. (By the way, notice that MacLane and Eilenberg explicitly avoided using set-theoretical terminology and notation!) You can find more information in papers like "The history of categorical logic: 1963-1977" by Marquis and Reyes (which is extremely recent, by the way), and of course in Kroemer's book.