The most immediately obvious relation to category theory is that we have a category consisting of types as objects and functions as arrows. We have identity functions and can compose functions with the usual axioms holding (with various caveats). That's just the starting point.
One place where it starts getting deeper is when you consider polymorphic functions. A polymorphic function is essentially a family of functions, parameterised by types. Or categorically, a family of arrows, parameterised by objects. This is similar to what a natural transformation is. By introducing some reasonable restrictions we find that a large class of polymorphic functions are in fact natural transformations and lots of category theory now applies. The standard examples to give here are the free theorems.
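To make the naturality condition concrete, here is a small sketch in Python. The names `fmap`, `reverse`, and `naturality_holds` are made up for illustration. Python does not enforce parametricity, so this only spot-checks the naturality square on sample inputs; it proves nothing.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def reverse(xs: List[A]) -> List[A]:
    """A polymorphic function: it rearranges elements but never inspects them."""
    return xs[::-1]


def fmap(f: Callable[[A], B], xs: List[A]) -> List[B]:
    """Action of the list functor on an arrow f: A -> B."""
    return [f(x) for x in xs]


def naturality_holds(f: Callable[[A], B], xs: List[A]) -> bool:
    """Spot-check the naturality square: fmap f after reverse == reverse after fmap f."""
    return fmap(f, reverse(xs)) == reverse(fmap(f, xs))


print(naturality_holds(str, [1, 2, 3]))
print(naturality_holds(len, ["a", "bb", ""]))
```

The point of the free theorem is that this equation holds for every `f`, purely because of the type of `reverse`; the check above only samples a couple of cases.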
Category theory also meshes nicely with the notion of an 'interface' in programming. Category theory encourages us not to look at what an object is made of, but how it interacts with other objects, and itself. By separating an interface from an implementation a programmer doesn't need to know anything about the implementation. Similarly category theory encourages us to think about objects up to isomorphism - it doesn't precisely proclaim which sets our groups comprise, it just matters what the operations on our groups are. Category theory precisely captures this notion of interface.
There is also a beautiful relationship between pure typed lambda calculus and cartesian closed categories (CCC). Any expression in the lambda calculus can be interpreted as the composition of the standard functions that come with a CCC: like the projection onto the factors of a product, or the evaluation of a function. So lambda expressions can be interpreted as applying to any CCC. In other words, lambda calculus is an internal language for CCCs. This is explicated in Lambek and Scott. This means for instance that the theory of CCCs is deeply embedded in Haskell, because Haskell is essentially pure typed lambda calculus with a bunch of extensions.
Another example is the way structural recursion over recursive datatypes can be nicely described in terms of initial objects in categories of F-algebras. You can find some details here.
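As a rough illustration, lists with elements in $A$ form the initial algebra of the functor $F(X) = 1 + A \times X$, and an $F$-algebra on a set $R$ amounts to a value `nil` in $R$ together with a function `cons` from $A \times R$ to $R$. The sketch below (the function name `fold` is my own choice) computes the unique algebra map out of lists, which is exactly structural recursion.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
R = TypeVar("R")


def fold(nil: R, cons: Callable[[A, R], R], xs: List[A]) -> R:
    """The algebra map from lists to the algebra (nil, cons).

    It replaces the empty list by nil and each 'element followed by rest' by cons,
    working from the right end of the list.
    """
    result = nil
    for x in reversed(xs):
        result = cons(x, result)
    return result


# Three different algebras on the same recursion scheme.
total = fold(0, lambda x, acc: x + acc, [1, 2, 3, 4])
length = fold(0, lambda _x, acc: 1 + acc, ["a", "b", "c"])
copy = fold([], lambda x, acc: [x] + acc, [1, 2, 3])

print(total, length, copy)
```

Folding with the list's own constructors (the `copy` example) gives back the list itself, which is one way to see the initiality at work.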
And one last example: dualising (in the categorical sense) definitions turns out to be very useful in the programming languages world. For example, in the previous paragraph I mentioned structural recursion. Dualising this gives the notions of F-coalgebras and guarded recursion and leads to a nice way to work with 'infinite' data types such as streams. Working with streams is tricky because how do you guard against inadvertently trying to walk the entire length of a stream causing an infinite loop? The appropriate dual of structural recursion leads to a powerful way to deal with streams that is guaranteed to be well behaved. Bart Jacobs, for example, has many nice papers in this area.
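A minimal sketch of the dual picture, using Python generators as a stand-in for streams: a coalgebra is a function `step` sending a state to a pair (head, next state), and `unfold` produces the stream it generates. The names `unfold` and `take` are illustrative assumptions. Python offers no static guarantee of productivity, so this shows only the shape of the idea.

```python
from itertools import islice
from typing import Callable, Iterator, List, Tuple, TypeVar

S = TypeVar("S")
A = TypeVar("A")


def unfold(step: Callable[[S], Tuple[A, S]], seed: S) -> Iterator[A]:
    """Stream generated by the coalgebra step: S -> A x S, starting from seed."""
    state = seed
    while True:
        head, state = step(state)
        yield head  # one element is produced per step, so the consumer stays in control


def take(n: int, stream: Iterator[A]) -> List[A]:
    """Observe only the first n elements of a (possibly infinite) stream."""
    return list(islice(stream, n))


naturals = unfold(lambda n: (n, n + 1), 0)
fibs = unfold(lambda p: (p[0], (p[1], p[0] + p[1])), (0, 1))

print(take(5, naturals))  # [0, 1, 2, 3, 4]
print(take(8, fibs))      # [0, 1, 1, 2, 3, 5, 8, 13]
```

Each call to `step` yields one element before recursing further, which is the informal content of "guardedness": any finite observation of the stream terminates.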
If you want to teach something intriguing, you should do something that introduces a new geometric idea while also involving algebra in an essential way. I recommend that you give an introduction to the projective plane, showing the other students that it is a natural extension of ordinary space which makes some geometric properties more uniform (such as intersection properties of curves), gives a fruitful new way to think about old topics (like asymptotes), and lets you do things that are impossible to conceive without it (reducing rational points mod $p$). There should be substantial interplay between algebra and geometry, but make sure to draw pictures to emphasize the geometric aspects.
In algebra, we can conceive of the quadratic formula in a uniform manner, but the ancient Greeks [Edit: Babylonians, not Greeks] couldn't do this because they didn't have the idea of negative numbers. So they had several quadratic formulas on account of not being able to write something as simple as $ax^2 + bx + c = 0$ at one stroke (for any signs on $a, b$, and $c$, with $a$ nonzero). Our extended skill at algebra lets us work with one case where the ancients had to take multiple cases. We can also say with complex numbers that any quadratic equation has two roots, allowing for a double root to count as one root with multiplicity two.
The thrust of what comes next is to extend the plane so that geometric properties become nicer, in much the same way that algebra becomes nicer when we use more general number systems.
Consider the intersection properties of lines in the plane. There is a dichotomy: usually two lines in the plane meet in one point, but some pairs of lines (the parallel ones) meet in no points. Let's see what this looks like under stereographic projection. Lines in the plane become circles through the north pole, but not including the north pole itself. It's natural to close up the image and take that whole circle as a substitute for the original line. So we can see that lines in the plane naturally close up into circles through the north pole. Under stereographic projection, the old dichotomy between parallel and non-parallel lines takes on a new appearance: a pair of non-parallel lines corresponds under stereographic projection to a pair of circles intersecting in two different points, one of which is the north pole, while a pair of parallel lines corresponds under stereographic projection to a pair of circles which are tangent at the north pole. It is natural to think of two tangent circles as having their point of tangency be an intersection point of multiplicity two, much like a quadratic polynomial can have a double root. So after stereographic projection we can "see" two points of intersection for any pair of lines. This geometric construction is something like the algebraic use of more general number systems to find roots to all quadratic equations. The moral to take from this example is that in a larger space, curves that used to not intersect may now intersect (or rather, their natural closures in the new space intersect) with a uniform count of the number of intersection points. If the students agree that enlarging number systems to create solutions to polynomial equations is good, they should agree that enlarging space to make intersection properties more uniform is good too. Another important feature is that the sphere, like the plane, is a homogeneous object: we can transform (rotate) the space to carry one point to any other point. 
On the sphere as a space in its own right, there is truly nothing special about the north pole.
An even better geometric extension of the plane is the projective plane, although at first it will feel unfamiliar and strange because you can't see it all at once.
You should introduce it in a uniform manner as points described with homogeneous coordinates $[x,y,z]$ where $x$, $y$, and $z$ are not all 0 and, say,
$$
[3,6,2] = [1,2,2/3] = [1/2,1,1/3] = [3/2,3,1] \text{ and } [0,5,0] = [0,1,0].
$$
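If students want to experiment, equality of homogeneous coordinates can be checked by machine. The sketch below is one possible way to do it (the helper names `normalize` and `same_point` are mine): scale each triple so that its first nonzero coordinate is 1, then compare.

```python
from fractions import Fraction


def normalize(point):
    """Scale a triple so that its first nonzero coordinate equals 1."""
    coords = [Fraction(c) for c in point]
    if all(c == 0 for c in coords):
        raise ValueError("[0,0,0] is not a point of the projective plane")
    pivot = next(c for c in coords if c != 0)
    return tuple(c / pivot for c in coords)


def same_point(p, q):
    """Two triples name the same projective point iff their normalizations agree."""
    return normalize(p) == normalize(q)


print(same_point((3, 6, 2), (1, 2, Fraction(2, 3))))   # True
print(same_point((3, 6, 2), (Fraction(3, 2), 3, 1)))   # True
print(same_point((0, 5, 0), (0, 1, 0)))                # True
print(same_point((3, 6, 2), (0, 1, 0)))                # False
```

Exact rational arithmetic (`Fraction`) is used so that no rounding can blur the comparison.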
Although it is impossible to see the whole projective plane at once, we can get glimpses of large parts of it using three different charts: $U_0$ is the points where $x \not= 0$, $U_1$ is the points where $y \not= 0$ and $U_2$ is the points where $z \not= 0$. These three charts together cover the projective plane. Any nonzero coordinate can be scaled to 1 and that fixes the other two homogeneous coordinates of the point, e.g., $[x,y,1] = [x',y',1]$ if and only if $x = x'$ and $y = y'$. This means we can identify each of $U_0$, $U_1$, and $U_2$ with the usual plane (e.g., identify $U_2$ with ${\mathbf R}^2$ by identifying $[x,y,1]$ with $(x,y)$). This means the projective plane locally looks like the plane, much like the sphere does, except we can't see all of it at the same time as we can with the sphere.
(In case you want to show students that the projective plane is a really natural model of something they have known in another context, think about nonzero ideals in ${\mathbf R}[x]$. Any ideal has a generator, but the polynomial generator is only defined up to a nonzero scaling factor. Usually we normalize the generator to be monic, but if we don't want to insist on a particular choice of generator then the right model for the generator is a point in projective space. In particular, for any nonzero ideal $(f(x))$ where $\deg f(x) \leq 2$, write $f(x) = ax^2 + bx + c$; the coefficients $a, b, c$ are only defined up to an overall scaling factor, so the point $[a,b,c]$ is one way to think about that ideal.)
Next introduce curves in the projective plane as solutions to homogeneous polynomial equations in $x$, $y$, and $z$ and explain what the algebraic process of homogenization and dehomogenization of polynomials is, e.g., it makes $y = 2x + 1$ into $y = 2x + z$ or $x^2 - y^2 = x+ 1$ into $x^2 - y^2 = xz + z^2$. In particular a line in the projective plane is the solution set to any equation $ax + by + cz = 0$ where the coefficients are not all 0.
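Homogenization is mechanical enough to sketch in a few lines. In the illustration below a polynomial in $x, y$ is stored as a dictionary sending the exponent pair `(i, j)` to the coefficient of $x^i y^j$; this representation and the function names are assumptions made for the example, and all stored coefficients are assumed nonzero.

```python
def homogenize(poly):
    """Pad every term with a power of z so that all terms reach the total degree.

    poly maps (i, j) -> coefficient of x^i y^j; the result maps
    (i, j, k) -> coefficient of x^i y^j z^k with i + j + k = degree.
    """
    degree = max(i + j for (i, j) in poly)
    return {(i, j, degree - i - j): c for (i, j), c in poly.items()}


def dehomogenize(poly):
    """Set z = 1, i.e. forget the exponent of z."""
    result = {}
    for (i, j, _k), c in poly.items():
        result[(i, j)] = result.get((i, j), 0) + c
    return result


# y - 2x - 1 = 0, i.e. the line y = 2x + 1
line = {(0, 1): 1, (1, 0): -2, (0, 0): -1}
# x^2 - y^2 - x - 1 = 0, i.e. the curve x^2 - y^2 = x + 1
conic = {(2, 0): 1, (0, 2): -1, (1, 0): -1, (0, 0): -1}

print(homogenize(line))   # y - 2x - z
print(homogenize(conic))  # x^2 - y^2 - xz - z^2
```

Dehomogenizing with respect to $z$ undoes the process, which is why a curve in the chart $U_2$ and its projective closure determine each other.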
Now let's look at what a point on a specific curve in the projective plane looks like in each of the three standard charts, carry out the same kind of calculus computation in each chart, and compare the results with each other. We will use the curve $C : x^2 + y^2 = z^2$ in the projective plane (not to be confused with a surface in 3-space given by the same equation) and the points $P = [3,4,5]$ and $Q = [1,0,1]$ which lie on $C$. How do $C$, $P$, and $Q$ appear in each of the charts $U_0$, $U_1$, and $U_2$?
a) In $U_0$, which is identified with the plane by $[x,y,z] \mapsto (y/x,z/x)$, $C$ becomes the hyperbola $z^2 - y^2 = 1$, $P$ becomes $(4/3,5/3)$, and $Q$ becomes $(0,1)$. Here we identify $U_0$ with the usual $yz$-plane. By calculus, the tangent line to $z^2 - y^2 = 1$ at the point $(4/3,5/3)$ is $z = (4/5)y + 3/5$ and the tangent line at $(0,1)$ is $z = 1$.
Note that we actually miss two points from $C$ when we look at the intersection of it with $U_0$: $[0,1,\pm 1]$.
b) In $U_1$, $C$ becomes the hyperbola $z^2 - x^2 = 1$ in the $xz$-plane, $P$ becomes the point $(3/4,5/4)$ with tangent line $z = (3/5)x + 4/5$, and $Q$ doesn't actually live in this chart (kind of like the north pole under stereographic projection not going to anything in the plane). Here two points from $C$ are missing: $[1,0,\pm 1]$.
c) In $U_2$, $C$ becomes the circle $x^2 + y^2 = 1$, $P$ becomes $(3/5,4/5)$ with tangent line $y = (-3/4)x + 5/4$, and $Q$ becomes $(1,0)$ with tangent line $x = 1$. Every point from $C$ lies in $U_2$, so no points are missing here. We see the "complete" curve in this chart.
It is essential to draw three pictures here (of the $yz$-plane, $xz$-plane, and $xy$-plane) and mark $P$ and $Q$ in each (except you don't see $Q$ in the $xz$-plane).
Now comes the beautiful comparison step: in all three charts the homogenization of the tangent line at $P$ is exactly the same equation: $3x + 4y = 5z$. The tangent line at $Q$ in $U_0$ and $U_2$ homogenizes in both cases back to $x = z$. This suggests there should be an intrinsic concept of tangent line in the projective plane to the curve $C$ at the points $P$ and $Q$, and you can compute the tangent line by looking at any chart containing the relevant point of interest, doing calculus there, and then homogenizing back. The homogenization of your ordinary linear equation to a homogeneous linear equation will always be the same, and its solutions in the projective plane define the tangent line to the projective curve at that point.
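The comparison can be double-checked by machine. In the sketch below a projective line $ax + by + cz = 0$ is stored as the triple `(a, b, c)`, and the hypothetical helper `primitive` rescales a rational triple to coprime integers with a positive leading entry, so that equal lines get equal triples. The chart equations are copied from (a), (b), and (c) above; homogenizing each one by hand is done in the comments.

```python
from fractions import Fraction as Fr
from functools import reduce
from math import gcd


def primitive(coeffs):
    """Rescale a nonzero rational triple to coprime integers, first nonzero entry positive."""
    fracs = [Fr(c) for c in coeffs]
    common_den = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in fracs), 1)
    ints = [int(f * common_den) for f in fracs]
    g = reduce(gcd, ints)
    ints = [n // g for n in ints]
    if next(n for n in ints if n != 0) < 0:
        ints = [-n for n in ints]
    return tuple(ints)


# Tangent line at P in each chart, rewritten as a*x + b*y + c*z = 0.
# U_0 (x = 1): z = (4/5)y + 3/5   ->  (3/5)x + (4/5)y - z = 0
tangent_P_U0 = primitive((Fr(3, 5), Fr(4, 5), -1))
# U_1 (y = 1): z = (3/5)x + 4/5   ->  (3/5)x + (4/5)y - z = 0
tangent_P_U1 = primitive((Fr(3, 5), Fr(4, 5), -1))
# U_2 (z = 1): y = (-3/4)x + 5/4  ->  (3/4)x + y - (5/4)z = 0
tangent_P_U2 = primitive((Fr(3, 4), 1, Fr(-5, 4)))

# Tangent line at Q in the two charts that contain Q.
# U_0 (x = 1): z = 1  ->  -x + z = 0
tangent_Q_U0 = primitive((-1, 0, 1))
# U_2 (z = 1): x = 1  ->  x - z = 0
tangent_Q_U2 = primitive((1, 0, -1))

print(tangent_P_U0, tangent_P_U1, tangent_P_U2)  # each is (3, 4, -5)
print(tangent_Q_U0, tangent_Q_U2)                # each is (1, 0, -1)
```

The triple `(3, 4, -5)` encodes $3x + 4y = 5z$ and `(1, 0, -1)` encodes $x = z$.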
As further evidence of the consistency of this new space and the geometry in it, look at the intersections of the two tangent lines at $P$ and $Q$: in $U_0$ -- the $yz$-plane -- the tangent lines meet in $(1/2,1)$ while in $U_2$ -- the $xy$-plane -- the tangent lines meet in $(1,1/2)$. These points both homogenize back to the same point $[2,1,2]$, which is the unique (!) point in the projective plane satisfying $3x + 4y = 5z$ and $x = z$.
Remember that $Q$ went missing in the chart $U_1$? Well, its tangent line did not go missing: the projective line $x = z$ in the projective plane meets the chart $U_1$ in the ordinary line $x = z$ of the $xz$-plane, which is an asymptote to the piece of $C$ we can see in $U_1$. This is really amazing: asymptotes to (algebraic) curves in the usual plane are "really" the tangent lines to missing points on the complete version of that curve in the projective plane. To see this from another point of view, move around $C$ clockwise in the chart $U_2$ (where it's a circle) and figure out the corresponding motion along the piece of $C$ in the chart $U_1$ (where it's a hyperbola): as you pass through the point $Q = (1,0)$ in $U_2$, what happens in the chart $U_1$ is that you jump off one branch of the hyperbola onto the other branch by skipping through an asymptote, sort of. (There is a second point on $C$ in $U_2$ that you don't see in $U_1$ -- the point $R = [-1,0,1]$ is $(-1,0)$ in $U_2$ -- and paying attention to that point may help here.)
The conic sections -- parabolas, hyperbolas, and ellipses -- which look quite different in ${\mathbf R}^2$, simplify in the projective plane because they all look like the same kind of curve (once you close them up): $y = x^2$ becomes $yz = x^2$, $xy = 1$ becomes $xy = z^2$, and $x^2 + y^2 = 1$ becomes $x^2 + y^2 = z^2$, which is the same as $x^2 = (z-y)(z+y) = z'y'$, where $z' = z-y$ and $y' = z+y$. I like to think about this as a fancy analogue of the Greek [Edit: Babylonian] use of many forms of the quadratic formula because they didn't have the right algebraic technique to realize there is one quadratic formula. Using the projective plane we see there is really one conic section.
You might want to show by examples the nicer intersection properties of lines in the projective plane: any two distinct lines in the projective plane meet in exactly one point. This is just a glimpse of the fact that curves in the projective plane have nicer intersection properties than in the ordinary plane, but to get the correct theorem in that direction for curves other than lines, you need to (a) work over the complex numbers and (b) introduce an appropriate concept of intersection multiplicity for intersection points of curves, somewhat like the idea of tangent circles intersecting in a point of multiplicity two which I mentioned earlier. The relevant theorem here is Bezout's theorem, but to state it correctly is complicated precisely because it is technical to give a good definition of what the intersection multiplicity is for two curves meeting at a common point.
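One convenient way to produce such examples (a standard trick, not something used elsewhere in this answer) is the cross product: if two distinct lines have coefficient triples $(a,b,c)$ and $(a',b',c')$, their cross product gives homogeneous coordinates of the common point. The function names below are illustrative, and integer coefficients are assumed.

```python
from functools import reduce
from math import gcd


def cross(u, v):
    """Cross product of two integer triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def meet(line1, line2):
    """Intersection point of two distinct projective lines a*x + b*y + c*z = 0."""
    point = cross(line1, line2)
    if point == (0, 0, 0):
        raise ValueError("the two triples describe the same line")
    g = reduce(gcd, point)
    point = tuple(c // g for c in point)
    if next(c for c in point if c != 0) < 0:
        point = tuple(-c for c in point)
    return point


# The two tangent lines to x^2 + y^2 = z^2 found earlier: 3x + 4y = 5z and x = z.
print(meet((3, 4, -5), (1, 0, -1)))   # (2, 1, 2)

# Two parallel lines of the ordinary plane, y = 2x + 1 and y = 2x + 3,
# homogenized to 2x - y + z = 0 and 2x - y + 3z = 0.
print(meet((2, -1, 1), (2, -1, 3)))   # (1, 2, 0)
```

The second answer has $z = 0$, so the parallel lines meet outside the chart $U_2$, at the point "at infinity" recording their common slope 2.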
For the student who wants to be a number theorist, compare reduction mod $p$ in the usual plane and the projective plane. In the study of Diophantine equations (e.g., to show $y^2 = x^3 - 5$ has no integral solutions), it is very useful to reduce mod $p$, and there is a natural way to reduce a point in ${\mathbf Z}^2$ modulo $p$. However, there's no reasonable way to reduce all points in ${\mathbf Q}^2$ modulo $p$: when the rational numbers have denominator divisible by $p$, you can't make sense of them mod $p$. We can reduce $(-7/4,51/8)$ mod 5, for example, but not mod 2. In the projective plane, however, we can reduce rational points mod $p$ by the idea of choosing a set of primitive integral coordinates, where the homogeneous coordinates are relatively prime. For example, $[-7/4,51/8,1] = [-14,51,8]$ in ${\mathbf P}^2({\mathbf Q})$, and this can be reduced mod $p$ for any $p$ at all. For example, in ${\mathbf P}^2({\mathbf F}_2)$ it becomes $[0,1,0]$.
(There is another primitive set of homogeneous coordinates for the point, namely $[14,-51,-8]$, but that reduces mod $p$ to the same thing as before, so this reduction mod $p$ process is well-defined.) This suggests that the projective plane has better mapping properties than the usual plane, in some sense.
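Here is a small sketch of the reduction process for experimentation. The function names are my own, and `pow(c, -1, p)` for modular inverses needs Python 3.8 or later. It clears denominators, removes common factors, reduces mod $p$, and then scales the first nonzero residue to 1 so that different primitive representatives can be compared directly.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def primitive_coordinates(point):
    """Rescale a rational point of the projective plane to coprime integer coordinates."""
    fracs = [Fraction(c) for c in point]
    common_den = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in fracs), 1)
    ints = [int(f * common_den) for f in fracs]
    g = reduce(gcd, ints)
    return tuple(n // g for n in ints)


def reduce_mod_p(point, p):
    """Reduce a rational projective point mod a prime p, first nonzero residue scaled to 1."""
    residues = [n % p for n in primitive_coordinates(point)]
    # Coprime coordinates cannot all be divisible by p, so a nonzero residue exists.
    pivot = next(r for r in residues if r != 0)
    inverse = pow(pivot, -1, p)
    return tuple((r * inverse) % p for r in residues)


point = (Fraction(-7, 4), Fraction(51, 8), 1)
print(primitive_coordinates(point))   # (-14, 51, 8)
print(reduce_mod_p(point, 2))         # (0, 1, 0)
print(reduce_mod_p(point, 5))         # (1, 1, 3)
print(reduce_mod_p((14, -51, -8), 5)) # (1, 1, 3), same point from the other representative
```

The last two lines illustrate the well-definedness: the two primitive representatives differ by the sign $-1$, which is a nonzero scalar mod $p$ and so does not change the reduced point.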
There's a big difference between teaching category theory and merely paying attention to the things that category theory clarifies (like the difference between direct products and direct sums). In my opinion, the latter should be done early (and late, and at all other times); there's no reason for intentional sloppiness. On the other hand, teaching category theory is better done after the students have been exposed to some of the relevant examples.
Many years ago, I taught a course on category theory, and in my opinion it was a failure. Many of the students had not previously seen the examples I wanted to use. One of the beauties of category theory is that it unifies many different-looking concepts; for example, left adjoints of forgetful functors include free groups, universal enveloping algebras, Stone-Cech compactifications, abelianizations of groups, and many more. But the beauty is hard to convey when, in addition to explaining the notion of adjoint, one must also explain each (or at least several) of these special cases. So I think category theory should be taught at the stage where students have already seen enough special cases of its concepts to appreciate their unification. Without the examples, category theory can look terribly unmotivated and unintuitive.