Is this explanation of normal subgroups and quotient groups correct?


I apologize for the long post, but I'm currently a student finishing up his first semester in group theory. My introduction was pretty definition-heavy so I've found I can internalize concepts (such as quotient groups, normal subgroups, etc.) myself by forming my own way of motivating and teaching them intuitively. I'd like to know if my current presentation/understanding is correct.

I think after learning about subgroups and Lagrange's theorem, a natural question is then if we can break down a group $G$ to better understand its parts and hopefully the whole $G$ (as is tradition in any analytical endeavor). But if we want to pull back any useful understanding of $G$ from this smaller group, it ought to preserve some structure of $G$. That structure is exactly how the operation acts on elements of $G$ (since groups are just elements with an operation relating them).

So for the sake of exploration, we pretend to have the magic function $\phi : G \rightarrow H$ that does exactly this for us–maps $(G, *_G)$ to some smaller part $(H, *_H)$, then ask what we can say about $\phi$. Our original goal was for $\phi$ to preserve the operation, i.e. for all $a, b \in G$ that $\phi(a *_G b) = \phi(a) *_H \phi(b)$.

The next thing I would observe is that since $H$ has smaller order, $\phi$ necessarily maps multiple elements, say $a, b$, in $G$ to the same element in $H$. In this sense, $a$ and $b$ are "equivalent" under $\phi$. Given that we have "willed" $\phi$ to be operation-preserving, we can see that a natural way this arises is by letting $ak = b$ for some $k \in G$:

$$ak = b \implies \phi(b) = \phi(a *_G k) = \phi(a) *_H \phi(k).$$

If we want $\phi(b) = \phi(a) *_H \phi(k) = \phi(a)$ then $\phi(k) = e_H$. I think this leads naturally to the definition of the kernel: it's the set of elements that map to the identity, and it makes $a$ equivalent to $b$ mod $\ker \phi$. And in fact, as I've learned from this answer, we naturally get equivalence classes of elements that partition the group into cosets analogous to modular arithmetic. So, $\phi$ takes elements and puts them neatly into these equivalence classes (abstracting away some of the details in $G$ that look "the same" in $H$, leading, at least for me, directly to the First Isomorphism Theorem). Then it makes sense to propose the map $\phi : g \mapsto g \ker \phi$.

The next question becomes what the operation of this $H$ looks like. We've established that elements of $H$ are cosets (and equivalence classes), so for two elements in cosets $g_1 \ker\phi$ and $g_2 \ker\phi$, once combined by $*_H$ we'd want for the result to be in $g_1g_2 \ker\phi$ (modular arithmetic analogy works here as well). Set-wise, we might write

$$g_1\ker\phi \cdot g_2\ker\phi = g_1g_2\ker\phi.$$

But does this come for free? For an element $g_1k_1g_2k_2 \in g_1\ker\phi \cdot g_2\ker\phi$ to look like $g_1g_2k$ for some $k \in \ker\phi$, it must be that $k_1g_2 = g_2k_3$ for some $k_3 \in \ker\phi$. Set-wise this can be written as $g\ker\phi = \ker\phi\, g$, i.e. left cosets = right cosets. It turns out that $\ker\phi$ does indeed satisfy this condition, so we are safe to proceed.
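One way to see why the kernel satisfies this condition (a short sketch that uses only the operation-preserving property of $\phi$): rewrite $k_1g_2$ by inserting $g_2g_2^{-1}$, and check that the leftover factor lands in $\ker\phi$.

$$k_1 g_2 = g_2\,(g_2^{-1} k_1 g_2), \qquad \phi(g_2^{-1} k_1 g_2) = \phi(g_2)^{-1} *_H \phi(k_1) *_H \phi(g_2) = \phi(g_2)^{-1} *_H e_H *_H \phi(g_2) = e_H.$$

So $k_3 = g_2^{-1}k_1g_2$ works, and $g_1k_1g_2k_2 = g_1g_2(k_3k_2)$ with $k_3k_2 \in \ker\phi$.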

So, in the end, we have designed $H$, a broken-down version of $G$. And how did we do it? By "dividing out" or "quotienting out" the information that looks the same under $\phi$ in $H$, namely $\ker \phi$. Thus we write $H$ as $G/\ker\phi$, aptly called a quotient group.

Although, you could flip this presentation: instead of viewing things from the kernel perspective, suppose $K$ is some arbitrary subgroup of $G$. Then it must satisfy the condition of left cosets = right cosets (which we name normality because it is a nontrivial property that gives us a usable quotient) for $G/K$ to be a group, as $\ker\phi$ already does, and by satisfying normality it automatically becomes the kernel of some homomorphism (namely the natural map, which I've presented).

My questions are:

  • Is this presentation correct (on an intuitive level, I know there are lots of places for concrete proofs)? It feels right to me, but I also feel like I may have gotten definitions vs. implications mixed up.
  • If so, does any textbook follow this approach that I can dig into?
  • I think homomorphisms also can fit into this framework, given I suggest $\phi$ pretty early on, but how would non-surjective homomorphisms be explained?

Best Answer

I know you've gotten one satisfactory answer, but let me weigh in here a bit.

I'll first mention that what you present is generally correct, and a valid way to approach this. The idea of "breaking down a (finite) group into smaller pieces" is in fact behind the idea of classifying finite simple groups (groups that cannot be broken down), together with the theory of group extensions (trying to understand what a group $G$ "is" if you have a normal subgroup $N\triangleleft G$, and you understand both $N$ and $G/N$).

But let me offer you a different perspective and a different way into the isomorphism theorems...

After you learn about groups, and subgroups and Lagrange's Theorem, maybe Cauchy's Theorem, we come to a crossroads in how to try to better understand a given group.

One way to try to learn things about a given group $G$ is to just stare at it until you notice some interesting things about $G$. However, generally speaking, a much more fruitful approach in algebra is to take a less static approach and to consider two things: what the group $G$ "can do", and how it interacts with other groups.

What a group "can do" is in fact historically how groups were originally understood. The original notion of a group was a "group of permutations": a collection of operations acting on a set in specific ways. Even as late as the turn of the 20th Century, Burnside's book on groups still defines a group as a collection of "operators" acting on "some objects". It was Cayley who introduced the abstract definition of a group as a "set with a binary associative operation satisfying certain conditions", and then immediately went on to prove that this did not change the objects of study, as any "group of permutations" was a group under his new proposed definition, and any object that satisfied this new proposed definition could be understood as a "group of permutations". This is the notion behind Cayley's Theorem, and why it is, in my opinion, more important historically than practically today. But this already introduces the notion of functions: what does "can be understood as a group of permutations" mean? It means you can biject it with such a group in a way that respects the operation.

This also leads us to functions. To justify why we want to think about functions, let's consider two areas where functions play a major role: the real numbers/calculus, and linear algebra.

The key property of the real numbers was that they were "continuous": they have no 'holes'. Rather than just stare at real numbers and see if we can say interesting things about them, it turns out to be much more fruitful and interesting to consider functions from $\mathbb{R}$ to itself that respect this "continuity". And so we get the notion of continuous functions, and the study of continuous functions, as a way to shed light on the nature of the real numbers themselves.

Similarly, with Linear Algebra, staring at vector spaces only takes you so far; the real power of vector spaces only emerges when you start considering linear transformations.

In both cases, you don't just want any old function; you want functions that "preserve" whatever it is that makes your objects interesting. For real numbers, continuity; for vector spaces, the addition and scalar multiplication.

So with groups. A group is characterized by three things (bear with me): a binary operation $G\times G\to G$, that assigns to any pair of elements $g_1,g_2\in G$ their "product" $g_1g_2$. A distinguished element $e_G\in G$ with the property that $ge_G=e_Gg=g$ for all $g\in G$. And a function $G\to G$ that assigns to every element $g\in G$ its "inverse", $g^{-1}$, which has the property that $gg^{-1}=g^{-1}g=e_G$.

So if we have two groups $G$ and $H$, then a "function that preserves this structure" would be a function $f\colon G\to H$, such that

  1. Respects products: if $g_1,g_2\in G$, then $f(g_1g_2) = f(g_1)f(g_2)$.
  2. Respects the identity: $f(e_G) = e_H$.
  3. Respects inverses: if $g\in G$, then $f(g^{-1}) = (f(g))^{-1}$.

It turns out that two of these conditions are superfluous, but that is how we want to start. As you know, if 1 holds for a function between groups, then 2 and 3 will automatically hold as well. One can then define a group homomorphism as simply a function that satisfies 1 and prove 2 and 3; I prefer to define it as a function that satisfies 1, 2, and 3, and then prove that if it satisfies 1, then it must satisfy 2 and 3. The reason I prefer this is that I think it makes the definition more natural.
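For completeness, here is a sketch of the standard argument that 1 forces 2 and 3; it uses only cancellation in $H$ and the uniqueness of inverses.

$$f(e_G) = f(e_Ge_G) = f(e_G)f(e_G) \implies f(e_G) = e_H \quad\text{(cancel } f(e_G) \text{ on one side)},$$

$$f(g)f(g^{-1}) = f(gg^{-1}) = f(e_G) = e_H \implies f(g^{-1}) = (f(g))^{-1}.$$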

Okay, so these are the functions that will play the role of "linear transformations" and "continuous functions". We call them, as I mentioned above, "group homomorphisms." They also are the type of functions needed in Cayley's argument that any group "can be seen" as a group of permutations, because that corresponds to a one-to-one function $f\colon G\to S_X$ (for some set $X$), that satisfies 1, 2, and 3. So that $f(G)$ is "essentially the same" (as far as the group structure is concerned) as $G$, but now it consists of permutations on a set $X$.

Now, given a function $f\colon G\to H$ (in fact, any function between two sets $X$ and $Y$), there is a natural equivalence relation that we can define on $G$. Let us say that two elements $x,y\in G$ are "$f$-equivalent", $x\sim_f y$, if and only if $f(x)=f(y)$. This is easily verified to be an equivalence relation, and so it partitions $G$ into equivalence classes.

But because $f$ is a group homomorphism, we have the following consequences: if $x\sim_f y$ and $z\sim_f w$, then $xz\sim_f yw$, and $x^{-1}\sim_f y^{-1}$. So we can make the set of equivalence classes, $G/\sim_f$ into a group! Let $[x]_f$ be the equivalence class of $x$. Then we can define $[x]_f*[y]_f = [xy]_f$, $e_{G/\sim_f} = [e_G]_f$, and $([x]_f)^{-1} = [x^{-1}]_f$. It is then an easy exercise to show that this is indeed a group.

What relation does this group have with $G$ and with $f$? Well, that's the first isomorphism theorem: there is a bijective group homomorphism between the group $G/\sim_f$ and the image group $f(G)$, given by sending $[x]_f$ to $f(x)$.
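As a small sanity check of this construction, here is a sketch in Python. The specific map $x\mapsto 3x$ on $\mathbb{Z}_{12}$ (written additively) is just an illustrative choice of mine, picked because it is not surjective; the names `f`, `cls`, and `classes` are likewise hypothetical.

```python
from itertools import product

n = 12
G = range(n)                      # Z_12 under addition mod 12

def f(x):
    """Illustrative homomorphism Z_12 -> Z_12, x |-> 3x (mod 12); not surjective."""
    return (3 * x) % n

# f respects the operation: f(x + y) = f(x) + f(y).
assert all(f((x + y) % n) == (f(x) + f(y)) % n for x, y in product(G, G))

# Equivalence classes of ~_f: group the elements of G by their image under f.
classes = {}
for x in G:
    classes.setdefault(f(x), set()).add(x)

def cls(x):
    """The class [x]_f, as a frozenset of elements of G."""
    return frozenset(classes[f(x)])

# [x] * [y] = [x + y] does not depend on the chosen representatives.
well_defined = all(
    cls((x + y) % n) == cls((x2 + y2) % n)
    for x, y in product(G, G)
    for x2 in cls(x)
    for y2 in cls(y)
)

kernel = sorted(cls(0))           # the class of the identity
image = sorted(classes)           # the keys are exactly the values f takes

print(kernel)         # [0, 4, 8]
print(image)          # [0, 3, 6, 9]
print(well_defined)   # True
print(len(classes))   # 4 classes, one per element of the image
```

Each class is keyed by the single value $f$ takes on it, so $[x]_f\mapsto f(x)$ is visibly a bijection from the four classes onto the four-element image, even though $f$ is far from onto $\mathbb{Z}_{12}$.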

How is this related to normal subgroups? Ah, well, these equivalence classes have an interesting property: because of the property that $x\sim_f y$ implies $x^{-1}\sim_f y^{-1}$, and if $x\sim_f y$ and $z\sim_f w$ then $xz\sim_f yw$, we have $$x\sim_f y \iff xy^{-1}\sim_f e_G.$$ That is: we can completely determine the equivalence relation by just knowing $[e_G]_f$. Moreover, this collection is a subgroup of $G$!
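Unwinding that equivalence directly in terms of $f$, as a quick check (it uses $f(y^{-1}) = (f(y))^{-1}$ and $f(e_G)=e_H$):

$$x\sim_f y \iff f(x)=f(y) \iff f(x)(f(y))^{-1} = e_H \iff f(xy^{-1}) = f(e_G) \iff xy^{-1}\sim_f e_G.$$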

Will any subgroup work? No, it turns out it doesn't. If $N$ is a subgroup and we try to define an equivalence relation $x\sim y$ if and only if $xy^{-1}\in N$, we do get an equivalence relation, but we do not get an equivalence relation that lets you define a group structure on $G/\sim$. The condition that lets you do that is precisely that $N$ must be a normal subgroup. I go into much more detail about this in this answer.
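To see the failure concretely, here is a brute-force sketch in Python on $S_3$. The encoding of permutations as tuples and the helper names (`mul`, `inv`, `cls`, `product_well_defined`) are my own illustrative choices. It tests whether $[x][y]=[xy]$ is independent of representatives for a non-normal subgroup and for a normal one.

```python
from itertools import permutations, product

S3 = list(permutations(range(3)))          # a tuple p encodes the map i -> p[i]

def mul(p, q):
    """Composition: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def cls(x, N):
    """Class of x under the relation x ~ y iff x y^{-1} is in N."""
    return frozenset(y for y in S3 if mul(x, inv(y)) in N)

def product_well_defined(N):
    """Is the class of xy the same for every choice of representatives of [x] and [y]?"""
    return all(
        cls(mul(x, y), N) == cls(mul(x2, y2), N)
        for x, y in product(S3, S3)
        for x2 in cls(x, N)
        for y2 in cls(y, N)
    )

e = (0, 1, 2)
H = {e, (1, 0, 2)}                 # generated by one transposition; not normal in S3
A3 = {e, (1, 2, 0), (2, 0, 1)}     # the two 3-cycles plus the identity; normal in S3

print(product_well_defined(H))     # False
print(product_well_defined(A3))    # True
```

Both relations partition $S_3$ into classes; only the one coming from the normal subgroup lets the product descend to the classes.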

So, "good" equivalence relations, those coming from functions (they are called "congruences"), correspond to normal subgroups. In fact, as with Cayley's Theorem before (which gave a separate definition of "group" and then showed it was really the same as the old one), so it is with "good" equivalence relations:

Theorem. A subgroup $N$ of a group $G$ is normal in $G$ if and only if there exists a group $H$ and a homomorphism $f\colon G\to H$ such that $N=[e_G]_f$.

This then leads to the usual First Isomorphism Theorem, which says: this construction of taking quotients of a group is "essentially the same" as looking at the image of $G$ under a group homomorphism, in that given any homomorphism $f\colon G\to H$, if $N=[e_G]_f$, then $G/N = G/\sim_f$ is "essentially the same" as $f(G)$: there is a bijective group homomorphism between them.

The Third Isomorphism Theorem corresponds to compositions of morphisms: if $f\colon G\to H$ and $g\colon H\to K$, then $f(G)/\sim_g$ is essentially the same as $G/\sim_{g\circ f}$. That is, if $N\triangleleft G$, $K\triangleleft G$, $N\subseteq K$, then $K/N \triangleleft G/N$ and $(G/N)/(K/N)\cong G/K$.
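A small worked instance, with groups chosen by me purely for illustration: take $G=\mathbb{Z}$ (written additively) with the nested subgroups $12\mathbb{Z}\subseteq 4\mathbb{Z}\subseteq\mathbb{Z}$. Then $4\mathbb{Z}/12\mathbb{Z}=\{0,4,8\}+12\mathbb{Z}$ is a subgroup of order $3$ in $\mathbb{Z}/12\mathbb{Z}$, and

$$(\mathbb{Z}/12\mathbb{Z})\big/(4\mathbb{Z}/12\mathbb{Z}) \;\cong\; \mathbb{Z}/4\mathbb{Z}, \qquad \text{with orders } 12/3 = 4.$$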

The Fourth, or lattice, Isomorphism Theorem establishes a correspondence between the subgroups of $f(G)$ and the subgroups of $G$ that contain $[e]_f$. One then asks... okay, and what about other subgroups of $G$? That's what the Second Isomorphism Theorem gives you: if $f\colon G\to H$ is a homomorphism, and $K$ is an arbitrary subgroup of $G$, then $f(K)$ corresponds to $K/(K\cap N)$ (where $N=[e]_f$). And this image is "the same" as the image of $KN$. That is, $$\frac{K}{K\cap N} \cong \frac{KN}{N}.$$
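Again a small worked instance, with groups of my own choosing: in $G=\mathbb{Z}$ (written additively, so $KN$ becomes $K+N$), take $K=4\mathbb{Z}$ and $N=6\mathbb{Z}$. Then $K\cap N = 12\mathbb{Z}$ and $K+N = 2\mathbb{Z}$, so

$$\frac{4\mathbb{Z}}{12\mathbb{Z}} \;\cong\; \frac{2\mathbb{Z}}{6\mathbb{Z}},$$

and both sides are cyclic of order $3$.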

So in summary:

  1. First Isomorphism Theorem tells you that images of groups correspond to quotients and vice-versa.

  2. Third Isomorphism Theorem tells you that this correspondence plays well with composition.

  3. Fourth Isomorphism Theorem tells you that there is a very nice correspondence between the subgroups of $f(G)$ and the subgroups of $G$ that contain $[e]_f$.

  4. And the Second Isomorphism Theorem tells you how the rest of the subgroups of $G$ behave under the homomorphism $f$.

Thus, the importance of normal subgroups corresponds simply to the importance of homomorphisms. Images of a group are like "shadows" of the group, and so will hopefully sometimes be easier to understand. Simple groups are the ones that we cannot simplify this way: we'll just have to stare at them intently until we understand them. And if we can understand simple groups, and we can understand how to put groups together (group extensions) from $N$ and $G/N$, then perhaps we can leverage our (hypothetical) understanding of simple groups into an (even more hypothetical) understanding of all groups. Turns out this is too naïve a hope, unfortunately, but perhaps it can help justify why we care about morphisms, normal subgroups, quotients, etc.
