There are many, many reasons why kernels are important to group theory, but here's just one way of appreciating the kernel in a fairly isolated context.
If we zoom out a bit, any set-function $f: A \to B$ (here $A$ and $B$ are simply sets) naturally partitions $A$ into equivalence classes, and for $a \in A$, the equivalence class of $a$ is given by
$$[a] = \{a' \in A : f(a') = f(a)\};$$
the set of all elements of $A$ that get mapped to the same thing as $a$. The same logic applies if $f : G \to H$ isn't just a set function, but a homomorphism of groups.
With the equivalence class notation, the kernel of $f$ is simply the equivalence class of the identity $e_G$ of $G$,
$$\ker f = [e_G],$$
since any homomorphism $f: G \to H$ always sends the identity $e_G$ of $G$ to the identity $e_H$ of $H$. What can we say about arbitrary $g, g' \in G$ such that $f(g) = f(g')$? That is, what can we say about the equivalence class $[g]$ for any $g \in G$?
Claim: For a homomorphism $f: G \to H$ and $g, g' \in G$, we have $f(g) = f(g')$ if and only if there exists some $k \in \ker f$ such that $gk = g'$; that is, $g$ and $g'$ differ by a multiple of something in the kernel of $f$. In particular, $[g] = \{gk: k \in \ker f\}$, which has size $|\ker f|$.
($\Longrightarrow$) Supposing $f(g) = f(g')$, note that there exists a unique $g^* = g^{-1}g' \in G$ such that $gg^* = g'$. Then
$$f(g) = f(g') = f(gg^*) = f(g)f(g^*),$$
and left-multiplication by $f(g)^{-1}$ shows that $e_H = f(g^*)$, hence $g^* \in \ker f$.
($\Longleftarrow$) Homework.
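The claim is easy to check numerically. Here is a quick Python sanity check using a homomorphism of my own choosing (not from the text above): $f(x) = x \bmod 4$ from $\Bbb Z_{12}$ to $\Bbb Z_4$, written additively, where the kernel is $\{0, 4, 8\}$.

```python
# A minimal sketch: verify that each fiber of f is a coset of the kernel.
# f : Z_12 -> Z_4, f(x) = x mod 4, is well defined since 4 divides 12.
G = range(12)
f = lambda x: x % 4

kernel = [x for x in G if f(x) == 0]   # [e_G]_f = {0, 4, 8}

for g in G:
    # the equivalence class of g: everything mapped where g is mapped ...
    fiber = {x for x in G if f(x) == f(g)}
    # ... equals {g + k : k in ker f} (the group is written additively here)
    coset = {(g + k) % 12 for k in kernel}
    assert fiber == coset and len(fiber) == len(kernel)

print("claim verified for f : Z_12 -> Z_4")
```

Every fiber turns out to have exactly $|\ker f| = 3$ elements, as the claim predicts.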
If you've ever heard homomorphisms described as functions that "respect" the group operation(s), the size of the kernel is a measure of just how "respectful" a given homomorphism is! A large kernel means that more of the structure of the group $G$ is "ignored" when transported to the group $H$.
Edit:
For "respectful", imagine two situations, considering $S_3$, the symmetric group of degree $3$. There's a sign homomorphism $\operatorname{sgn} : S_3 \to \{-1, 1\} = C_2$ to the multiplicative group $C_2$ sending each permutation to its sign. Its kernel is the alternating group $A_3 = \{1, (123), (132)\}$ of "even" permutations, and in the image $C_2$, almost all of the structure of $S_3$ is ignored; we forget everything but whether a permutation is even or odd.
On the other hand, we have a "copy" of $S_3$ as a subgroup of $S_4$ if we consider all permutations of $S_4$ that leave $4$ fixed. This leads to an "inclusion" homomorphism $\iota: S_3 \to S_4$, sending each permutation to its "copy" in $S_4$. This inclusion homomorphism has only the identity of $S_3$ in its kernel, and is considerably more "respectful" than the sign homomorphism; every bit of information about $S_3$ shows up in the image $\iota(S_3)$.
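If you want to experiment, the sign homomorphism on $S_3$ takes only a few lines of Python; the helper names (`sign`, `compose`, `S3`) are mine, chosen just for this sketch.

```python
from itertools import permutations

def sign(p):
    # parity of the number of inversions of the tuple p
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

S3 = list(permutations(range(3)))                      # S_3 acting on {0, 1, 2}
compose = lambda p, q: tuple(p[q[i]] for i in range(3))

# sgn is a homomorphism into {-1, 1} ...
assert all(sign(compose(p, q)) == sign(p) * sign(q) for p in S3 for q in S3)

# ... and its kernel is A_3, the three even permutations
kernel = [p for p in S3 if sign(p) == 1]
print(len(kernel))   # 3
```

Both fibers of $\operatorname{sgn}$ have size $3 = |A_3|$, matching the claim from the first part of the answer.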
I know you've gotten one satisfactory answer, but let me weigh in here a bit.
I'll first mention that what you present is generally correct, and a valid way to approach this. The idea of "breaking down a (finite) group into smaller pieces" is in fact behind the idea of classifying finite simple groups (groups that cannot be broken down), together with the theory of group extensions (trying to understand what a group $G$ "is" if you have a normal subgroup $N\triangleleft G$, and you understand both $N$ and $G/N$).
But let me offer you a different perspective and a different way into the isomorphism theorems...
After you learn about groups, subgroups, Lagrange's Theorem, and maybe Cauchy's Theorem, we come to a crossroads in how to try to better understand a given group.
One way to try to learn things about a given group $G$ is to just stare at it until you notice some interesting things about $G$. However, generally speaking, a much more fruitful approach in algebra is to take a less static approach and to consider two things: what the group $G$ "can do", and how it interacts with other groups.
What a group "can do" is in fact historically how groups were originally understood. The original notion of a group was a "group of permutations": a collection of operations acting on a set in specific ways. Even as late as the turn of the 20th Century, Burnside's book on groups still defines a group as a collection of "operators" acting on "some objects". It was Cayley who introduced the abstract definition of a group as a "set with an associative binary operation satisfying certain conditions", and then immediately went on to prove that this did not change the objects of study, as any "group of permutations" was a group under his new proposed definition, and any object that satisfied this new proposed definition could be understood as a "group of permutations". This is the notion behind Cayley's Theorem, and why it is, in my opinion, more important historically than practically today. But this already introduces the notion of functions: what does "can be understood as a group of permutations" mean? It means you can biject it with such a group in a way that respects the operation.
This also leads us to functions. To justify why we want to think about functions, let's consider two areas where functions play a major role: the real numbers/calculus, and linear algebra.
The key property of the real numbers was that they were "continuous": they have no 'holes'. Rather than just stare at real numbers and see if we can say interesting things about them, it turns out to be much more fruitful and interesting to consider functions from $\mathbb{R}$ to itself that respect this "continuity". And so we get the notion of continuous functions, and the study of continuous functions, as a way to shed light on the nature of the real numbers themselves.
Similarly, with Linear Algebra, staring at vector spaces only takes you so far; the real power of vector spaces only emerges when you start considering linear transformations.
In both cases, you don't just want any old function; you want functions that "preserve" whatever it is that makes your objects interesting. For real numbers, continuity; for vector spaces, the addition and scalar product.
So with groups. A group is characterized by three things (bear with me): a binary operation $G\times G\to G$, that assigns to any pair of elements $g_1,g_2\in G$ their "product" $g_1g_2$. A distinguished element $e_G\in G$ with the property that $ge_G=e_Gg=g$ for all $g\in G$. And a function $G\to G$ that assigns to every element $g\in G$ its "inverse", $g^{-1}$, which has the property that $gg^{-1}=g^{-1}g=e_G$.
So if we have two groups $G$ and $H$, then a "function that preserves this structure" would be a function $f\colon G\to H$, such that
1. Respects products: if $g_1,g_2\in G$, then $f(g_1g_2) = f(g_1)f(g_2)$.
2. Respects the identity: $f(e_G) = e_H$.
3. Respects inverses: if $g\in G$, then $f(g^{-1}) = (f(g))^{-1}$.
It turns out that two of these conditions are superfluous, but that is how we want to start. As you know, if 1 holds for a function between groups, then 2 and 3 will automatically hold as well. One can then define a group homomorphism as simply a function that satisfies 1, and prove 2 and 3; I prefer to define it as a function that satisfies 1, 2, and 3, and then prove that if it satisfies 1, then it must satisfy 2 and 3. The reason I prefer this is that I think it makes the definition more natural.
Okay, so these are the functions that will play the role of "linear transformations" and "continuous functions". We call them, as I mentioned above, "group homomorphisms." They also are the type of functions needed in Cayley's argument that any group "can be seen" as a group of permutations, because that corresponds to a one-to-one function $f\colon G\to S_X$ (for some set $X$), that satisfies 1, 2, and 3. So that $f(G)$ is "essentially the same" (as far as the group structure is concerned) as $G$, but now it consists of permutations on a set $X$.
Now, given a function $f\colon G\to H$ (in fact, any function between two sets $X$ and $Y$), there is a natural equivalence relation that we can define on $G$. Let us say that two elements $x,y\in G$ are "$f$-equivalent", $x\sim_f y$, if and only if $f(x)=f(y)$. This is easily verified to be an equivalence relation, and so it partitions $G$ into equivalence classes.
But because $f$ is a group homomorphism, we have the following consequences: if $x\sim_f y$ and $z\sim_f w$, then $xz\sim_f yw$, and $x^{-1}\sim_f y^{-1}$. So we can make the set of equivalence classes, $G/\sim_f$ into a group! Let $[x]_f$ be the equivalence class of $x$. Then we can define $[x]_f*[y]_f = [xy]_f$, $e_{G/\sim_f} = [e_G]_f$, and $([x]_f)^{-1} = [x^{-1}]_f$. It is then an easy exercise to show that this is indeed a group.
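The well-definedness of this operation on classes can be checked concretely. The sketch below uses an example of my own (not from the text): $f(x) = x \bmod 4$ on $\Bbb Z_{12}$ under addition, verifying that the class of a product depends only on the classes of the factors.

```python
# A sketch of the quotient construction G/~_f for a concrete f.
G = range(12)
f = lambda x: x % 4              # homomorphism Z_12 -> Z_4
op = lambda x, y: (x + y) % 12   # the group operation, written additively

# the equivalence classes of ~_f, keyed by the common value f(x)
classes = {}
for x in G:
    classes.setdefault(f(x), frozenset(y for y in G if f(y) == f(x)))

# well-definedness: [x]_f * [y]_f := [xy]_f does not depend on the
# representatives x and y chosen from their classes
for a in classes.values():
    for b in classes.values():
        products = {f(op(x, y)) for x in a for y in b}
        assert len(products) == 1

print("operation on G/~_f is well defined")
```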
What relation does this group have with $G$ and with $f$? Well, that's the first isomorphism theorem: there is a bijective group homomorphism between the group $G/\sim_f$ and the image group $f(G)$, given by sending $[x]_f$ to $f(x)$.
How is this related to normal subgroups? Ah, well, these equivalence classes have an interesting property: because of the property that $x\sim_f y$ implies $x^{-1}\sim_f y^{-1}$, and if $x\sim_f y$ and $z\sim_f w$ then $xz\sim_f yw$, we have
$$x\sim_f y \iff xy^{-1}\sim_f e_G.$$
That is: we can completely determine the equivalence relation by just knowing $[e_G]_f$. Moreover, this collection is a subgroup of $G$!
Will any subgroup work? It turns out not every subgroup does. If $N$ is a subgroup and we try to define an equivalence relation $x\sim y$ if and only if $xy^{-1}\in N$, we do get an equivalence relation, but not one that lets you define a group structure on $G/\sim$. The condition that lets you do that is precisely that $N$ must be a normal subgroup. I go into much more detail about this in this answer.
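To see the failure concretely, here is a small Python sketch (my own example, not from the linked answer): for the non-normal subgroup $\langle(1\,2)\rangle$ of $S_3$, multiplication of equivalence classes is ill defined.

```python
from itertools import permutations

S3 = list(permutations(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
inverse = lambda p: tuple(sorted(range(3), key=lambda i: p[i]))
e = (0, 1, 2)

H = [e, (1, 0, 2)]   # the non-normal subgroup <(1 2)> of S_3 (0-indexed)
related = lambda x, y: compose(x, inverse(y)) in H   # x ~ y  iff  x y^{-1} in H

# hunt for a witness that [x][y] := [xy] is ill defined
bad = [(x, xp, y, yp)
       for x in S3 for xp in S3 for y in S3 for yp in S3
       if related(x, xp) and related(y, yp)
       and not related(compose(x, y), compose(xp, yp))]

print(len(bad) > 0)   # True: ~ does not yield a group structure on S_3/~
```

Replacing `H` by the normal subgroup $A_3$ makes the list of witnesses empty, in line with the theorem below.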
So, "good" equivalence relations, those coming from functions (they are called "congruences"), correspond to normal subgroups. In fact, as with Cayley's Theorem before (which gave a separate definition of "group" and then showed it was really the same as the old one), so it is with "good" equivalence relations:
Theorem. A subgroup $N$ of a group $G$ is normal in $G$ if and only if there exists a group $H$ and a homomorphism $f\colon G\to H$ such that $N=[e_G]_f$.
This then leads to the usual First Isomorphism Theorem, which says: this construction of taking quotients of a group is "essentially the same" as looking at the image of $G$ under a group homomorphism, in that given any homomorphism $f\colon G\to H$, if $N=[e_G]_f$, then $G/N = G/\sim_f$ is "essentially the same" as $f(G)$: there is a bijective group homomorphism between them.
The Third Isomorphism Theorem corresponds to compositions of morphisms: if $f\colon G\to H$ and $g\colon H\to L$ are homomorphisms, then the quotient $f(G)/\sim_g$ is essentially the same as $G/\sim_{g\circ f}$. That is, if $N\triangleleft G$, $K\triangleleft G$, and $N\subseteq K$, then $K/N \triangleleft G/N$ and $(G/N)/(K/N)\cong G/K$.
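As a numerical sanity check (the example $G=\Bbb Z_{12}$, $N=\langle 6\rangle$, $K=\langle 3\rangle$ is my own choice, not from the text), one can compare the relevant orders directly in Python:

```python
# Third Isomorphism Theorem order check in Z_12.
G = set(range(12))
N = {0, 6}            # <6> in Z_12
K = {0, 3, 6, 9}      # <3> in Z_12; N is contained in K, both normal (G abelian)

coset = lambda x, S: frozenset((x + s) % 12 for s in S)
G_mod_N = {coset(x, N) for x in G}   # 6 cosets
K_mod_N = {coset(x, N) for x in K}   # 2 cosets, a subgroup of G/N
G_mod_K = {coset(x, K) for x in G}   # 3 cosets

# |(G/N)/(K/N)| = |G/N| / |K/N| matches |G/K|
print(len(G_mod_N) // len(K_mod_N), len(G_mod_K))   # 3 3
```

Both quotients have order $3$, and indeed both are cyclic of order $3$.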
The Fourth, or lattice, Isomorphism Theorem establishes a correspondence between the subgroups of $f(G)$ and the subgroups of $G$ that contain $[e]_f$. One then asks... okay, and what about other subgroups of $G$? That's what the Second Isomorphism Theorem gives you: if $f\colon G\to H$ is a homomorphism, and $K$ is an arbitrary subgroup of $G$, then $f(K)$ corresponds to $K/(K\cap N)$ (where $N=[e]_f$). And this image is "the same" as the image of $KN$. That is,
$$\frac{K}{K\cap N} \cong \frac{KN}{N}.$$
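This identity can be verified in a small case. The sketch below uses $G = S_3$, $N = A_3$, $K = \langle(1\,2)\rangle$ (my own choice of example) and simply compares orders:

```python
from itertools import permutations

S3 = list(permutations(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
sign = lambda p: (-1) ** sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3))

N = {p for p in S3 if sign(p) == 1}   # A_3, normal in S_3
K = {(0, 1, 2), (1, 0, 2)}            # the subgroup <(1 2)> (0-indexed)

KN = {compose(k, n) for k in K for n in N}   # here KN is all of S_3
K_cap_N = K & N                              # just the identity

# |K / (K ∩ N)| should equal |KN / N|
print(len(K) // len(K_cap_N), len(KN) // len(N))   # 2 2
```

Both sides have order $2$: the whole of $K$ survives in the image, since $K$ meets the kernel only in the identity.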
So in summary:
First Isomorphism Theorem tells you that images of groups correspond to quotients and vice-versa.
Third Isomorphism Theorem tells you that this correspondence plays well with composition.
Fourth Isomorphism Theorem tells you that there is a very nice correspondence between the subgroups of $f(G)$ and the subgroups of $G$ that contain $[e]_f$.
And the Second Isomorphism Theorem tells you how the rest of the subgroups of $G$ behave under the homomorphism $f$.
Thus, the importance of normal subgroups corresponds simply to the importance of homomorphisms. Images of a group are like "shadows" of the group, and so will hopefully sometimes be easier to understand. Simple groups are the ones that we cannot simplify this way: we'll just have to stare at them intently until we understand them. And if we can understand simple groups, and we can understand how to put groups together (group extensions) from $N$ and $G/N$, then perhaps we can leverage our (hypothetical) understanding of simple groups into a (even more hypothetical) understanding of all groups. Turns out this is too naïve a hope, unfortunately, but perhaps it can help justify why we care about morphisms, normal subgroups, quotients, etc.
Best Answer
"However, I can't wrap my head around what A/ker(f) really is, or how to visualize it in my head."
It's a quotient group $G/N = \{x+N\mid x\in G\}$ (written additively here), where $G$ is a group and $N$ is a normal subgroup.
For instance, take $G=\Bbb R^n$ and $U$ be a subspace. Then $G/U$ consists of the affine subspaces $x+U$ which are parallel to $U$ with "shift vector" $x\in G$.
The example $f:\Bbb Z\rightarrow\Bbb Z$ is a bit misleading, since every homomorphism here is multiplication by a fixed integer $n$: the kernel is either trivial (when $n\neq 0$) or all of $\Bbb Z$ (when $n=0$), so the quotient is nothing new.