We learn math with numbers early on. We learn how to apply operations to numbers to get new numbers. We learn rules, and consequences of those rules. All of that is pretty straightforward.
But the real numbers are not the only things we might want to examine in detail. The study of how elements interact under operations is a more general, abstract version of what we do with numbers when we do algebra.
For instance, maybe we want to examine what a shape looks like if we rotate it around. Maybe you run a supply chain and need to build 4 widgets, but only some of those widgets need to be built in a certain order. Could you rearrange things to make the process more efficient? Maybe we want to explore structures that have a fundamental periodicity, like the time of day.
Over time, we have constructed concepts of structures that elements can belong to, and notions of operations on these structures. These structures -- groups, fields, rings, monoids, modules, vector spaces, etc. -- don't come with a natural set of rules, per se. We make up those rules (the axioms), but we have found that many natural concepts adhere to them.
This is all well and good, but somewhat useless, until you learn about isomorphism. Exploring what a group or a ring is, on its own, is fine. But the richness of abstract algebra comes from the idea that you can use easy-to-understand abstractions of a concept to explain more complex behavior! Adding hours on a clock is like working in a cyclic group, for instance. Or a manufacturing process might be shown to be isomorphic to a product of permutations in a finite group.
Abstract algebra is what happens when we want to explore consequences of rules and properties on collections of objects of any type -- hence the term "abstract"!
Saying that $f$ is a bijective function $A\to A$ is very vague and amorphous: it doesn't spell out in any detail what $f$ can look like, or what the set of all such $f$s looks like. There are two main issues with picturing graphs of polynomial functions when one thinks of functions. First, the graph of a function (on an arbitrary set) is not always the best way to think about it; and second, not all functions have nice explicit formulas defining them (in a sense, "most" functions on an infinite set cannot be specified by symbols at all). Indeed, the set $A$ need not have any algebraic structure; it could very easily be a barren set.
Two major ways of thinking about what a permutation looks like are inherent in the main two types of notation: one-line notation and cycle notation. The precursor to one-line notation is two-line notation. The array of values given in Timbuc's answer is an example of two-line notation: the top row is just the numbers $1$ through $n$, and the bottom row tells us where the entries in the row above are sent by the given permutation. One-line notation compactifies this by not bothering with the top row, as it is ultimately superfluous.
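For concreteness, here is a small example of my own (not the one from Timbuc's answer): the permutation of $\{1,2,3,4,5\}$ sending $1\mapsto 2$, $2\mapsto 1$, $3\mapsto 4$, $4\mapsto 5$, $5\mapsto 3$ is written
$$\begin{pmatrix}1&2&3&4&5\\2&1&4&5&3\end{pmatrix}$$
in two-line notation, and simply $2\,1\,4\,5\,3$ in one-line notation.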
In order to understand cycle notation one must first know what a cycle is. A cycle is exactly what it sounds like: a set of numbers (not necessarily the whole set) being cycled through in some order. The notation $(a_1~a_2~\cdots~a_k)$ means $a_1\mapsto a_2$, $a_2\mapsto a_3$, and so on, until finally $a_k\mapsto a_1$ (so it wraps around). Two cycles are disjoint if the sets of values they cycle are disjoint - that is, there is no value which is moved by both of them. Every permutation is a product of disjoint cycles, uniquely up to the order of the factors (and where each cycle starts), and writing it as such a product constitutes what we call cycle notation.
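Continuing my example from above, the permutation with one-line notation $2\,1\,4\,5\,3$ swaps $1$ and $2$ while cycling $3\mapsto 4\mapsto 5\mapsto 3$, so in cycle notation it is
$$(1~2)(3~4~5),$$
a product of two disjoint cycles.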
The group of permutations of a finite set $A$ can be represented by a group of $n\times n$ permutation matrices, where $n=|A|$. A matrix is a permutation matrix if and only if every row and column has precisely one instance of the entry $1$ and the rest of the entries are $0$. Multiplying permutation matrices yields another permutation matrix - namely, the one you'd get if you composed the two permutations and then determined the corresponding permutation matrix.
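If you want to experiment with this, here is a minimal sketch in Python with NumPy (the helper `perm_matrix` and the sample permutations are my own choices, not from the text) showing that multiplying permutation matrices composes the underlying permutations:

```python
import numpy as np

def perm_matrix(perm):
    """Permutation matrix for a permutation given in one-line notation.

    Column i of the matrix has its single 1 in row perm[i]-1, so that
    multiplying by the standard basis vector e_i gives e_{perm(i)}.
    """
    n = len(perm)
    M = np.zeros((n, n), dtype=int)
    for i, j in enumerate(perm):
        M[j - 1, i] = 1
    return M

# sigma = (1 2)(3 4 5) and tau = (1 3), both in one-line notation
sigma = [2, 1, 4, 5, 3]
tau = [3, 2, 1, 4, 5]

S, T = perm_matrix(sigma), perm_matrix(tau)

# The matrix product corresponds to composition: (S @ T) e_i = S (T e_i),
# i.e. it is the matrix of "first tau, then sigma".
composed = S @ T
print(composed)

# Sanity check: every row and column of the product has exactly one 1.
assert (composed.sum(axis=0) == 1).all() and (composed.sum(axis=1) == 1).all()
```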
This is a special case of a group representation. (Note that one can represent other types of algebraic objects, not just groups, and one can have the actions be more than just linear transformations on vector spaces, hence the occasional term "linear" representation.) In my philosophy, any action or representation of a group is not an intrinsic way of viewing what the group is; rather, it gives perspective on what the group can do to other things.
There is a strong colloquial understanding of what a permutation is, so if you have trouble getting an intuitive feel for permutations as a student of math, you might be getting too snagged on the term "function" or otherwise overthinking things. Indeed, if you ask a layperson, a non-mathematician, what a permutation is, they'd be able to tell you: the reordering of some arrangement, the scrambling of some sequence, the shuffling of some list, or a variant on that theme. Just imagine a shell game, but with any number of shells. (The only potential issue with this lazy description is that sets aren't necessarily ordered in a sequence.)
Finally, I want to mention a generalization of permutations, braids. They are represented by (or could be defined as) so-called braid diagrams, which look like this:
[image: a braid diagram on several strands]
(Source is this webpage, which gives further nontechnical explanation of braids. Sometimes braids are written horizontally instead of vertically.) The binary operation on these braids is simply concatenation, i.e. putting one on top of the other (it is not immediately obvious that braids have inverses, but they do). Notice the diagrams keep track of over/under data: at each crossing, they depict which string goes over top and which goes underneath. If we lose this information, just drawing lines that cross plainly, and allow ourselves to "cancel" consecutive crossings of the same two strings, then our diagrams in fact represent permutations. We can think of the ends of the strings down below as a set, and the braid diagrams shuffle them around until they end up in some new arrangement depicted by the ends up top.
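To make the "forgetting over/under data" step concrete, here is a toy Python sketch (the encoding of a braid word as a list of signed generator indices, and the function `underlying_permutation`, are my own conventions, not a standard API):

```python
def underlying_permutation(braid_word, n):
    """Underlying permutation of a braid word on n strands.

    braid_word is a list of nonzero integers: +i means the strand in
    position i crosses over the strand in position i+1, and -i means it
    crosses under. Forgetting the sign (the over/under data) leaves only
    the swap of positions i and i+1, which is why consecutive crossings
    of the same two strands cancel. Returns one-line notation: the entry
    at top position p is the bottom position of the strand ending there
    (one common convention, up to taking inverses).
    """
    strands = list(range(n))           # strand labels, read left to right
    for g in braid_word:
        i = abs(g) - 1                 # 0-indexed crossing position
        strands[i], strands[i + 1] = strands[i + 1], strands[i]
    return [s + 1 for s in strands]

print(underlying_permutation([1, -1], 3))  # [1, 2, 3]: the crossings cancel
print(underlying_permutation([1, 2], 3))   # [2, 3, 1]: a 3-cycle
```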
Best Answer
Informally speaking, an abelian group is a place where you can add things. It captures the essence of what defines the sum of ordinary 'counting numbers'.
This allows you - after some exposure - to handle very abstract objects with the same familiarity you have with ordinary numbers.
As with any abstract definition, it is not going to make sense unless you have some examples at hand. And as with most abstract definitions, the examples came first: many, many objects come with a binary operation that is associative and commutative (with a unit and inverses), so we gave it a name.
Any "number system" $R$ (technically, I mean a ring here) has a notion of sum $+$, and $(R,+)$ is an abelian group. This includes very familiar number systems such as the integers, rational, real and complex numbers.
But it also includes, for example, matrices over these number systems. In general, the product of matrices depends on the order of the factors, but their sum does not. Hence, we can sum matrices 'as if they were numbers'.
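A two-line check in Python with NumPy (the specific matrices are arbitrary choices of mine) makes the contrast concrete:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(np.array_equal(A + B, B + A))  # True: matrix addition commutes
print(np.array_equal(A @ B, B @ A))  # False: matrix multiplication does not
```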
Another example is the "number system" $\mathbb{Z}_{12}$, which behaves like hours on a clock. You can think of it as the numbers from $1$ to $12$, but here e.g. $11+4 = 3$. It may be uncomfortable to work with this at first, but knowing that $+$ behaves similarly to ordinary addition helps.
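As a sketch (the function name `clock_add` is my own, and reporting $12$ instead of $0$ is just to match a clock face), clock addition is ordinary addition modulo $12$:

```python
def clock_add(a, b):
    """Add hours on a 12-hour clock: arithmetic in Z_12, writing 12 for 0."""
    s = (a + b) % 12
    return s if s != 0 else 12

print(clock_add(11, 4))  # 3, matching 11 + 4 = 3 above
print(clock_add(5, 7))   # 12, which plays the role of the identity
```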
The list goes on, of course, but it becomes more abstract.
Another thing which may be useful to think about is... well, non-abelian groups. As we said before, in general for matrices $A,B$ we have $AB \neq BA$. Things get non-abelian quickly in real life too: putting on your jacket first and then your t-shirt is not the same as doing so in the reverse order.
Commutativity is far from a 'given', hence it is important to know when it does hold; it makes some things easier to organize. But as I said before, if you are not convinced that groups are important in the first place, then it may not be clear why their being abelian is a thing to care about.