So for the first question, one answer could be:
They don't have to have anything in common, although a couple of them do. They don't all stem from a single source, and their uses depend on which field you are in.
Generally
In plain English, the phrase "is characteristic of" means "is a distinguishing feature of." By extension, in mathematics it can be used for something that completely determines another thing. A generator of a cyclic group is a good example of this, since you can recover the entire group from a single generator.
More generally, mathematicians like to talk about "what characterizes" a certain thing. So, for example, the Artin-Wedderburn theorem is a characterization of the semisimple rings as finite products of matrix rings over division rings. The Ore condition is what characterizes which domains can be densely embedded in division rings.
Special uses
Characteristic subgroups in group theory are very special normal subgroups. Essentially, they are mapped onto themselves by every automorphism of the group, not just by the inner ones. This makes them even more special than normal subgroups.
The characteristic polynomial doesn't describe the matrix, but it does characterize information about the transformation that the matrix represents. Really the particular matrix representation is secondary to the transformation.
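As a small illustration of that last point, here is a sketch in plain Python. The matrices `A` and `P` and the helper names are my own choices for the example. It checks that a $2\times 2$ matrix and a conjugate of it, which represent the same transformation in two different bases, have the same characteristic polynomial $x^2 - \operatorname{tr}(M)\,x + \det(M)$.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of det(xI - M) = x^2 - tr(M) x + det(M)."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return (1, -trace, det)

# Illustrative data (assumptions for this example only).
A = [[2, 1], [0, 3]]
P = [[1, 1], [1, 2]]        # change of basis, det(P) = 1
P_inv = [[2, -1], [-1, 1]]  # inverse of P, integer because det(P) = 1

# B represents the same transformation as A, in the basis given by P.
B = matmul(matmul(P_inv, A), P)

print(A != B)                                  # the matrices differ
print(char_poly_2x2(A) == char_poly_2x2(B))    # the polynomials agree
```

The two matrices are different, but the polynomial only sees the transformation.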
Sometimes eigenvectors and eigenvalues are referred to as "characteristic" rather than "eigen." (This also suggests somewhat that "characteristic" is one of those overloaded math terms like "regular" or "normal" that pops up every time someone invents something new.)
The characteristic of a ring is important, but I don't know if I can tell you a single reason why. First of all, there are a lot of theorems that are first proven for characteristic zero. (I'm thinking in particular of some results in group rings.) This is usually because the very basic rings ($\Bbb Z,\Bbb R,\Bbb Q,\Bbb C$) are all characteristic zero, and maybe our intuition is better with them. Ordered fields, which are in some way bound up with our intuition of geometric length, are all characteristic zero too.
When the characteristic is nonzero, things are harder because you have to cope with a kind of (very interesting!) degeneracy. The characteristic 2 case seems like the "least nice" because of its appearance in some geometrically freak cases. Even though I've said "degeneracy" and "freak" now to describe these things, I still want to stress that they are interesting and important. Positive characteristic fields are the natural environment for algebraic coding theory, after all :)
The first time I encountered characteristic functions was in measure theory, where the characteristic function of a set is one which has value $1$ for elements of the set, and value $0$ elsewhere. Perhaps the motivation for calling it "characteristic" is that it clearly distinguishes which points are inside the set and which points are outside the set.
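To make that concrete, here is a minimal sketch in Python. The helper name `indicator` and the sets `A` and `B` are made up for the example. It builds the characteristic function of a finite set and checks the standard fact that the characteristic function of an intersection is the product of the two characteristic functions.

```python
def indicator(subset):
    """Return the characteristic function of `subset`: 1 on the set, 0 elsewhere."""
    members = frozenset(subset)
    return lambda x: 1 if x in members else 0

# Illustrative sets (assumptions for this example only).
A = {1, 2, 3}
B = {2, 3, 4}

chi_A = indicator(A)
chi_B = indicator(B)
chi_AB = indicator(A & B)

# The function alone tells you exactly which points are in the set.
print([x for x in range(6) if chi_A(x) == 1])

# chi_{A ∩ B}(x) = chi_A(x) * chi_B(x) for every point x.
print(all(chi_AB(x) == chi_A(x) * chi_B(x) for x in range(6)))
```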
Another use of "character" is the one from representation theory, where you talk about the character afforded by a representation. I'm not aware of the true origins of the term, but I've always thought of it like this. The character of a representation is a distillation of some of the information carried by the representation. The information is distilled into a function from the group into a field. There, you can divine several important characteristics of the group. You might have also derived them directly from the representation, but the character might make things easier.
There is also something I know nothing about called the Euler characteristic which is a topological invariant. "Invariant" could be considered semantically close to "characteristic." They are both often used to describe qualities that are intrinsic to the object.
Consider the element $1+I$ in $R/I$. Is it clear that this is the multiplicative identity in $R/I\,$? Then to say something about the characteristic, we need to look at $$\sum_{k=1}^n (1+I) = \left(\sum_{k=1}^n 1\right)+I\quad\text{(why?)}$$
But we know that this is the zero element of $R/I$ if $n = 8$, since the characteristic of $R$ is 8. This tells us two things: first, the quotient ring has positive characteristic, and second, that characteristic is at most 8 (in fact, it divides 8).
Also, the characteristic of the ring and the order are not related as you suggest since you can have polynomial rings like $\mathbb{Z}_8[x]$; this ring has characteristic 8 and is infinite.
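If you want to experiment with the quotient argument, here is a brute-force sketch in Python. It assumes the concrete case $R = \mathbb{Z}_8$ and $I = (d)$, which is my choice of example and not part of the question, and the function names are hypothetical. It looks for the smallest $k > 0$ with $k \cdot 1 \in I$, which is the characteristic of $R/I$.

```python
def ideal_in_Zn(d, n):
    """The principal ideal (d) inside Z_n, as a set of residues."""
    return {(d * j) % n for j in range(n)}

def char_of_quotient(d, n):
    """Smallest k > 0 with k*1 in the ideal (d) of Z_n.

    The loop always stops by k = n, because n*1 = 0 lies in every ideal.
    """
    ideal = ideal_in_Zn(d, n)
    k = 1
    while k % n not in ideal:
        k += 1
    return k

# d = 0 gives the zero ideal, d = 3 is a unit in Z_8 so the quotient is the
# zero ring, where 1 = 0 and this convention returns 1.
for d in (0, 2, 4, 3):
    print(d, char_of_quotient(d, 8))
```

In every case the result is at most 8, as the argument above predicts.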
Best Answer
There are two orderings of the set $\mathbb N = \{0,1,\dots\}$: the magnitude order, $a \leq b$, and the divisibility order, $a \mid b$.
They are mostly compatible: when $a \mid b$ and $b \neq 0$, we have $a \leq b$.
Some definitions are phrased using the "greater than" ordering, while in fact the "divisibility" ordering is the real essence.
For example, the greatest common divisor of $a$ and $b$ might be defined as the greatest number that divides both $a$ and $b$. The characteristic of a ring $R$ might be defined as the smallest number $n>0$ which satisfies $n \cdot 1 = 0$.
Under such commonly taught definitions, it seems natural that $\operatorname{gcd}(0,0)=\infty$ and $\operatorname{char} \mathbb Z = \infty$.
However, those definitions implicitly rely on ideals, and are better phrased using divisibility order. The incompatibility is then more visible: $0$ is the largest element in divisibility order, while it is smallest in magnitude order. Magnitude has no largest element, and often $\infty$ is added to cover this case.
So let's formulate the definitions again, but this time using divisibility ordering.
Characteristic is a "multiplicative" notion, like gcd. If there is a (unital) homomorphism of rings $f: A \to B$, then $\operatorname{char} B \mid \operatorname{char} A$. For example, you cannot map ${\mathbb Z}_2$ to ${\mathbb Z}_4$, since $4 \nmid 2$; in a sense, ${\mathbb Z}_2$ is "smaller" than ${\mathbb Z}_4$. "Bigger" rings have "more divisible" characteristics: their characteristics are greater in the sense of divisibility. And the "most divisible" number is 0. Another example is $\operatorname{char}(A \times B) = \operatorname{lcm}(\operatorname{char} A, \operatorname{char} B)$.
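The lcm formula is easy to check by hand for small cases. Here is a rough Python sketch, restricted to products $\mathbb{Z}_m \times \mathbb{Z}_n$ with $m, n \geq 1$; the function name is my own invention. It finds the characteristic by repeatedly adding the identity $(1, 1)$ until it reaches $(0, 0)$ and compares the result with the lcm.

```python
from math import gcd

def char_of_product(m, n):
    """Additive order of (1, 1) in Z_m x Z_n, found by repeated addition."""
    k = 1
    elem = (1 % m, 1 % n)   # the identity; "% m" handles the zero ring Z_1
    while elem != (0, 0):
        k += 1
        elem = ((elem[0] + 1) % m, (elem[1] + 1) % n)
    return k

def lcm(m, n):
    """Least common multiple of two positive integers."""
    return m * n // gcd(m, n)

for m, n in [(4, 6), (2, 4), (3, 5)]:
    print((m, n), char_of_product(m, n), lcm(m, n))
```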
In a bit more abstract language: given any ideal $I \subseteq \mathbb Z$, we associate to it the smallest nonnegative element, under the divisibility order. By properties of $\mathbb Z$, every other element of $I$ is a multiple of it. Let's call this number $\operatorname{min}(I)$.
We can now define $\operatorname{gcd}(a,b)=\operatorname{min} ((a) + (b))$, and $\operatorname{char} R = \min (\ker f)$, where $f \colon \mathbb Z \to R$ is the canonical map.
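For gcd, the divisibility-order definition can be tried out directly. The following is only a brute-force sketch for nonnegative integers, with hypothetical helper names, and it searches a finite range of candidates instead of working with ideals. It picks the common divisor that every other common divisor divides, and compares the answer with Python's `math.gcd`, which also returns $0$ for $\gcd(0,0)$.

```python
from math import gcd

def divides(d, n):
    """True when d | n, i.e. n = d*k for some integer k (so 0 | n only if n = 0)."""
    return n == 0 if d == 0 else n % d == 0

def gcd_by_divisibility(a, b):
    """The common divisor of a and b that every other common divisor divides.

    Candidates are limited to 0..max(a, b, 10). That loses nothing when a and
    b are not both zero, because their common divisors are then at most
    max(a, b). For a = b = 0 every number is a common divisor, and the search
    only samples 0..10.
    """
    limit = max(a, b, 10)
    common = [d for d in range(limit + 1) if divides(d, a) and divides(d, b)]
    for g in common:
        if all(divides(d, g) for d in common):
            return g

print(gcd_by_divisibility(12, 18), gcd(12, 18))
print(gcd_by_divisibility(0, 5), gcd(0, 5))
print(gcd_by_divisibility(0, 0), gcd(0, 0))
```

No $\infty$ is needed: for $(0,0)$ the only candidate divisible by everything is $0$, the largest element of the divisibility order.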
The definition of $\operatorname{min}(I)$ works for any PID (up to multiplication by a unit); it does not require the magnitude order. In any PID, $I = (\operatorname{min}(I))$.
(I dislike saying the ideal $\{0\}$ is "generated" by $0$; although this is true, it is also generated by the empty set. We do not say that $(2)$ is generated by $0$ and $2$.)