Here is a histogram for length 100. It seems to be normally distributed around $100^2 / 3$, which should put you on the scent, in spite of a complete absence of proof (see edit below).
This is not at all surprising, since if $x$ and $y$ are drawn uniformly at random from the interval $[0,1]$, the expected value of $|x-y|$ is
$$
\int_{x=0}^1 \left(\frac{x^2}{2} + \frac{(1-x)^2}{2}\right) = \frac 13.
$$
Maybe turning this fact into a proof is very straightforward; maybe it is tricky.
Edit: I should note that the expected value of $(n^2-1)/3$ actually is very easy to prove; it is just the distribution that might be tricky.
One way to generate a random permutation $\pi$ is to choose $n$ reals $r(i)$ uniformly at random from $[0,1]$. Now, what is the expected value of $|\pi(i) - \pi(i+1)|$? It is simply the expected number of $r(j)$ between $r(i)$ and $r(i+1)$, plus one. Because the expected difference between $r(i)$ and $r(i+1)$ is $1/3$, this is $1+(n-2)/3$, or $(n+1)/3$. Since we compute this for $n-1$ consecutive pairs, linearity of expectation tells us that the expected sum is $(n-1)(n+1)/3$, or $(n^2-1)/3$.
The surreal numbers exhibit much stronger universal properties
than you have
mentioned, for they also exhibit very strong homogeneity and saturation
properties. For example, every automorphism of a set-sized elementary substructure of
the surreals extends to an automorphism of the entire surreal
numbers, and every set-sized type over the surreals that is consistent with the
theory of the surreal numbers is realized in the surreal numbers. That is, any first-order property that could be true about an object as it relates to some surreal numbers, which is consistent with the theory of the surreal numbers, is already true about some surreal number.
For any complete first-order theory $T$, one can consider the
concept of a monster model
of the theory. This is a model $\mathcal{M}$ of $T$, such that
first, every other set-sized model of $T$ embeds as an elementary
substructure of $\mathcal{M}$--not merely as a substructure, but
as a substructure in which the truth of any first-order statement
has the same truth value in the substructure as in big model--and
second, such that every automorphism of a set-sized substructure
of $\mathcal{M}$ extends to an automorphism of $\mathcal{M}$.
One may also use approximations to these proper class monster
models by considering extremely large set-sized models, of some
size $\kappa$, such that the embedding and homogeneity properties
hold with respect to substructures of size less than $\kappa$.
Every consistent complete first order theory $T$ has such monster
models, and they are used pervasively in model theory. Model
theorists find it convenient, when considering types over and
extensions of a fixed model, to work inside a fixed monster model,
considering only extensions that arise as submodels of the fixed
monster model.
Your topic also has a strong affinity with the concept of the
Fraïssé limit of a collection
of finitely-generated (or $\kappa$-generated) structures, which one aims to
be the age of the limit structure, the collection of
finitely-generated ($\kappa$-generated) substructures of the limit
structure. Fraïssé limits are often built to exhibit the same
saturation and homogeneity properties of the surreal numbers. As a
linear order, the surreal numbers are the Fraïssé limit of the
collection of all set-sized linear orders. And I believe that one
can also put the field structure in here.
Update. When one has the global axiom of choice, the set-homogeneous property of the generalized Fraïssé limit allows one to establish universality for proper class structures. Basically, using global AC one realizes a given proper class structure as a union of a tower of set structures, and gradually maps them into the homogeneous structure. The homogeneity property is exactly what you need to keep extending the embedding, and so one gets the whole proper class structure mapping in. This kind of argument, I believe, shows that what you want to consider is homogeneity rather than merely the universal property itself. (This argument is an analogue of the idea that when one has homogeneity for countable substructures, then one gets universality for structures of size $\aleph_1$.)
Regarding your specific construction, here is a simpler way to undertake the same idea, which avoids the need for the equivalence relation: Let $G$ be the proper class of all fixed-point-free
permutations of a set. This class supports a natural group
operation, which is to compose them, regarding elements outside
the domain as fixed by the permutation, and then cast out any newly-created fixed points. The identity element of $G$ is
the empty function, which is really a stand-in for the identity
function on the universal class.
It is clear that every group finds an isomorphic copy inside $G$,
without using the axiom of choice, since every group is naturally
isomorphic to a group of permutations, and these are naturally
embedded into $G$, simply by casting out fixed-points. This does not use the axiom of choice.
The class group $G$ is a natural presentation of the set-support symmetric
group $\text{Sym}_{\text{set}}(V)$ of the set-theoretic universe
$V$, the class of all permutations of $V$ having set support. Any such permutation of $V$ is represented in $G$ by restricting to the non-fixed-points.
Note finally that in the case of the surreal numbers, one doesn't need the axiom of
choice in order to construct the surreal numbers, and one can get
many universal properties for well-orderable set structures and in
general from the axiom of choice for all set-sized structures. But
to get the universal property that you mention for class-sized structures, mere AC
is not enough, for one needs the global axiom of choice, which is
the assertion that there is a proper class well-ordering of the
universe. This does not follow from AC, for there are models of
Gödel-Bernays set theory GB that have AC, but not global
choice. But global AC is sufficient to carry out the embedding of any class linear order into No. A similar phenomenon arises with many other class-sized class-homogeneous set-saturated models, which require global AC to get the universality for class structures.
Best Answer
The first thing to notice is that infinite permutations may have infinite support, that is, they may move infinitely many elements. Therefore, we cannot expect to express them as finite compositions of permutations having only finite support.
But if we allow (well-defined) infinite compositions, then the answer is that every permutation can be expressed as a composition of disjoint cycles and also expressed as a composition of transpositions. So the answer to question 1 is yes, and the answer to question 2 is no.
To see this, suppose that f is a permutation of ω. First, we may divide f into its disjoint orbits, where the orbit of n is defined as all the numbers of the form fk(n) for any integer k. The action of f on each of these orbits commute with each other, because the orbits are disjoint. And the action of f on each such orbit is a cycle (possibly infinite). So f can be represented as a product of disjoint cycles. For the transposition representation, it suffices to represent each such orbit as a suitable product of transpositions. The finite orbits are just finite cycles, which can be expressed as a product of transpositions in the usual way. An infinite orbit looks exactly like a copy of the integers, with the shift map. This can be represented in cycle notation as (... -2 -1 0 1 2 ...). This permutation is equal to the following product of transpositions:
I claim that every natural number is moved by at most two of these transpositions, and that the resulting product is well-defined. On the right hand side of the equality, I have two infinite products of transpositions. Using the usual order of product of permutations, the right-most factor is first to be applied. Thus, we see that 0 gets sent to 1, and subsequently fixed by all later transpositions. So the product sends 0 to 1. Similarly, 1 gets sent to 0 and then to 2, and then unchanged. Similarly, it is easy to see that every non-negative integer n is sent to 0 and then to n+1 as desired. Now, the right-hand factor fixes all negative integers, which then pass to the left factor, and it is easy to see that again -n is sent to 0 and then to -n+1, as desired. So altogether, this product is operating correctly. An isomorphic version of this idea can be used to represent the action of any infinite orbit, and so every permutation is a suitable well-defined product of transpositions, as desired.
Thus, the answers to the questions in (1) are yes, and the answer to question (2) is no.