Suppose that $f\colon A\to B$ is surjective. Then for every $b\in B$ the set $F_b=\{a\in A\mid f(a)=b\}$ is non-empty. Therefore, using the axiom of choice, there is some $g$ which selects an element from each $F_b$, that is, $g(F_b)\in F_b$ for every $b\in B$.
Now show that $g$ is actually a function from $B$ into $A$ (via $b\mapsto g(F_b)$), and that $g$ is injective.
(You can't avoid the axiom of choice in this proof, because in fact this statement is equivalent to the axiom of choice, and often is taken as the statement of the axiom of choice.)
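For finite sets the construction can be carried out explicitly, and no axiom of choice is needed there, so the following is only an illustration of the idea. It is a minimal sketch in which the sets `A`, `B`, the map `f`, and the helper names `fibers` and `right_inverse` are all assumptions made for the example; `min` merely stands in for the choice function.

```python
def fibers(f, A, B):
    """Return the dict b -> F_b = {a in A : f(a) = b}."""
    return {b: {a for a in A if f(a) == b} for b in B}


def right_inverse(f, A, B):
    """Pick one element from each fiber F_b; raise if some fiber is empty."""
    F = fibers(f, A, B)
    if any(not F_b for F_b in F.values()):
        raise ValueError("f is not surjective onto B")
    # min plays the role of the choice function here (an assumption of this
    # sketch); any rule that picks an element of F_b would work equally well.
    return {b: min(F_b) for b, F_b in F.items()}


# Hypothetical example: f(a) = a mod 3 maps {0,...,5} onto {0,1,2}.
A = range(6)
B = range(3)
f = lambda a: a % 3

g = right_inverse(f, A, B)

# g is a right inverse of f, and it is injective.
is_right_inverse = all(f(g[b]) == b for b in B)
is_injective = len(set(g.values())) == len(B)
```

Swapping `min` for `max` gives a different but equally valid $g$, which is the sense in which the choice is arbitrary.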
I don't have a problem with their use of the phrase "using the axiom of choice" because it is not as binding as "by the axiom of choice". The word "using" suggests some details are hidden; here are those details.
To be absolutely precise, we need to introduce two operations on maps between sets.
If $f:A\to B$ is any map, define the direct image map $f_*:P(A)\to P(B)$ by $f_*(U) = \{f(a) : a \in U\}$ and define the inverse image map $f^*:P(B)\to P(A)$ by $f^*(V) = \{a\in A:f(a)\in V\}$.
With this notation, the preimage of a point $y$ under $f$, typically denoted $f^{-1}(y)$, is actually $f^*(\{y\})$.
The family $X = \{f^*(\{y\}) : y\in B\}$ is a collection of pairwise disjoint sets; none of them is empty because $f$ is surjective. By the axiom of choice (your lemma 9.2) there exists a map $$g_1:X\to\bigcup_{C\in X} C = A$$ such that $g_1(C) \in C$ for all $C$ in $X$.
Note this means $g_1$ is injective, because if $g_1(C) = g_1(D)$ then this common value lies in $C\cap D$, which means $C\cap D\ne\varnothing$, which means $C = D$ since the sets in $X$ are pairwise disjoint.
Since every element of $X$ is of the form $f^*(\{y\})$ for some $y$ in $B$,
we are prompted to make the following definition.
Define $$g_2:B\to X$$
by $g_2(y) = f^*(\{y\})$.
Note that $g_2$ is also injective because if $g_2(x) = g_2(y)$ then applying $f_*$ to both sides yields $\{x\} = \{y\}$, hence $x = y$. Indeed,
$$f_*(g_2(x)) = \{f(a):a\in f^*(\{x\})\} = \{f(a):a\in\{a\in A:f(a)\in\{x\}\}\} = \{f(a):a\in\{a\in A:f(a) = x\}\} = \{x\}.$$
Now let $g$ be the composition of the two injections $g_1\circ g_2:B\to A$, and the rest should be clear: $g(y)$ is an element of $f^*(\{y\})$, so $f(g(y))$ is an element of $\{y\}$, which means $f(g(y)) = y$.
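The pieces above can be checked on a small finite example. This is only a sketch under stated assumptions: the sets `A`, `B` and the surjection `f` are made up for illustration, the names `direct_image` and `inverse_image` are mine, and `max` is used as an arbitrary stand-in for the choice function $g_1$.

```python
def direct_image(f, U):
    """f_*(U) = {f(a) : a in U}."""
    return frozenset(f(a) for a in U)


def inverse_image(f, A, V):
    """f^*(V) = {a in A : f(a) in V}."""
    return frozenset(a for a in A if f(a) in V)


# Hypothetical surjection: 0,1 -> 0; 2,3 -> 1; 4,5 -> 2.
A = range(6)
B = range(3)
f = lambda a: a // 2


def g2(y):
    """g_2(y) = f^*({y}), the fiber over y."""
    return inverse_image(f, A, {y})


# X is the family of fibers; frozensets are hashable, so they can be set members.
X = {g2(y) for y in B}

# g_1 picks one element from each C in X (max is an arbitrary picking rule).
g1 = {C: max(C) for C in X}


def g(y):
    """g = g_1 composed with g_2."""
    return g1[g2(y)]


# f_*(g_2(y)) = {y} for every y, and f(g(y)) = y.
identity_holds = all(direct_image(f, g2(y)) == frozenset({y}) for y in B)
is_right_inverse = all(f(g(y)) == y for y in B)
```

Here `X` has exactly as many members as `B`, which reflects the injectivity of $g_2$, and distinct fibers receive distinct chosen elements, which reflects the injectivity of $g_1$.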
Best Answer
For $\boxed{\Leftarrow}$: assume $f$ is surjective. Then for every $y\in Y$ there exists $x_y\in X$ such that $f(x_y)=y$. Define $g\colon Y\to X$ to be the function which maps each $y\in Y$ to such an $x_y$. (If there is more than one such $x$, then $g$ maps $y$ to exactly one of them, chosen in an arbitrary way; this excludes the possibility that $g$ maps $y$ to two distinct values, in which case it wouldn't be a function.) It follows that $$ \forall y\in Y,\quad f\circ g(y) = f( g(y) ) = f(x_y) = y $$ and $f\circ g=id_Y$.
For $\boxed{\Rightarrow}$: assume $f\colon X\to Y$ and $g\colon Y\to X$ are such that $f\circ g=id_Y$. Then for each $y\in Y$, $x_y\stackrel{\rm def}{=} g(y)\in X$ is a preimage of $y$ under $f$, as $f(x_y)=f\circ g(y) = id_Y(y)=y$. Hence $f$ is surjective.
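On small finite sets, both directions can be sanity-checked by brute force. The sketch below assumes hypothetical sets `X`, `Y` and two made-up maps `f1` and `f2`. The helper names `has_right_inverse` and `is_surjective` are mine, and the search simply tries every function $Y\to X$, so it is only practical for tiny examples.

```python
from itertools import product


def has_right_inverse(f, X, Y):
    """Search all maps g: Y -> X for one with f(g(y)) == y for every y in Y."""
    X, Y = list(X), list(Y)
    for values in product(X, repeat=len(Y)):
        g = dict(zip(Y, values))
        if all(f(g[y]) == y for y in Y):
            return True
    return False


def is_surjective(f, X, Y):
    """True when every y in Y is hit by some x in X."""
    return set(Y) <= {f(x) for x in X}


# Hypothetical examples on X = {0,1,2,3}, Y = {0,1,2}.
X = range(4)
Y = range(3)
f1 = lambda x: x % 3   # image {0,1,2}: surjective onto Y
f2 = lambda x: x % 2   # image {0,1}: misses 2, not surjective onto Y

# In both cases, surjectivity and existence of a right inverse agree.
agree_f1 = has_right_inverse(f1, X, Y) == is_surjective(f1, X, Y)
agree_f2 = has_right_inverse(f2, X, Y) == is_surjective(f2, X, Y)
```

For `f2` the search fails because no candidate $g$ can satisfy $f(g(2)) = 2$, which is the contrapositive of the $\Rightarrow$ direction.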