The problem is that you also do a "meta" induction when proving choice for a finite number of sets and this "meta" induction fails in the transfinite or countable case.
To be more clear, the proof of choice for a number of sets sets usually goes something like this (I'm assuming choice is "the product of nonempty sets is nonempty").
Base case is vacuously true (you're given one nonempty set and want to prove it's nonempty).
Next, assume inductively that for any nonempty sets $X_1,...,X_n$, the product $X_1\times...\times X_n$ is nonempty. Now, given $n+1$ sets, $X_1,...,X_{n+1}$, you want to show their product is nonempty. By the induction hypothesis, there is an element $y\in X_1\times....\times X_n$. Also, there is an element $x_{n+1}\in X_{n+1}$ by assumption. Then $\{\{y,x_{n+1}\}, x_{n+1}\} = (y, x_{n+1})$ is an element of the product.
On the meta level, what you're saying by unraveling the induction is "I have $n+1$ sets $X_1,...,X_{n+1}$, and I know there is an element (which is itself a set because we're doing set theory!) in each $X_i$ which I'll call $x_i$. By using the axiom of pairing once, I can form the set $\{x_1,x_2\}$. By using it again, I can form the set $\{\{x_1,x_2\}, x_1\}$, which is the definition of $(x_1,x_2)$. Using the axiom of pairing two more times, I can form $(x_1, x_2, x_3)$. In general (i.e., by induction), by using the axiom of pairing $2(n+1)$ times, I can form the set $(x_1,x_2,...,x_{n+1})$. This set (element) is a member of $X_1\times...\times X_{n+1}$, and hence this product is nonempty!"
What goes wrong as soon as you try to adapt this to the countable or larger setting is on the meta level. Essentially, you'd have to apply the pairing axiom countably (or more) times to create the set $(x_1,...)$ which "should" be in $X_1\times...\times $. But naive set theory has taught us that just because we think something "should" be a set, doesn't mean it should (see, for example, Russel's Paradox). Thus, we at least cannot trust this proof to work in countable or larger settings. Of course, the failure of this approach doesn't mean there is NO way of proving choice from $ZF$, but it's known from other methods that you cannot prove choice from $ZF$ alone.
We can avoid the appeal to Hartogs numbers if we do an indirect proof instead. As a bonus, the transfinite recursion becomes a bit cleaner:
Assume that (a) $(P,\preceq)$ is a partially ordered set that satisfies the premises of Zorn's lemma, (b) $P$ has no maximal element, and (c) $C$ is a choice function on $P$. We then seek a contradiction.
Without loss of generality extend $C$ such that $C(\varnothing)=42$, no matter whether or not $42\in P$.
Apply Transfinite Recursion I to the class function
$$ G(X) = C(\{y\in P\mid \forall z\in \operatorname{Rng}(X)\cap P: z\prec y\}) $$
This gives a class function $F$ such that for every ordinal $\alpha$,
$$ F(\alpha) = C(\{y\in P \mid \forall \beta<\alpha : F(\beta)\in P \to F(\beta)\prec y\}) $$
Lemma. For all $\alpha$ it holds that $F(\alpha)\in P$ and $\forall \beta : F(\beta)\prec F(\alpha)$.
Proof. By transfinite induction on $\alpha$. The induction hypothesis tells us that $\{F(\beta)\mid \beta<\alpha\}$ is a chain in $P$ (note that it is a set in any case thanks to Replacement). The Zorn premises tells us that this chain has an upper bound; because there is no maximal element it even has a strict upper bound. In other words, $$\{y\in P \mid \forall \beta<\alpha : F(\beta)\in P \to F(\beta)\prec y\},$$ the set of all the strict upper bounds, is not empty, so $F(\alpha)$ is in $P$ and is greater than all the $F(\beta)$s.
The lemma tells us that $F$ is an order-preserving function from ON to $P$. Since $F$ preserves order, it is in particular injective. Therefore the formula
$$ H(x)=\alpha \iff F(\alpha)=x \lor (\alpha=117 \land \forall\beta:F(\beta)\ne x) $$
is functional, and applying Replacement on $P$ and $H$ tells us that ON is a set. The Burali-Forti paradox now furnishes our desired contradiction.
Note that we never actually needed $C(\varnothing)=42$ except to make sure that $G$ was a class function so we could use the recursion theorem on it.
The proof structure here is typical: The definition by transfinite recursion is followed immediately by a transfinite induction that extracts useful facts from the recursion formula. This two-step approach is often necessary for using the general recursion machinery because we need to define $G$ in a defensive way such that we can prove that it is makes sense and is functional without depending on its argument being produced by recursion. This defensive scaffolding -- i.e., intersecting $\operatorname{Rng}(X)$ with $P$ so $z\prec y$ makes sense, and making sure that $C(\{x\in P\mid \cdots\})$ always means something even if the condition ends up filtering everything away -- is then cleaned away by the lemma once the recursive magic has done its job and we know the input to $G$ comes from a well-behaved recursion.
As an additional note, one might argue that this is not "morally" an indirect proof. The way I prefer to think about it is that I would like to assume just (a) and (c), and then in the middle of the proof of the lemma say
... this chain has an upper bound. If that upper bound is a maximal element, then we're done; otherwise there's something larger, which is therefore a strict upper bound ...
Claiming that "we're done" with the entire Zorn's Lemma when we were in the middle of an inner induction step is not really done in polite company, though. Phrasing the entire proof as an indirect proof will allow us to follow this intuition anyway, in simple yet formally acceptable way.
The part of the proof after the lemma is then just an argument that eventually we must hit one of the "we're done" conditions, because continuing forever would have absurd consequences.
Best Answer
The axiom of choice by itself is useless. It just gives you a function. What can you do with it? Not that much, in most cases.
On the other hand, transfinite recursion is something that $\sf ZF$ itself can do pretty well already. And remember that the key point in transfinite recursion is that there is a function that tells you how to proceed to the $\alpha$ stage, assuming that you've done all the stages before $\alpha$.
Well, a choice function is a function. And it interfaces with transfinite recursion very well. Indeed, Zorn's Lemma and other similar equivalents were introduced to alleviate the need for transfinite recursion and induction. But any proof with Zorn's Lemma can be transformed into a proof using a choice function and transfinite recursion, including this one.
Fix a choice function from all the non-empty subsets of $V$. Now recursively add more and more vectors to your set, with the recursive rule being that if we have already gathered $\{v_i\mid i<\alpha\}$, then $v_\alpha$ is chosen to be any vector in $V\setminus\operatorname{span}(\{v_i\mid i<\alpha\})$. The only way we cannot make such choice is when what we gathered already is a basis; and so eventually the recursion must come to a halt, otherwise we are contradicting Hartogs' theorem, as it defines an injection from the class of ordinals into $V$, which is a set.
Now, we can, and should, also prove that this set of vectors is linearly independent. We can do that by induction; where we simply note that since $v_\alpha\notin\operatorname{span}(\{v_i\mid i<\alpha\})$, then it is not a linear combination of any finitely many $v_i$s for $i<\alpha$. Now, since the induction hypothesis is that those were already linearly independent, this completes the proof.